BookmarkSubscribeRSS Feed
Kurt_Bremser
Super User

@Pandu2 wrote:
Data want;
Set have;
by id roll num;
If (first.id and first.roll and first.num)=1 and sas_date >=old_date
Then output;
run;
I sorted the dataset by using those 3 variables and using first keyword to get the 1st variable which should meet the condition sas_date>=old_date but it isn't working instead it is giving me a blank data

FIRST. does not indicate variables, it indicates observations. Please be very clear with your choice of words, clarity is the lynchpin of working with software.

Your condition

first.id and first.roll and first.num

is not needed in this form; since it can only be true when first.id is true, and first.id implies first. for all other variables, you only need the first part fo the condition.

The fact that you got no output means that the condition

sas_date >=old_date

is never true for any first observation in a ID group.

If you want to retrieve the first observation of a ID group that fulfills the date condition, you need to do this:

data want;
set have (where=(sas_date >= old_date));
by id roll num; /* roll and num are only here to ensure that the dataset is properly sorted */
if first.id;
run;

Since only observations meeting the date condition make it into the data step, the subsequent FIRST. gets what you want.

 

If this is not what you want, then you MUST POST EXAMPLE DATA AND THE EXPECTED RESULT.

Pandu2
Obsidian | Level 7
data WORK.have;
input DATE1 :DATE9. DATE2 :DATE9. ID:4. ROLL:2. NUM:3. AMOUNT:2.;
datalines;
01JAN2022 20APR2022 1411 69 115 1
01FEB2022 30JAN2022 1411 69 115 2
01FEB2022 30JAN2022 1411 69 115 3
03MAR2022 21FEB2022 1412 70 116 1
04MAR2022 29JAN2022 1412 70 116 2
;;;;

Here's the data step code of the dataset. The records were sorted based on date1, DATE2, id, roll, num and the primary keys for this dataset is id, roll, num. Based on these 3 variables the data is inter linked with eachother.

When comes to what I require is
I require the rows of 1st appearance of Id, roll, num only when the DATE1 is >= DATE2. This DATE1 >= DATE2 should apply only when the 1st appearance of Id, roll, num. If this fails, please keep an OR condition by saying the amount values should be compared to prior and next value of eachother and when there's an increase in the amount variable values then the row which has max amount value should be picked and it must meet the base condition which is DATE1 >= DATE2.
Example :
01JAN2022 20APR2022 1411 69 115 1
01FEB2022 30JAN2022 1411 69 115 2
01FEB2022 30JAN2022 1411 69 115 3
03MAR2022 21FEB2022 1412 70 116 1
04MAR2022 29JAN2022 1412 70 116 2

In the given example above, the DATE1 value is not >= DATE2 for the 1st appearance of Id, roll, num so, it fails then for the next two values the amount is increased i.e 1,2,3 so, for this case the max amount value which is 3 that row should be picked and also the base condition is met 01FEB2022 >= 30JAN2022. This is one iteration. For the 2nd iteration i.e as the id, roll, num is changed that's why I said it as 2nd iteration. So, in the 2nd iteration the 1st appearance of Id, roll, num(i.e 1412 70 116) met the basic condition i.e 03MAR2022 >= 21FEB2022 this row should be picked. once this condition is pass, it shouldn't go to OR condition.

I hope this makes sense to y'all. Thankyou.
Pandu2
Obsidian | Level 7
I forgot to add something previously.
Sometimes the num and roll values will change and the rest ( id) remains same.
For example :
1411 69 115
1411 70 116
1411 71 117
.
.
. Likewise so on.
Kurt_Bremser
Super User

@Pandu2 wrote:
I forgot to add something previously.
Sometimes the num and roll values will change and the rest ( id) remains same.
For example :
1411 69 115
1411 70 116
1411 71 117
.
.
. Likewise so on.

And what shall we do in this case? Consider each id/roll/num combination a single group, or each id a group? If the latter, then we do not need to consider roll and num for groups anyway. But are they important for a certain order, or should the order within a group be based on one of the dates?

Pandu2
Obsidian | Level 7
In that case, we should consider each id, roll, num combination a single group.
Kurt_Bremser
Super User

@Pandu2 wrote:
In that case, we should consider each id, roll, num combination a single group.
Then we must change the BY and the FIRST. and LAST.:
data WORK.have;
input DATE1 :DATE9. DATE2 :DATE9. ID:4. ROLL:2. NUM:3. AMOUNT:2.;
format date1 date2 yymmdd10.;
datalines;
01JAN2022 20APR2022 1411 69 115 1
01FEB2022 30JAN2022 1411 69 115 2
01FEB2022 30JAN2022 1411 69 115 3
03MAR2022 21FEB2022 1412 70 116 1
04MAR2022 29JAN2022 1412 70 116 2
;

data want;
set have;
by id roll num;
retain _flag _date1 _date2 _roll _num _amount;
if first.num
then do;
  _flag = 1;
  _amount = .;
  if date1 >= date2
  then do;
    _flag = 0;
    output;
  end;
end;
if amount > _amount
then do;
  _date1 = date1;
  _date2 = date2;
  _roll = roll;
  _num = num;
  _amount = amount;
end;
if last.num and _flag
then do;
  date1 = _date1;
  date2 = _date2;
  roll = _roll;
  num = _num;
  amount = _amount;
  output;
end;
drop _:;
run;
Pandu2
Obsidian | Level 7
Have you tried this code with my data. Please let me know the output. Thankyou.
Quentin
Super User

As a matter of courtesy and also to make this a useful learning experience, after being provided with code in an answer, it's probably a better idea for you to run the code on your data and see if works, and if you can understand how it works, rather than to ask someone else to run the code on your data for you and send you the results.

Pandu2
Obsidian | Level 7
I apologise for that.
Pandu2
Obsidian | Level 7
This gave me an error which is By variables are not sorted properly even though I used proc sort before this code. Please advise.
Quentin
Super User
Double check that the PROC SORT step uses the exact same BY statement as your DATA step, and that your are sorting the data set read in by the DATA step.
Pandu2
Obsidian | Level 7
This got worked somehow but this isn't taking 1st appearance of Id, roll, num instead it is taking values only based on the base condition which is DATE1 >= DATE2.

The output which I got was :

01FEB2022 30JAN2022 1411 69 115 2
01FEB2022 30JAN2022 1411 69 115 3
03MAR2022 21FEB2022 1412 70 116 1
04MAR2022 29JAN2022 1412 70 116 2

It shouldn't be like this. The required output is :
01FEB2022 30JAN2022 1411 69 115 2
03MAR2022 21FEB2022 1412 70 116 1.
Kurt_Bremser
Super User

Can't be:

data WORK.have;
input DATE1 :DATE9. DATE2 :DATE9. ID:4. ROLL:2. NUM:3. AMOUNT:2.;
format date1 date2 yymmdd10.;
datalines;
01JAN2022 20APR2022 1411 69 115 1
01FEB2022 30JAN2022 1411 69 115 2
01FEB2022 30JAN2022 1411 69 115 3
03MAR2022 21FEB2022 1412 70 116 1
04MAR2022 29JAN2022 1412 70 116 2
;

data want;
set have;
by id roll num;
retain _flag _date1 _date2 _roll _num _amount;
if first.num
then do;
  _flag = 1;
  _amount = .;
  if date1 >= date2
  then do;
    _flag = 0;
    output;
  end;
end;
if amount > _amount
then do;
  _date1 = date1;
  _date2 = date2;
  _roll = roll;
  _num = num;
  _amount = amount;
end;
if last.num and _flag
then do;
  date1 = _date1;
  date2 = _date2;
  roll = _roll;
  num = _num;
  amount = _amount;
  output;
end;
drop _:;
run;

proc print data=want noobs;
run;

Result:

DATE1	DATE2	ID	ROLL	NUM	AMOUNT
2022-02-01	2022-01-30	1411	69	115	3
2022-03-03	2022-02-21	1412	70	116	1
Pandu2
Obsidian | Level 7
This code shouldn't have if last.num cuz I need only 1st appearance of the id, roll, num. And also a small change in the 2nd case where I said if the 1st case fails, which means if the 1st appearance of Id, roll, num doesn't meet the condition DATE1 >= DATE2 then the increase in the amount field should be checked with prior and next value and if there's increase instead of taking max amount we must take the immediate increase row values.
Required Output:
01FEB2022 30JAN2022 1411 69 115 2
Previously it takes the row values which has amount value i e "3". Instead it should take immediate increase in amount field i.e "2".

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 67 replies
  • 2427 views
  • 1 like
  • 6 in conversation