BookmarkSubscribeRSS Feed
CynthiaWei
Obsidian | Level 7

Hi SAS Pros,

 

I am having a dataset that is sorted by ID and date with the following variables:

 

data have;

input group date $ age gender ;

cards;

1   1/1/2017   3   0
1   1/2/2017   1   1
1   1/3/2017   10  1
1   1/9/2017   5   0
1   1/11/2017  11  1
2   1/1/2017   12  0
2   1/3/2017   10  1
2   1/7/2017   9   1
2   1/11/2017  16  0
2   1/20/2017  15  1
;
run;

What I want to do is to create a new variable called Enroll_Yes that lets the first student (sorted by date for each group) who is greater or equal to 10 years old boy (gender=1) have a value of 1 (this student is enrolled, which is the 3rd observation of group 1) and the following students as long as they are greater or equal to 10 years old regardless whether they are boys or girls (gender= 0 or 1 are both fine) also have a value of 1 (the 5th observation of group 1). For group 2, the students who should be enrolled are 2nd, 4th, and 5th. The first observation should NOT be enrolled because the first enrolled one for each group has to be >=10 yrs BOY. 

 

What I want is listed as follows:

group date $ age gender 
1   1/1/2017   3   0   0
1   1/2/2017   1   1   0
1   1/3/2017   10  1   1
1   1/9/2017   5   0   0
1   1/11/2017  11  0   1
2   1/1/2017   12  0   0
2   1/3/2017   10  1   1
2   1/7/2017   9   1   0
2   1/11/2017  16  0   1
2   1/20/2017  15  1   1

Thank you so much for any help in advance!

 

Best regards,

 

C

 

4 REPLIES 4
CynthiaWei
Obsidian | Level 7

I am sorry the Want dataset should be like this: (I forget to write the variable wanted---Enroll_Yes in the Want dataset)

group date $ age gender Enroll_Yes
1   1/1/2017   3   0   0
1   1/2/2017   1   1   0
1   1/3/2017   10  1   1
1   1/9/2017   5   0   0
1   1/11/2017  11  0   1
2   1/1/2017   12  0   0
2   1/3/2017   10  1   1
2   1/7/2017   9   1   0
2   1/11/2017  16  0   1
2   1/20/2017  15  1   1

Thank you!

novinosrin
Tourmaline | Level 20


data have;

input group date $ age gender ;

cards;
1   1/1/2017   3   0
1   1/2/2017   1   1
1   1/3/2017   10  1
1   1/9/2017   5   0
1   1/11/2017  11  1
2   1/1/2017   12  0
2   1/3/2017   10  1
2   1/7/2017   9   1
2   1/11/2017  16  0
2   1/20/2017  15  1
;
run;

data want;
 do until(last.group);
   set have;
   by group;
   if age>=10 and gender then _n_=0;
   Enroll_Yes=_n_=0 & age>=10;
   output;
 end;
run;

Btw, thank you for posting a clean workable sample data. Cheers!

PeterClemmensen
Tourmaline | Level 20
data want;
   set have;
   by group;
   if first.group then _iorc_ = 0;
   if age ge 10 and gender = 1 then _iorc_ = 1;
   Enroll_Yes = (_iorc_ = 1 and age ge 10);
run;

Result:

 

group date      age  gender Enroll_Yes 
1     1/1/2017  3    0      0 
1     1/2/2017  1    1      0 
1     1/3/2017  10   1      1 
1     1/9/2017  5    0      0 
1     1/11/201  11   1      1 
2     1/1/2017  12   0      0 
2     1/3/2017  10   1      1 
2     1/7/2017  9    1      0 
2     1/11/201  16   0      1 
2     1/20/201  15   1      1 
novinosrin
Tourmaline | Level 20

Hi @CynthiaWei  Good morning. Your problem has some similarities to one of our production code at our bank that flags "debit" and "credit" transactions, however majority of the folks here are SQL users and their comfort lies with Proc SQL.

Your sample date is char format, however I'm assuming the real one is indeed a proper numeric SAS date. If not, it's trivial as it merely requires minor conversion to numeric sas date in the conditional execution. 

In Proc SQL, you can use summary functions at best and perhaps push the query into the database which the folks here are good at-

 




data have;

input group date :mmddyy10. age gender ;
format date mmddyy10.;

cards;
1   1/1/2017   3   0
1   1/2/2017   1   1
1   1/3/2017   10  1
1   1/9/2017   5   0
1   1/11/2017  11  1
2   1/1/2017   12  0
2   1/3/2017   10  1
2   1/7/2017   9   1
2   1/11/2017  16  0
2   1/20/2017  15  1
;
run;

proc sql;
 create table want as
 select *,age>=10 and date>=min(case when gender and age>=10 then date else . end) as Enroll_Yes 
 from have
 group by group
 order by group, date;
quit;

Note: You could use IFN instead of CASE WHEN albeit you pay a performance penalty. On the other hand, CASE WHEN is convenient and portable as SQL engine takes care of it. 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 685 views
  • 0 likes
  • 3 in conversation