Hi SAS Pros,
I am having a dataset that is sorted by ID and date with the following variables:
data have;
input group date $ age gender ;
cards;
1 1/1/2017 3 0
1 1/2/2017 1 1
1 1/3/2017 10 1
1 1/9/2017 5 0
1 1/11/2017 11 1
2 1/1/2017 12 0
2 1/3/2017 10 1
2 1/7/2017 9 1
2 1/11/2017 16 0
2 1/20/2017 15 1
;
run;
What I want to do is to create a new variable called Enroll_Yes that lets the first student (sorted by date for each group) who is greater or equal to 10 years old boy (gender=1) have a value of 1 (this student is enrolled, which is the 3rd observation of group 1) and the following students as long as they are greater or equal to 10 years old regardless whether they are boys or girls (gender= 0 or 1 are both fine) also have a value of 1 (the 5th observation of group 1). For group 2, the students who should be enrolled are 2nd, 4th, and 5th. The first observation should NOT be enrolled because the first enrolled one for each group has to be >=10 yrs BOY.
What I want is listed as follows:
group date $ age gender
1 1/1/2017 3 0 0
1 1/2/2017 1 1 0
1 1/3/2017 10 1 1
1 1/9/2017 5 0 0
1 1/11/2017 11 0 1
2 1/1/2017 12 0 0
2 1/3/2017 10 1 1
2 1/7/2017 9 1 0
2 1/11/2017 16 0 1
2 1/20/2017 15 1 1
Thank you so much for any help in advance!
Best regards,
C
I am sorry the Want dataset should be like this: (I forget to write the variable wanted---Enroll_Yes in the Want dataset)
group date $ age gender Enroll_Yes
1 1/1/2017 3 0 0
1 1/2/2017 1 1 0
1 1/3/2017 10 1 1
1 1/9/2017 5 0 0
1 1/11/2017 11 0 1
2 1/1/2017 12 0 0
2 1/3/2017 10 1 1
2 1/7/2017 9 1 0
2 1/11/2017 16 0 1
2 1/20/2017 15 1 1
Thank you!
data have;
input group date $ age gender ;
cards;
1 1/1/2017 3 0
1 1/2/2017 1 1
1 1/3/2017 10 1
1 1/9/2017 5 0
1 1/11/2017 11 1
2 1/1/2017 12 0
2 1/3/2017 10 1
2 1/7/2017 9 1
2 1/11/2017 16 0
2 1/20/2017 15 1
;
run;
data want;
do until(last.group);
set have;
by group;
if age>=10 and gender then _n_=0;
Enroll_Yes=_n_=0 & age>=10;
output;
end;
run;
Btw, thank you for posting a clean workable sample data. Cheers!
data want;
set have;
by group;
if first.group then _iorc_ = 0;
if age ge 10 and gender = 1 then _iorc_ = 1;
Enroll_Yes = (_iorc_ = 1 and age ge 10);
run;
Result:
group date age gender Enroll_Yes 1 1/1/2017 3 0 0 1 1/2/2017 1 1 0 1 1/3/2017 10 1 1 1 1/9/2017 5 0 0 1 1/11/201 11 1 1 2 1/1/2017 12 0 0 2 1/3/2017 10 1 1 2 1/7/2017 9 1 0 2 1/11/201 16 0 1 2 1/20/201 15 1 1
Hi @CynthiaWei Good morning. Your problem has some similarities to one of our production code at our bank that flags "debit" and "credit" transactions, however majority of the folks here are SQL users and their comfort lies with Proc SQL.
Your sample date is char format, however I'm assuming the real one is indeed a proper numeric SAS date. If not, it's trivial as it merely requires minor conversion to numeric sas date in the conditional execution.
In Proc SQL, you can use summary functions at best and perhaps push the query into the database which the folks here are good at-
data have;
input group date :mmddyy10. age gender ;
format date mmddyy10.;
cards;
1 1/1/2017 3 0
1 1/2/2017 1 1
1 1/3/2017 10 1
1 1/9/2017 5 0
1 1/11/2017 11 1
2 1/1/2017 12 0
2 1/3/2017 10 1
2 1/7/2017 9 1
2 1/11/2017 16 0
2 1/20/2017 15 1
;
run;
proc sql;
create table want as
select *,age>=10 and date>=min(case when gender and age>=10 then date else . end) as Enroll_Yes
from have
group by group
order by group, date;
quit;
Note: You could use IFN instead of CASE WHEN albeit you pay a performance penalty. On the other hand, CASE WHEN is convenient and portable as SQL engine takes care of it.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.