I would like to create new variables ("First_A, Second_A, Third_A, etc") that records the date of when the student received his/her first "A" (and second, third, etc). For example I have the below dataset.
data have;
input ID A_grade date_grade mmddyyy.; /*where A=1 and not-A=0*/
data lines;
1 1 05/06/2012
1 1 06/09/2010
1 0 06/11/2011
1 1 08/10/2012
2 1 02/16/2010
2 0 12/03/2011
3 0 19/11/2012
4 1 01/10/2010
4 1 07/28/2012
;;
run;
My final dataset should contain the following entries:
ID First_A Second_A Third_A
1 06/09/2010 05/06/2012 08/10/2012
2 02/16/2010 . .
4 01/10/2010 07/28/2012 .
Thank you!
You need to rename the variables.
data have;
input ID A_grade date_grade mmddyy10.; /*where A=1 and not-A=0*/
cards;
1 1 05/06/2012
1 1 06/09/2010
1 0 06/11/2011
1 1 08/10/2012
2 1 02/16/2010
2 0 12/03/2011
3 0 19/11/2012
4 1 01/10/2010
4 1 07/28/2012
;
proc sort data=have(where=(a_grade=1));
by id date_grade;
run;
proc transpose data=have out=want(drop=_:) prefix=a_ ;
by id;
var date_grade;
format a_: mmddyy10.;
run;
proc print;run;
The tried and true way: two steps;
** Sort all the "A" grades by id and date **;
proc sort data=have out=have_a;
by id date;
where a=1;
run;
** Use DOW loop to put dates of A's into the corresponding A_DATE variable **;
data want (keep=id adate1-adate8);
do I=1 by 1 until (last.id);
set have_a;
by id;
array a_date {8} ;
format a_date: mmddyys10. ;
a_date{I}=date;
end;
run;
This program assumes no one has more than 8 A's, and that you don't mind having no records for students with no A's.
or using array:
data have;
input ID A_grade date_grade mmddyy10.; /*where A=1 and not-A=0*/
format date_grade mmddyy10.;
cards;
1 1 05/06/2012
1 1 06/09/2010
1 0 06/11/2011
1 1 08/10/2012
2 1 02/16/2010
2 0 12/03/2011
3 0 19/11/2012
4 1 01/10/2010
4 1 07/28/2012
;
proc sort data=have(where=(a_grade=1));
by id date_grade;
run;
data temp(keep=id First_A Second_A Third_A);
set have;
retain First_A Second_A Third_A;
by id;
array _grade(*) First_A Second_A Third_A;
if first.id then count=1;
_grade(count)=date_grade;
count+1;
if last.id then do;output;
call missing (of _grade(*));
end;
format First_A Second_A Third_A mmddyy10.;
proc print;run;
Obs ID First_A Second_A Third_A
1 1 06/09/2010 05/06/2012 08/10/2012
2 2 02/16/2010 . .
3 4 01/10/2010 07/28/2012 .
If you don't want any restrictions on the quantity of 'A's that one ID can get, and you don't care about too much of naming convention 'first, second' etc, here is another approach based on Mark's code:
proc sort data=have(where=(a_grade=1));
by id date_grade;
run;
proc sql noprint;
select cats(max(ct)) into :ct from (select count(*) as ct from have group by id);
quit;
data want;
do _n_=1 by 1 until (last.id);
set have;
by id;
array a a_1-a_&ct;
format a_: mmddyy10.;
a(_n_)=date_grade;
end;
drop a_grade date_grade;
run;
Haikuo
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.