Hello,
I have an SAS dataset in a long format. The intervention program started from 2014 Spring semester, and it has been on until 2017 Spring semester. So there has been 7 semesters (2014 Spring and Fall, 2015 Spring and Fall, 2016 Spring and Fall, 2017 Spring).
ID Year Semester
A 2016 Spring
A 2016 Fall
B 2014 Spring
B 2015 Spring
B 2015 Fall
B 2016 Spring
B 2016 Spring
As you can see, not everyone participated in all 7 semesters though. Some participated once and never came back, some participated more than twice but not necessarily two semesters in a low.
So each individual has a different number of cases. For someone who participated twice, for example, has 2 rows, some with 5 participations have 5 rows. I want everyone has 7 rows in the dataset for some reason. It would look like this, for example, for ID=A. Same for B and other individuals as well.
ID Year Semester
A 2014 Spring
A 2014 Fall
A 2015 Spring
A 2015 Fall
A 2016 Spring
A 2016 Fall
A 2017 Spring
What could be the best way of programming to do this in SAS?
I would really appreciate any suggestions!
You probably want to make a dummy full rank dataset and then combine it back with your actual data.
Make a small dataset with the possible values of of YEAR and SEMESTER
data terms;
input year semester $;
cards;
2014 Spring
2014 Fall
2015 Spring
2015 Fall
2016 Spring
2016 Fall
2017 Spring
;
and merge it with all possible ID values.
proc sql ;
create table all_terms as
select *
from (select distinct id from HAVE), terms
order by 1,2,3
;
quit;
Then merge it back with you original data.
data want ;
merge all_terms have;
by id year semester;
run;
data want;
do id= 'A', 'B';
do demester='Spring','Fall';
do year=2014 to 2016;
output;
end;
end;
end;
do id= 'A', 'B';
do demester='Spring';
do year=2017;
output;
end;
end;
end;
run;
So you could have any number of individuals like A,B,C,D and so on. And the year range seems to tbe variable of interest with 2017 having just one semester
data have;
input ID $ Year Semester $;
datalines;
A 2016 Spring
A 2016 Fall
B 2014 Spring
B 2015 Spring
B 2015 Fall
B 2016 Spring
B 2016 Spring
;
data want;
do until(last.id);
set have;
by id;
if last.id then do;
do year=2014 to 2017;
if year ne 2017 then do;
semester='spring';
output;
semester='fall';
output;
end;
else do;
semester='spring';
output;
end;
end;
end;
end;
run;
You probably want to make a dummy full rank dataset and then combine it back with your actual data.
Make a small dataset with the possible values of of YEAR and SEMESTER
data terms;
input year semester $;
cards;
2014 Spring
2014 Fall
2015 Spring
2015 Fall
2016 Spring
2016 Fall
2017 Spring
;
and merge it with all possible ID values.
proc sql ;
create table all_terms as
select *
from (select distinct id from HAVE), terms
order by 1,2,3
;
quit;
Then merge it back with you original data.
data want ;
merge all_terms have;
by id year semester;
run;
I forgot to ask.
Could you explain "order by 1,2,3"? I cannot figure out what it does!
Thanks.
Read the section of the PROC SQL manual on the ORDER BY CLAUSE.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.