Hello,
I have an SAS dataset in a long format. The intervention program started from 2014 Spring semester, and it has been on until 2017 Spring semester. So there has been 7 semesters (2014 Spring and Fall, 2015 Spring and Fall, 2016 Spring and Fall, 2017 Spring).
ID Year Semester
A 2016 Spring
A 2016 Fall
B 2014 Spring
B 2015 Spring
B 2015 Fall
B 2016 Spring
B 2016 Spring
As you can see, not everyone participated in all 7 semesters though. Some participated once and never came back, some participated more than twice but not necessarily two semesters in a low.
So each individual has a different number of cases. For someone who participated twice, for example, has 2 rows, some with 5 participations have 5 rows. I want everyone has 7 rows in the dataset for some reason. It would look like this, for example, for ID=A. Same for B and other individuals as well.
ID Year Semester
A 2014 Spring
A 2014 Fall
A 2015 Spring
A 2015 Fall
A 2016 Spring
A 2016 Fall
A 2017 Spring
What could be the best way of programming to do this in SAS?
I would really appreciate any suggestions!
You probably want to make a dummy full rank dataset and then combine it back with your actual data.
Make a small dataset with the possible values of of YEAR and SEMESTER
data terms;
input year semester $;
cards;
2014 Spring
2014 Fall
2015 Spring
2015 Fall
2016 Spring
2016 Fall
2017 Spring
;
and merge it with all possible ID values.
proc sql ;
create table all_terms as
select *
from (select distinct id from HAVE), terms
order by 1,2,3
;
quit;
Then merge it back with you original data.
data want ;
merge all_terms have;
by id year semester;
run;
data want;
do id= 'A', 'B';
do demester='Spring','Fall';
do year=2014 to 2016;
output;
end;
end;
end;
do id= 'A', 'B';
do demester='Spring';
do year=2017;
output;
end;
end;
end;
run;
So you could have any number of individuals like A,B,C,D and so on. And the year range seems to tbe variable of interest with 2017 having just one semester
data have;
input ID $ Year Semester $;
datalines;
A 2016 Spring
A 2016 Fall
B 2014 Spring
B 2015 Spring
B 2015 Fall
B 2016 Spring
B 2016 Spring
;
data want;
do until(last.id);
set have;
by id;
if last.id then do;
do year=2014 to 2017;
if year ne 2017 then do;
semester='spring';
output;
semester='fall';
output;
end;
else do;
semester='spring';
output;
end;
end;
end;
end;
run;
You probably want to make a dummy full rank dataset and then combine it back with your actual data.
Make a small dataset with the possible values of of YEAR and SEMESTER
data terms;
input year semester $;
cards;
2014 Spring
2014 Fall
2015 Spring
2015 Fall
2016 Spring
2016 Fall
2017 Spring
;
and merge it with all possible ID values.
proc sql ;
create table all_terms as
select *
from (select distinct id from HAVE), terms
order by 1,2,3
;
quit;
Then merge it back with you original data.
data want ;
merge all_terms have;
by id year semester;
run;
I forgot to ask.
Could you explain "order by 1,2,3"? I cannot figure out what it does!
Thanks.
Read the section of the PROC SQL manual on the ORDER BY CLAUSE.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.