I have survey data for participants over a year long study, but not everyone filled out a survey each month as expected. There are a lot of people in this dataset so I want to do this systematically. Some people have surveys in multiple years ex) their first survey may be in November 2021 but they have surveys through 2022.
Survey_m_y | q1 | q2 | q3 |
June 2021 | 1 | 2 | 0 |
August 2021 | 3 | 4 | 2 |
September 2021 | 4 | 0 | 3 |
I want the data to look like this:
Survey_m_y | study_time | q1 | q2 | q3 |
June 2021 | 1 | 1 | 2 | 0 |
July 2021 | 2 | |||
August 2021 | 3 | 3 | 4 | 2 |
September 2021 | 4 | 4 | 0 | 3 |
What is the easiest way to fill in the missing month/year dates so that I can assign a study_time variable for each person that has 12 observations?
Work with actual numeric SAS dates, instead of text strings like "June 2021". Numeric dates will sort properly and makes it much easier for you to detect gaps.
I have the dates and can use those, but how would I fill in the missing months still?
Example assuming you have numeric dates representing the 1st day of each month, fake data from Jan 2020 to Jan 2022
data have;
date=mdy(1,1,2020);
q1=1;
output;
date=mdy(4,1,2020);
q1=4;
output;
format date monyy7.;
run;
data all_months;
start_date=mdy(1,1,2020);
do study_time = 0 to 24;
date=intnx('month',start_date,count,'b');
output;
end;
format date monyy7.;
drop start_date;
run;
data want;
merge have all_months;
by date;
run;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.