I have data in this form:
data have;
input start $ end $ year;
datalines;
1 01APR2000 07MAR2001 2000
1 01APR2000 07MAR2001 2001
2 25JUN2000 07FEB2001 2000
2 25JUN2000 07FEB2001 2001
;
I want to add a new column that lists the number of the month, for every month in the year, starting and ending at the existing columns (start and end). I am trying to achieve (for the first person as an example):
ID start end year month
1 01APR2000 07MAR2001 2000 4
1 01APR2000 07MAR2001 2000 5
1 01APR2000 07MAR2001 2000 6
1 01APR2000 07MAR2001 2000 7
1 01APR2000 07MAR2001 2000 8
1 01APR2000 07MAR2001 2000 9
1 01APR2000 07MAR2001 2000 10
1 01APR2000 07MAR2001 2000 11
1 01APR2000 07MAR2001 2000 12
1 01APR2000 07MAR2001 2001 1
1 01APR2000 07MAR2001 2001 2
1 01APR2000 07MAR2001 2001 3
I've tried a do loop but I'm not sure how to begin back at 1 after reaching 12.
Hello @MB_Analyst Good morning
data have;
input ID start :date9. end :date9. year;
format start end date9.;
datalines;
1 01APR2000 07MAR2001 2000
1 01APR2000 07MAR2001 2001
2 25JUN2000 07FEB2001 2000
2 25JUN2000 07FEB2001 2001
;
data want;
set have;
by id;
if first.id;
_k=start;
do while(_k<=end);
month=month(_k);
year=year(_k);
output;
_k=intnx('month',_k,1);
end;
drop _:;
run;
Why are there two records for each person? Your output seems to ignore the second record for person one. Can you explain this?
data have;
input id start :anydtdte. end :anydtdte. year;
thismonth=start;
do while(thismonth<=end);
year=year(thismonth);
month=month(thismonth);
output;
thismonth=intnx('month',thismonth,1);
end;
format start end thismonth date7.;
datalines;
1 01APR2000 07MAR2001 2000
1 01APR2000 07MAR2001 2001
2 25JUN2000 07FEB2001 2000
2 25JUN2000 07FEB2001 2001
;
Anyway, this code seems to get the year and month that you want.
Advice: treating calendar information such as 01APR2000 as character is not a good thing to do. You want to treat these as SAS dates, which are integers, and you can do this by reading these values in using the ANYDTDTE. format. Treating these as SAS dates will make your life much easier, because then you can use built-in SAS date functions, such as INTCK and INTNX and YEAR and MONTH (and others).
Hello @MB_Analyst Good morning
data have;
input ID start :date9. end :date9. year;
format start end date9.;
datalines;
1 01APR2000 07MAR2001 2000
1 01APR2000 07MAR2001 2001
2 25JUN2000 07FEB2001 2000
2 25JUN2000 07FEB2001 2001
;
data want;
set have;
by id;
if first.id;
_k=start;
do while(_k<=end);
month=month(_k);
year=year(_k);
output;
_k=intnx('month',_k,1);
end;
drop _:;
run;
Seems like @novinosrin and I have come up with essentially the same solution. As I stated, by treating 01APR2000 as a date value (which is an integer) and not as a character string as in the original code is the key to making this work — and also the key to making it work with relatively simple programming.
data have;
input id start : date9. end : date9. year;
format start end date9.;
datalines;
1 01APR2000 07MAR2001 2000
1 01APR2000 07MAR2001 2001
2 25JUN2000 07FEB2001 2000
2 25JUN2000 07FEB2001 2001
;
proc sort data=have(keep=id start end) out=_have nodupkey;
by id start end;
run;
data want;
set _have;
do i=start to end;
month=month(i); year=year(i);
if month ne _month then do;_month=month;output;end;
end;
drop _month i;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.