Good morning.
I am attempting to find the first day of each month for which a record exits. For instance, person_1 may have been involved from 1/10/2012 through 8/17/2020 while another may have very different dates of involvement. What I need is the first day of each month in the range where the person existed. So person_1 would exist in 2/1/2012, 3/1/2012, 4/1/2012 all the way to 8/1/2020 and be counted on each first of each month. Unfortunately I don't have a clear "date" from which to count from I only have a range to count within.
Could anyone point me in the right direction?
Thanks very much.
Have
| person | start | end |
| PERSON_1 | 1/10/2012 | 8/17/2020 |
| PERSON_2 | 10/2/2020 | 10/30/2020 |
| PERSON_3 | 9/30/2009 | 9/30/2021 |
| PERSON_4 | 10/15/2013 | 2/19/2014 |
| PERSON_5 | 2/1/2017 | 12/31/2021 |
| PERSON_6 | 2/1/2017 | 10/3/2021 |
| PERSON_7 | 6/29/2012 | 11/30/2018 |
| PERSON_8 | 7/6/2014 | 11/15/2016 |
| PERSON_9 | 7/29/2013 | 6/14/2017 |
| PERSON_10 | 1/6/2014 | 6/19/2019 |
data have;
infile cards dlm='09'x;
input person :$10. start :mmddyy10. end :mmddyy10.;
format start end date9.;
cards;
PERSON_1 1/10/2012 8/17/2020
PERSON_2 10/2/2020 10/30/2020
PERSON_3 9/30/2009 9/30/2021
PERSON_4 10/15/2013 2/19/2014
PERSON_5 2/1/2017 12/31/2021
PERSON_6 2/1/2017 10/3/2021
PERSON_7 6/29/2012 11/30/2018
PERSON_8 7/6/2014 11/15/2016
PERSON_9 7/29/2013 6/14/2017
PERSON_10 1/6/2014 6/19/2019
PERSON_11 1/6/2014 1/20/2014
;
;
;;
run;
data want;
set have;
i=1;
if put(start, yymmn6.) ne put(end, yymmn6.) then /*what happens if the start is same year month as end?*/
do while (date<end);
date=intnx('month', start, i, 'b');
i+1;
output;
end;
else
output;
format date date9.;
run;
You don't indicate output requirements.
Do you just want to count the number of persons per month?
So something like this as output?
Obs date n_person 115 2019-03-01 5 116 2019-04-01 5 117 2019-05-01 5 118 2019-06-01 5 119 2019-07-01 4 120 2019-08-01 4 121 2019-09-01 4 122 2019-10-01 4 123 2019-11-01 4 124 2019-12-01 4
You can find the range of dates from the data:
proc sql ;
create table dates as
select min( min(start,end) ) as start format=yymmdd10.
, max( max(start,end) ) as end format=yymmdd10.
from have
;
quit;
And then blow that up to one observation per month.
data dates;
set dates;
do offset = 0 to intck('month',start,end);
date=intnx('month',start,offset);
output;
end;
format date yymmdd10.;
run;
Now you can join it with your actual data and count the number of distinct PERSON values per month.
proc sql ;
create table want as
select a.date
, count(distinct b.person) as n_person
from dates a
left join have b
on a.date between b.start and b.end
group by 1
;
quit;
data have;
infile cards dlm='09'x;
input person :$10. start :mmddyy10. end :mmddyy10.;
format start end date9.;
cards;
PERSON_1 1/10/2012 8/17/2020
PERSON_2 10/2/2020 10/30/2020
PERSON_3 9/30/2009 9/30/2021
PERSON_4 10/15/2013 2/19/2014
PERSON_5 2/1/2017 12/31/2021
PERSON_6 2/1/2017 10/3/2021
PERSON_7 6/29/2012 11/30/2018
PERSON_8 7/6/2014 11/15/2016
PERSON_9 7/29/2013 6/14/2017
PERSON_10 1/6/2014 6/19/2019
PERSON_11 1/6/2014 1/20/2014
;
;
;;
run;
data want;
set have;
i=1;
if put(start, yymmn6.) ne put(end, yymmn6.) then /*what happens if the start is same year month as end?*/
do while (date<end);
date=intnx('month', start, i, 'b');
i+1;
output;
end;
else
output;
format date date9.;
run;
You don't indicate output requirements.
I so wish there was a way to accept two solutions as correct. Both of your solutions worked so honestly I flipped a coin (who has coins these days?).
Thank you both very much for your help. Both solutions give me exactly what I need.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.