Hi ,
below is a table 'have'. I created the 'duration' column in it that looks at the duration between the 'enddate' of the previous observation and 'startdate' of the next record using the 'lag' function. I flagged any duration <32 as Y. For consecutive observation where the duration < 32 , thos have to be grouped together to create a unique period i.e. the dates of obs 1 and 2 have to be combined to get the new_startdate and new_enddate i.e 02/30/2017 to 8/1/2017.
It is possible that more than three obs with duration <32 could be combined to get the new_startdate and new_enddate.
Similarly for observation 3 and 4 I had the new_startdate and new_enddate i.e 10/30/2017 to 4/15/2018. I have thousands of such ids for each of which I need to do the steps to get the dataset 'want'
How can I accomplish the dataset 'want'? Please let me know. I need advice urgently.
have:
obs |
ptid |
index date |
start |
end |
duration |
flag |
1 |
1 |
2/18/2018 |
02/30/17 |
5/30/2017 |
|
|
2 |
1 |
2/18/2018 |
6/1/2017 |
8/1/2017 |
2.00 |
y |
3 |
1 |
2/18/2018 |
10/30/2017 |
11/30/2017 |
90.00 |
n |
4 |
1 |
2/18/2018 |
12/1/2017 |
4/15/2018 |
1.00 |
y |
I need to have this dataset 'want' below with the new_start and new_end columns:
obs |
ptid |
index date |
start |
end |
duration |
flag |
New_start |
New_end |
1 |
1 |
2/18/2018 |
02/30/17 |
5/30/2017 |
|
02/30/2017 |
8/1/2017 |
|
2 |
1 |
2/18/2018 |
6/1/2017 |
8/1/2017 |
2.00 |
y |
02/30/17 |
8/1/2017 |
3 |
1 |
2/18/2018 |
10/30/2017 |
11/30/2017 |
90.00 |
n |
10/30/2017 |
4/15/2018 |
4 |
1 |
2/18/2018 |
12/1/2017 |
4/15/2018 |
1.00 |
y |
10/30/2017 |
4/15/2018 |
Even a leap year has only 29 days in feb if i m not wrong 🙂
02/30/17 |
Please excuse the example. I hope its still understandable.
See if this helps
a lazy solution though, my apologies as caffeine didn't kick in today yet
data have;
input obs ptid (indexdate start end) (:mmddyy10.) duration flag $;
format indexdate start end mmddyy10.;
cards;
1 1 2/18/2018 02/28/17 5/30/2017 . .
2 1 2/18/2018 6/1/2017 8/1/2017 2 y
3 1 2/18/2018 10/30/2017 11/30/2017 90 n
4 1 2/18/2018 12/1/2017 4/15/2018 1 y
;
data w;
set have;
by ptid;
if flag ne 'y' then grp +1;
run;
proc sql;
create table want(drop=grp) as
select *,min(start) as New_start format=mmddyy10.,max(end) as new_end format=mmddyy10.
from w
group by ptid, grp;
quit;
Thanks for your reply @novinosrinand apologies for the delay. I had to stop working on this concern as data was insufficient and had to look into more urgent tasks. Nonetheless I will test it on available data and reply.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.