Hi ,
below is a table 'have'. I created the 'duration' column in it that looks at the duration between the 'enddate' of the previous observation and 'startdate' of the next record using the 'lag' function. I flagged any duration <32 as Y. For consecutive observation where the duration < 32 , thos have to be grouped together to create a unique period i.e. the dates of obs 1 and 2 have to be combined to get the new_startdate and new_enddate i.e 02/30/2017 to 8/1/2017.
It is possible that more than three obs with duration <32 could be combined to get the new_startdate and new_enddate.
Similarly for observation 3 and 4 I had the new_startdate and new_enddate i.e 10/30/2017 to 4/15/2018. I have thousands of such ids for each of which I need to do the steps to get the dataset 'want'
How can I accomplish the dataset 'want'? Please let me know. I need advice urgently.
have:
|
obs |
ptid |
index date |
start |
end |
duration |
flag |
|
1 |
1 |
2/18/2018 |
02/30/17 |
5/30/2017 |
|
|
|
2 |
1 |
2/18/2018 |
6/1/2017 |
8/1/2017 |
2.00 |
y |
|
3 |
1 |
2/18/2018 |
10/30/2017 |
11/30/2017 |
90.00 |
n |
|
4 |
1 |
2/18/2018 |
12/1/2017 |
4/15/2018 |
1.00 |
y |
I need to have this dataset 'want' below with the new_start and new_end columns:
|
obs |
ptid |
index date |
start |
end |
duration |
flag |
New_start |
New_end |
|
1 |
1 |
2/18/2018 |
02/30/17 |
5/30/2017 |
|
02/30/2017 |
8/1/2017 |
|
|
2 |
1 |
2/18/2018 |
6/1/2017 |
8/1/2017 |
2.00 |
y |
02/30/17 |
8/1/2017 |
|
3 |
1 |
2/18/2018 |
10/30/2017 |
11/30/2017 |
90.00 |
n |
10/30/2017 |
4/15/2018 |
|
4 |
1 |
2/18/2018 |
12/1/2017 |
4/15/2018 |
1.00 |
y |
10/30/2017 |
4/15/2018 |
Even a leap year has only 29 days in feb if i m not wrong 🙂
|
02/30/17 |
Please excuse the example. I hope its still understandable.
See if this helps
a lazy solution though, my apologies as caffeine didn't kick in today yet
data have;
input obs ptid (indexdate start end) (:mmddyy10.) duration flag $;
format indexdate start end mmddyy10.;
cards;
1 1 2/18/2018 02/28/17 5/30/2017 . .
2 1 2/18/2018 6/1/2017 8/1/2017 2 y
3 1 2/18/2018 10/30/2017 11/30/2017 90 n
4 1 2/18/2018 12/1/2017 4/15/2018 1 y
;
data w;
set have;
by ptid;
if flag ne 'y' then grp +1;
run;
proc sql;
create table want(drop=grp) as
select *,min(start) as New_start format=mmddyy10.,max(end) as new_end format=mmddyy10.
from w
group by ptid, grp;
quit;
Thanks for your reply @novinosrinand apologies for the delay. I had to stop working on this concern as data was insufficient and had to look into more urgent tasks. Nonetheless I will test it on available data and reply.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.