Hi Community,
I have a requirement to summarize a table with dates. Any help is appreciated.
Raw data:
SubjectName | QueryID | ParentQueryID | OpenDate |
0001-S002 | 818930 | 818929 | 21-Nov-22 |
0001-S002 | 818929 | 782251 | 21-Nov-22 |
0001-S002 | 782251 | 24-Oct-22 | |
01659-S031 | 962154 | 7-May-24 | |
01659-S031 | 962200 | 962154 | 8-May-24 |
01659-S031 | 962462 | 962200 | 17-May-24 |
Each Query has a unique ID and any Re-query has a new Query ID with the parent Query ID in a second column. I am trying to summarize the information based on the master Query ID and Query Start and Query Stop dates.
Expected output:
SubjectName | Master QueryID | Query Start | QueryEnd |
0001-S002 | 782251 | 24-Oct-22 | 21-Nov-22 |
0001-S031 | 962154 | 7-May-24 | 17-May-24 |
Please advise with any approach/logic.
Any help is appreaciated. Thank you for your time.
What you gonna do ? If there were not link in ID . Like:
0001-S002 818930 818929 21-Nov-22 0001-S002 818929 782251 21-Nov-22 0001-S002 782251 . 24-Oct-22 01659-S031 962154 . 7-May-24 01659-S031 962200 962154 8-May-24 01659-S031 962462 962200 17-May-24 01659-S031 222222 333333 27-May-24
Assuming there are always links in your data.
data work.have;
input SubjectName :$10. QueryID ParentQueryID OpenDate :date11.;
format opendate date11.;
datalines;
0001-S002 818930 818929 21-Nov-22
0001-S002 818929 782251 21-Nov-22
0001-S002 782251 . 24-Oct-22
01659-S031 962154 . 7-May-24
01659-S031 962200 962154 8-May-24
01659-S031 962462 962200 17-May-24
;
proc sql;
create table want as
select SubjectName,
(select QueryID from have where SubjectName=a.SubjectName and ParentQueryID is missing) as Master_QueryID,
min(OpenDate) as Query_Start format=date11.,max(OpenDate) as Query_End format=date11.
from have as a
group by SubjectName;
quit;
If your actual data is as simple as the posted sample, then why not just set querystart = min(OpenDate) and queryend = max(OpenDate) ? No need to bother with some fancy logic traversing query ID's and Parent query ID's.
If you have any SubjectName that have more than one "master" query you should include an example.
Otherwise perhaps:
data work.have; input SubjectName :$10. QueryID ParentQueryID OpenDate :date11.; format opendate date11.; datalines; 0001-S002 818930 818929 21-Nov-22 0001-S002 818929 782251 21-Nov-22 0001-S002 782251 . 24-Oct-22 01659-S031 962154 . 7-May-24 01659-S031 962200 962154 8-May-24 01659-S031 962462 962200 17-May-24 ; proc summary data=work.have idmin nway; class subjectname; id queryid; var opendate ; output out=work.want (drop= _:) min=querystart max=queryend; run;
What you gonna do ? If there were not link in ID . Like:
0001-S002 818930 818929 21-Nov-22 0001-S002 818929 782251 21-Nov-22 0001-S002 782251 . 24-Oct-22 01659-S031 962154 . 7-May-24 01659-S031 962200 962154 8-May-24 01659-S031 962462 962200 17-May-24 01659-S031 222222 333333 27-May-24
Assuming there are always links in your data.
data work.have;
input SubjectName :$10. QueryID ParentQueryID OpenDate :date11.;
format opendate date11.;
datalines;
0001-S002 818930 818929 21-Nov-22
0001-S002 818929 782251 21-Nov-22
0001-S002 782251 . 24-Oct-22
01659-S031 962154 . 7-May-24
01659-S031 962200 962154 8-May-24
01659-S031 962462 962200 17-May-24
;
proc sql;
create table want as
select SubjectName,
(select QueryID from have where SubjectName=a.SubjectName and ParentQueryID is missing) as Master_QueryID,
min(OpenDate) as Query_Start format=date11.,max(OpenDate) as Query_End format=date11.
from have as a
group by SubjectName;
quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.