Hi All;
Please help me in the program code for the below query:-
Write a SAS query to select records of patients who have at least two diagnoses of diabetes in the calendar year.
Input data sample:
Pat Id | Diagnosis | Date |
1001 | RA | 1-Jan-22 |
1001 | Diabetes | 12-Jan-22 |
1001 | Diabetes | 1-Mar-22 |
1002 | IBD | 4-Jan-22 |
1002 | Diabetes | 14-Jan-22 |
Can anyone please tell me the program code to get the output below like:-
Output data sample:
Pat Id | Diagnosis | Year |
1001 | Diabetes | 2022 |
@saitejaguduru97 , if you mean a Data Step instead of a Proc SQL Step, here is one.
data have;
input PatId $ Diagnosis $ Date :anydtdte.;
format Date date9.;
datalines;
1001 RA 1-Jan-22
1001 Diabetes 12-Jan-22
1001 Diabetes 1-Mar-22
1002 IBD 4-Jan-22
1002 Diabetes 14-Jan-22
;
data want(drop = c);
set have;
by PatId Date groupformat;
where Diagnosis = 'Diabetes';
if first.Date then c = 0;
c + 1;
if last.Date & c >= 2;
format date year.;
run;
Try this
data have;
input PatId $ Diagnosis $ Date :anydtdte.;
format Date date9.;
datalines;
1001 RA 1-Jan-22
1001 Diabetes 12-Jan-22
1001 Diabetes 1-Mar-22
1002 IBD 4-Jan-22
1002 Diabetes 14-Jan-22
;
proc sql;
create table want as
select distinct PatId
, Diagnosis
, year(Date) as Year
from have
where Diagnosis = 'Diabetes'
group by PatID
having sum(Diagnosis = 'Diabetes') >= 2
;
quit;
Hi Peter,
It may not matter much in practice, but I would change the HAVING clause from
having sum(Diagnosis = 'Diabetes') >= 2
to
having count(*) >= 2
as the diagnoses selected with the WHERE clause are already "Diabetes".
This will make the code a bit simpler, and if you change the diagnosis to check for, you only have to change the code in the WHERE clause (it may also run slightly faster, but I doubt that will matter much).
Thank you for the reply!!!
Is there any possibility to get it in the SAS programming instead of SQL?
@saitejaguduru97 , if you mean a Data Step instead of a Proc SQL Step, here is one.
data have;
input PatId $ Diagnosis $ Date :anydtdte.;
format Date date9.;
datalines;
1001 RA 1-Jan-22
1001 Diabetes 12-Jan-22
1001 Diabetes 1-Mar-22
1002 IBD 4-Jan-22
1002 Diabetes 14-Jan-22
;
data want(drop = c);
set have;
by PatId Date groupformat;
where Diagnosis = 'Diabetes';
if first.Date then c = 0;
c + 1;
if last.Date & c >= 2;
format date year.;
run;
@s_lassen , agree. I wrote the having stuff before realizing that only 1 obs with the 'Diabetes' Diagnosis should be output 🙂
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.