Hi All;
Please help me in the program code for the below query:-
Write a SAS query to select records of patients who have at least two diagnoses of diabetes in the calendar year.
Input data sample:
Pat Id | Diagnosis | Date |
1001 | RA | 1-Jan-22 |
1001 | Diabetes | 12-Jan-22 |
1001 | Diabetes | 1-Mar-22 |
1002 | IBD | 4-Jan-22 |
1002 | Diabetes | 14-Jan-22 |
Can anyone please tell me the program code to get the output below like:-
Output data sample:
Pat Id | Diagnosis | Year |
1001 | Diabetes | 2022 |
@saitejaguduru97 , if you mean a Data Step instead of a Proc SQL Step, here is one.
data have;
input PatId $ Diagnosis $ Date :anydtdte.;
format Date date9.;
datalines;
1001 RA 1-Jan-22
1001 Diabetes 12-Jan-22
1001 Diabetes 1-Mar-22
1002 IBD 4-Jan-22
1002 Diabetes 14-Jan-22
;
data want(drop = c);
set have;
by PatId Date groupformat;
where Diagnosis = 'Diabetes';
if first.Date then c = 0;
c + 1;
if last.Date & c >= 2;
format date year.;
run;
Try this
data have;
input PatId $ Diagnosis $ Date :anydtdte.;
format Date date9.;
datalines;
1001 RA 1-Jan-22
1001 Diabetes 12-Jan-22
1001 Diabetes 1-Mar-22
1002 IBD 4-Jan-22
1002 Diabetes 14-Jan-22
;
proc sql;
create table want as
select distinct PatId
, Diagnosis
, year(Date) as Year
from have
where Diagnosis = 'Diabetes'
group by PatID
having sum(Diagnosis = 'Diabetes') >= 2
;
quit;
Hi Peter,
It may not matter much in practice, but I would change the HAVING clause from
having sum(Diagnosis = 'Diabetes') >= 2
to
having count(*) >= 2
as the diagnoses selected with the WHERE clause are already "Diabetes".
This will make the code a bit simpler, and if you change the diagnosis to check for, you only have to change the code in the WHERE clause (it may also run slightly faster, but I doubt that will matter much).
Thank you for the reply!!!
Is there any possibility to get it in the SAS programming instead of SQL?
@saitejaguduru97 , if you mean a Data Step instead of a Proc SQL Step, here is one.
data have;
input PatId $ Diagnosis $ Date :anydtdte.;
format Date date9.;
datalines;
1001 RA 1-Jan-22
1001 Diabetes 12-Jan-22
1001 Diabetes 1-Mar-22
1002 IBD 4-Jan-22
1002 Diabetes 14-Jan-22
;
data want(drop = c);
set have;
by PatId Date groupformat;
where Diagnosis = 'Diabetes';
if first.Date then c = 0;
c + 1;
if last.Date & c >= 2;
format date year.;
run;
@s_lassen , agree. I wrote the having stuff before realizing that only 1 obs with the 'Diabetes' Diagnosis should be output 🙂
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.