Hello everyone, I would like to use sql to get distinct observations for each patient that corresponds to their earliest fill date. I saw the attached code on the forum but it does not give the output as distinct observations. Thank you
/**what i have***/
patient ID fill_date
patient 1 02/23/2001
patient 1 02/25/2001
patient 1 03/14/2003
patient 2 01/01/2006
patient 2 10/14/2001
patient 2 08/25/2008
patient 3 07/14/2004
/**what i want**/
patient ID fill_date
patient 1 02/23/2001
patient 2 10/14/2001
patient 3 07/14/2004
proc sql;
create table want as
select *
from have
group by PATIENT_ID, FILL_DATE
having FILL_DATE = MIN (FILL_DATE) ;
quit;
If you just want the earliest date then tell SQL that fact.
create table want as
select PATIENT_ID
, min(FILL_DATE) as INDEX_DATE format=yymmdd10.
from have
group by PATIENT_ID
;
Try
proc sql; create table want as select * from have group by PATIENT_ID having FILL_DATE = MIN (FILL_DATE) ; quit;
When you have Group by Patient_id, Fill_date you have told SQL that combinations of Patient and date are what the Min is calculated from. In the example that means the Min of each date, not each patient
Thank you, it gives me more observation than i started with (750,758), and a caution in my log as shown
proc sql; 56 create table XX as 57 select * 58 from YY 59 group by PATIENT_ID 60 having FILL_DATE = MIN (FILL_DATE) ; NOTE: The query requires remerging summary statistics back with the original data. NOTE: Table WORK.XX created, with 1963821 rows and 7 columns. 61 quit; NOTE: PROCEDURE SQL used (Total process time): real time 5.70 seconds cpu time 5.33 seconds
I doubt it (unless perhaps YY is actually a reference to a dataset (table) in some external database that does strange things).
For example try this query whose form matches yours.
685 proc sql; 686 create table class as select * from sashelp.class; NOTE: Table WORK.CLASS created, with 19 rows and 5 columns. 687 create table youngest as select * from class group by sex having age=min(age); NOTE: The query requires remerging summary statistics back with the original data. NOTE: Table WORK.YOUNGEST created, with 2 rows and 5 columns.
Now you might get more observations than there are patients if there are multiple observations for a patient that tie for the earliest date.
If you just want the earliest date then tell SQL that fact.
create table want as
select PATIENT_ID
, min(FILL_DATE) as INDEX_DATE format=yymmdd10.
from have
group by PATIENT_ID
;
Your dates are dates, not strings looking like dates?
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.