I have a dataset that are inpatient hospitalizations which include a unique identifier as well as the patient’s discharge date. There are patients in this dataset that have been admitted more than once. How do I write a sas code that take only the earliest possible admission date of the patient and delete all subsequent entries with later discharge dates if that particular patient happens to have multiple admissions? This will ultimately lead to the modified dataset to have all unique entries per ID. I have included a sample dataset on excel. Thank you so much for your help!
See my previous post https://communities.sas.com/t5/Base-SAS-Programming/Filtering-out-entries-that-do-not-have-a-certain... about presenting example data.
sorry, attached is the sas dataset file
Post test data, in the form of a datastep using the code window ({i} above where you post). Use this post to help you:
%macro data2datastep(dsn,lib,file,obs); %local varlist msgtype; %if %superq(obs)= %then %let obs=MAX; %let msgtype=NOTE; %if %superq(dsn)= %then %do; %let msgtype=ERROR; %put &msgtype: You must specify a data set name; %put; %goto syntax; %end; %let dsn=%qupcase(%superq(dsn)); %if %superq(dsn)=!HELP %then %do; %syntax: %put &msgtype: &SYSMACRONAME macro help document:; %put &msgtype- Purpose: Converts a data set to a SAS DATA step.; %put &msgtype- Syntax: %nrstr(%%)&SYSMACRONAME(dsn<,lib,file,obs>); %put &msgtype- dsn: Name of the dataset to be converted. Required.; %put &msgtype- lib: LIBREF where the dataset resides. Optional.; %put &msgtype- file: Fully qulaified filename for the DATA step produced. Optional.; %put &msgtype- Default is %nrstr(create_&lib._&dsn._data.sas) in the SAS default directory.; %put &msgtype- obs: Max observations to include the created dataset. Optional.; %put &msgtype- Default is MAX (all observations); %put; %put NOTE: &SYSMACRONAME cannot be used in-line - it generates code.; %put NOTE- Use !HELP to print these notes.; %return; %end; %if %superq(lib)= %then %do; %let lib=%qscan(%superq(dsn),1,.); %if %superq(lib) = %superq(dsn) %then %let lib=WORK; %else %let dsn=%qscan(&dsn,2,.); %end; %let lib=%qupcase(%superq(lib)); %let dsn=%qupcase(%superq(dsn)); %if %sysfunc(exist(&lib..&dsn)) ne 1 %then %do; %put ERROR: (&SYSMACRONAME) - Dataset &lib..&dsn does not exist.; %let msgtype=NOTE; %GoTo syntax; %end; %if %superq(file)= %then %do; %let file=create_&lib._&dsn._data.sas; %if %symexist(USERDIR) %then %let file=&userdir/&file; %end; %if %symexist(USERDIR) %then %do; %if %qscan(%superq(file),-1,/\)=%superq(file) %then %let file=&userdir/&file; %end; proc sql noprint; select Name into :varlist separated by ' ' from dictionary.columns where libname="&lib" and memname="&dsn" ; select case type when 'num' then case when missing(format) then cats(Name,':32.') else cats(Name,':',format) end else cats(Name,':$',length,'.') end into :inputlist separated by ' ' from dictionary.columns where libname="&lib" and memname="&dsn" ; quit; data _null_; file "&file" dsd; if _n_ =1 then do; put "data &lib..&dsn;"; put @3 "infile datalines dsd truncover;"; put @3 "input %superq(inputlist);"; put "datalines4;"; end; set &lib..&dsn(obs=&obs) end=last; put &varlist @; if last then do; put; put ';;;;'; end; else put; run; %mend;
Hi,
I'm having trouble trying to find where exactly are the lines where I set up my dataset into this macro. Thanks for your help!
You simply call the macro with
%data2datastep(dataset,library,filename,nobs);
as described loud and clear in https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
Quote:
The macro call is:
%data2datastep(dsn, lib, file, obs);dsn = input data set name, required
lib = library where input data set resides, optional
file = name of output file, optional
obs = number of obs to include in the data step, optional
Sorry guys, finally got it to work. Here it is:
data MYDATA.SAMPLEEARLIESTDATE; infile datalines dsd truncover; input Date:MMDDYY10. ID:BEST. Screened:$3.; datalines4; 11/01/2017,14522778554,Yes 12/01/2017,14522778556,No 01/01/2018,14522778556,Yes 02/01/2018,14522778557,No 03/01/2018,14522778558,Yes 04/01/2018,14522778559,No 05/01/2018,14522778560,Yes 06/01/2018,14522778560,No 07/01/2018,14522778562,Yes 08/01/2018,14522778563,No 09/01/2018,14522778564,Yes 10/01/2018,14522778566,No 11/01/2018,14522778566,Yes ;;;;
@byeh2017 wrote:
Sorry guys, finally got it to work. Here it is:
data MYDATA.SAMPLEEARLIESTDATE; infile datalines dsd truncover; input Date:MMDDYY10. ID:BEST. Screened:$3.; datalines4; 11/01/2017,14522778554,Yes 12/01/2017,14522778556,No 01/01/2018,14522778556,Yes 02/01/2018,14522778557,No 03/01/2018,14522778558,Yes 04/01/2018,14522778559,No 05/01/2018,14522778560,Yes 06/01/2018,14522778560,No 07/01/2018,14522778562,Yes 08/01/2018,14522778563,No 09/01/2018,14522778564,Yes 10/01/2018,14522778566,No 11/01/2018,14522778566,Yes ;;;;
Nice, as it should be.
Use @rogerjdeangelis solution.
Is your data sorted already? This makes a big difference if the data set is large:
How do I write a sas code that take only the earliest possible admission date of
the patient and delete all subsequent entries with later discharge dates if that
particular patient happens to have multiple admissions?
HAVE
====
Obs DATE ID SCREENED
1 11/01/2017 14522778554 Yes Keep
2 12/01/2017 14522778556 No Keep
3 12/02/2017 14522778556 No
4 12/03/2017 14522778556 No
5 01/01/2018 4522778556 Yes keep
6 01/02/2018 4522778556 Yes
7 02/01/2018 4522778557 No keep
8 02/02/2018 4522778557 No
9 02/03/2018 4522778557 No
10 03/01/2018 4522778558 Yes keep
11 03/02/2018 4522778558 Yes
12 03/03/2018 4522778558 Yes
WANT
====
Obs ID DATE SCREENED
1 14522778554 11/01/2017 Yes
2 14522778556 12/01/2017 No
3 4522778556 01/01/2018 Yes
4 4522778557 02/01/2018 No
5 4522778558 03/01/2018 Yes
FULL SOLUTION
==============
* CREATE SOME DATA;
data have(drop=dte);
retain id date;
input Dte mmddyy10. ID Screened $;
do date=dte to dte+int(4*uniform(5731));
output;
end;
cards4;
11/1/2017 14522778554 Yes
12/1/2017 14522778556 No
1/1/2018 14522778556 Yes
2/1/2018 14522778557 No
3/1/2018 14522778558 Yes
4/1/2018 14522778559 No
5/1/2018 14522778560 Yes
6/1/2018 14522778560 No
7/1/2018 14522778562 Yes
8/1/2018 14522778563 No
9/1/2018 14522778564 Yes
10/1/2018 14522778566 No
11/1/2018 14522778566 Yes
;;;;
run;quit;
data want;
set have;
by id notsorted;
if first.id;
run;quit;
proc print data=want;
format date mmddyy10.;
run;quit;
Thank you. From my understanding, I'll need to sort it first by ID and then by date before I filter, right?
I'm also having a problem with using my existing dataset "mydata.sampleearliestdate" and setting a new dataset with "mydata.sampleearliestdateCLEAN" Not sure why it is not letting me create a new dataset to house to clean data
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 61 62 data mydata.sampleearliestdate; 63 set mydata.sampleearliestdateCLEAN; ERROR: File MYDATA.SAMPLEEARLIESTDATECLEAN.DATA does not exist. 64 by id notsorted; 65 if first.id; 66 run; NOTE: The SAS System stopped processing this step because of errors.
Actually nevermind about that question about the error. I realized I had use the set funciton incorrectly to reference a new dataset I wanted to create rather than an old one.
Thanks, I haven't sorted it yet. Would this be the right code to do this? Also, does this automatically save it onto the dataset or will I need to save it manually?
proc sort data=mydata.sampleearliestdate; by ID; by date; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.