I have the following code; however, the sequence numbers are not being assigned correctly. I want to create a new variable, sequence number field, and start at 1 when there is a duplicate and then count these. proc sql; create table work.readmits as select t1.encounter_id, t1.medrec_no, t1.first_nm, t1.last_nm, t1.admit_dt, t1.start_dt, t1.end_dt, t1.gender, t1.race, t1.age from inpatient; quit; proc sort data=work.readmits out=rdm; by medrec_no; run; My attempt to create a new variable by using the format statement to add a sequence number field and start at 1 when there is a duplicate which is not working properly. DATA rdm2; SET rdm; format seqno best15.; BY medrec_no; IF FIRST.medrec_no then seqno=1; else seqno=0; RUN;
Sorry, I should have noticed that. Take this line:
if last.medrec_no=0 then seqno=1;
Add another line just after that:
else seqno=0;
That should assign 0 to all those that are unique.
Please format your code so it's readable, not in a paragraph format.
proc sql;
create table work.readmissions as
select
t1.encounter_id,
t1.medrec_no,
t1.first_nm,
t1.last_nm,
t1.admit_dt,
t1.start_dt,
t1.end_dt,
t1.gender,
t1.race,
t1.age
from admclid.inpatient_2012 t1;
quit;
proc sort data=work.readmissions out=rdm;
by medrec_no;
run;
My attempt to create a new variable by using the format statement to add a sequence number field and start at 1 when there is a duplicate which is not working properly.
DATA rdm;
SET rdm;
Format seqno best15.;
BY medrec_no;
IF FIRST.medrec_no then seqno=1; else seqno=0;
RUN;
What does "a duplicate" mean? Would two records for the same MEDREC_NO be a duplicate even if the names and dates are different?
What should SEQNO be if there is only one record for a particular MEDREC_NO?
If there are 5 records for a particular MEDREC_NO, should SEQNO be 1 for the first of the five?
It is possible that your last step is all that needs to be changed:
data rdm2;
set rdm;
by medrec_no;
if first.medrec_no then seqno=1;
else seqno+1;
run;
But the program may change depending on how you answer the questions.
Good luck.
Good Morning, Thanks for your response and to answer your questions:
1. An example of what I mean by duplicate is, the patient is the same but has 5 different encounters on different dates. I want the first encounter with a sequence # of 1 and last with a sequence # of 5.
Encounter_No Medrec_No Start_dt End_dt Seq_no
111 123 23DEC2014 28DEC2014 1
122 123 29FEB2014 02MAR2014 2
133 123 23MAR2014 30MAR2014 3
144 123 20SEP2014 25SEP2014 4
155 123 07APR2014 12APR2014 5
2. Using the example above, two records for the same Medrec_no will not be duplicate because the encounter_no is different. If the encounter_no was the same then they would be considered a duplicate.
3. I am really only interested in the records that have patients with multiple encounters. If the patient has only one encounter they would not be considered a readmission.
A sample Input dataset and wanted output dataset please? Would make it easy for people to help you
1. An example of what I mean by duplicate is, the patient is the same but has 5 different encounters on different dates. I want the first encounter with a sequence # of 1 and last with a sequence # of 5.
Encounter_No Medrec_No Start_dt End_dt Seq_no
111 123 23DEC2014 28DEC2014 1
122 123 29FEB2014 02MAR2014 2
133 123 23MAR2014 30MAR2014 3
144 123 20SEP2014 25SEP2014 4
155 123 07APR2014 12APR2014 5
2. Using the example above, two records for the same Medrec_no will not be duplicate because the encounter_no is different. If the encounter_no was the same then they would be considered a duplicate.
3. I am really only interested in the records that have patients with multiple encounters. If the patient has only one encounter they would not be considered a readmission.
Based on your answers, here's how I would program it. Once you have extracted the data:
proc sort data=rdm;
by medrec_no encounter_no;
run;
data rdm2;
set rdm;
by medrec_no encounter_no;
if first.encounter_no=0 or last.encounter_no=0 then duplicate_encounter_data_flag='Y';
if first.medrec_no then do;
if last.medrec_no=0 then seqno=1;
end;
else if first.encounter_no then seqno + 1;
run;
See how that matches what you are hoping to do.
Good luck.
The first part of the code appears to work when checking for duplicates by encounter number and flagging with a Y. The dataset did not have any duplicate encounters so this column was blank for each observation.
The second part of the code using the medrec_no to set the seqno to 1 did not work. All observations have a 0 for seqno even the patient with multiple encounters.
That should be easy to diagnose and fix. The necessary information that you have to post is the SAS log from that final DATA step, including any notes or warnings.
I did not receive any errors or warnings, however, the SAS log is below:
The SAS System 07:45 Wednesday, March 4, 2015
1 ;*';*";*/;quit;run;
2 OPTIONS PAGENO=MIN;
3 %LET _CLIENTTASKLABEL='Program';
4 %LET _CLIENTPROJECTPATH='\\DHH-ISB-FS01\User Folders\kparker\My Documents\SAS Data
4 ! Sets\Readmissions.egp';
5 %LET _CLIENTPROJECTNAME='Readmissions.egp';
6 %LET _SASPROGRAMFILE=;
7
8 ODS _ALL_ CLOSE;
9 OPTIONS DEV=ACTIVEX;
10 GOPTIONS XPIXELS=0 YPIXELS=0;
11 ODS LISTING GPATH=&sasworklocation;
12 FILENAME EGHTML TEMP;
13 ODS HTML(ID=EGHTML) FILE=EGHTML ENCODING='utf-8' STYLE=HtmlBlue
13 ! STYLESHEET=(URL="file:///C:/Program%20Files/SASHome/SASEnterpriseGuide/5.1/Styles/HtmlBlue.
13 ! css")
13 ! ATTRIBUTES=("CODEBASE"="http://www2.sas.com/codebase/graph/v93/sasgraph.exe#version=9,3")
13 ! NOGTITLE NOGFOOTNOTE GPATH=&sasworklocation;
NOTE: Writing HTML(EGHTML) Body file: EGHTML
14 FILENAME EGRTF TEMP;
15 ODS RTF(ID=EGRTF) FILE=EGRTF ENCODING='utf-8' STYLE=Rtf NOGTITLE NOGFOOTNOTE;
NOTE: Writing RTF Body file: EGRTF
16
17 GOPTIONS ACCESSIBLE;
18 DATA rdm2; *not working;
19 SET work.rdm;
20 by encounter_id medrec_no;
21 if first.encounter_id=0 or last.encounter_id=0
22 then duplicate_encounter_data_flag="Y";
23 if first.medrec_no then do;
24 if last.medrec_no=0 then seqno=1;
25 end;
26 else if first.encounter_id then seqno + 1;
27 run;
NOTE: There were 557758 observations read from the data set WORK.RDM.
NOTE: The data set WORK.RDM2 has 557758 observations and 13 variables.
NOTE: DATA statement used (Total process time):
real time 1.15 seconds
cpu time 1.15 seconds
28
29 GOPTIONS NOACCESSIBLE;
30 %LET _CLIENTTASKLABEL=;
31 %LET _CLIENTPROJECTPATH=;
32 %LET _CLIENTPROJECTNAME=;
33 %LET _SASPROGRAMFILE=;
34
35 ;*';*";*/;quit;run;
36 ODS _ALL_ CLOSE;
37
38
39 QUIT; RUN;
2 The SAS System 07:45 Wednesday, March 4, 2015
40
It looks like your BY statement needs to be fixed. It reads:
by encounter_id medrec_no;
It should read:
by medrec_no encounter_id;
That change should be made in two places, both PROC SORT and the DATA step. Let's see if that clears it all up.
That fixed it! But how can I make the other observations that are not in a sequence set to 0 or some other flag? They are being populated with various numbers (i.e. 2, 4, 5, etc.)
data testdata;
input e_no m_no st_dt date9. end_dt :date9.;
format st_dt end_dt date9.;
output;
if _n_=1 then output;
datalines;
111 123 23DEC2014 28DEC2014
122 123 28FEB2014 02MAR2014
133 123 23MAR2014 30MAR2014
144 123 20SEP2014 25SEP2014
;
data want_data;
set testdata;
by e_no m_no notsorted;
if e_no ne lag(e_no) then seq+1;
if first.e_no and last.e_no then considered="Noadmission";
else considered="readmission";
run;
Sorry, I should have noticed that. Take this line:
if last.medrec_no=0 then seqno=1;
Add another line just after that:
else seqno=0;
That should assign 0 to all those that are unique.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.