BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kparker
Quartz | Level 8

I have the following code; however, the sequence numbers are not being assigned correctly.  I want to create a new variable, sequence number field, and start at 1 when there is a duplicate and then count these. proc sql; create table work.readmits as select t1.encounter_id, t1.medrec_no, t1.first_nm, t1.last_nm, t1.admit_dt, t1.start_dt, t1.end_dt, t1.gender, t1.race, t1.age from inpatient; quit; proc sort data=work.readmits out=rdm; by medrec_no; run; My attempt to create a new variable by using the format statement to add a sequence number field and start at 1 when there is a duplicate which is not working properly. DATA rdm2; SET rdm; format seqno best15.; BY medrec_no; IF FIRST.medrec_no then seqno=1; else seqno=0; RUN;

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Sorry, I should have noticed that.  Take this line:

if last.medrec_no=0 then seqno=1;

Add another line just after that:

else seqno=0;

That should assign 0 to all those that are unique.

View solution in original post

15 REPLIES 15
Reeza
Super User

Please format your code so it's readable, not in a paragraph format.

kparker
Quartz | Level 8

proc sql;

create table work.readmissions as

select

      t1.encounter_id,

     t1.medrec_no,

     t1.first_nm,

     t1.last_nm,

     t1.admit_dt,

     t1.start_dt,

     t1.end_dt,

     t1.gender,

     t1.race,

     t1.age

from admclid.inpatient_2012 t1;

quit;

proc sort data=work.readmissions out=rdm;

by medrec_no;

run;

My attempt to create a new variable by using the format statement to add a sequence number field and start at 1 when there is a duplicate which is not working properly.

DATA rdm;

SET rdm;

Format seqno best15.;

BY medrec_no;

IF FIRST.medrec_no then seqno=1; else seqno=0;

RUN;

Astounding
PROC Star

What does "a duplicate" mean?  Would two records for the same MEDREC_NO be a duplicate even if the names and dates are different?

What should SEQNO be if there is only one record for a particular MEDREC_NO?

If there are 5 records for a particular MEDREC_NO, should SEQNO be 1 for the first of the five?

It is possible that your last step is all that needs to be changed:

data rdm2;

   set rdm;

   by medrec_no;

   if first.medrec_no then seqno=1;

   else seqno+1;

run;

But the program may change depending on how you answer the questions.

Good luck.

kparker
Quartz | Level 8

Good Morning, Thanks for your response and to answer your questions:

1.  An example of what I mean by duplicate is, the patient is the same but has 5 different encounters on different dates.  I want the first encounter with a sequence # of 1 and last with a sequence # of 5.

Encounter_No          Medrec_No      Start_dt          End_dt                  Seq_no

   111                               123            23DEC2014   28DEC2014          1

   122                               123            29FEB2014   02MAR2014          2

   133                               123            23MAR2014  30MAR2014          3

   144                               123            20SEP2014   25SEP2014          4

   155                               123            07APR2014   12APR2014          5

2.  Using the example above, two records for the same Medrec_no will not be duplicate because the encounter_no is different.  If the encounter_no was the same then they would be considered a duplicate.

3. I am really only interested in the records that have patients with multiple encounters. If the patient has only one encounter they would not be considered a readmission.

MarkWik
Quartz | Level 8

A sample Input dataset and wanted output dataset please? Would make it easy for people to help you

kparker
Quartz | Level 8

1.  An example of what I mean by duplicate is, the patient is the same but has 5 different encounters on different dates.  I want the first encounter with a sequence # of 1 and last with a sequence # of 5.

Encounter_No          Medrec_No      Start_dt          End_dt                  Seq_no

   111                               123            23DEC2014   28DEC2014          1

   122                               123            29FEB2014   02MAR2014          2

   133                               123            23MAR2014  30MAR2014          3

   144                               123            20SEP2014   25SEP2014          4

   155                               123            07APR2014   12APR2014          5

2.  Using the example above, two records for the same Medrec_no will not be duplicate because the encounter_no is different.  If the encounter_no was the same then they would be considered a duplicate.

3. I am really only interested in the records that have patients with multiple encounters. If the patient has only one encounter they would not be considered a readmission.

Astounding
PROC Star

Based on  your answers, here's how I would program it.  Once you have extracted the data:

proc sort data=rdm;

   by medrec_no encounter_no;

run;

data rdm2;

   set rdm;

   by medrec_no encounter_no;

   if first.encounter_no=0 or last.encounter_no=0 then duplicate_encounter_data_flag='Y';

   if  first.medrec_no then do;

      if last.medrec_no=0 then seqno=1;

   end;

   else if first.encounter_no then seqno + 1;

run;

See how that matches what you are hoping to do.

Good luck.

kparker
Quartz | Level 8

The first part of the code appears to work when checking for duplicates by encounter number and flagging with a Y. The dataset did not have any duplicate encounters so this column was blank for each observation.

The second part of the code using the medrec_no to set the seqno to 1 did not work.  All observations have a 0 for seqno even the patient with multiple encounters.

Astounding
PROC Star

That should be easy to diagnose and fix.  The necessary information that you have to post is the SAS log from that final DATA step, including any notes or warnings.

kparker
Quartz | Level 8

I did not receive any errors or warnings, however, the SAS log is below:

                                      The SAS System              07:45 Wednesday, March 4, 2015

1          ;*';*";*/;quit;run;
2          OPTIONS PAGENO=MIN;
3          %LET _CLIENTTASKLABEL='Program';
4          %LET _CLIENTPROJECTPATH='\\DHH-ISB-FS01\User Folders\kparker\My Documents\SAS Data
4        ! Sets\Readmissions.egp';
5          %LET _CLIENTPROJECTNAME='Readmissions.egp';
6          %LET _SASPROGRAMFILE=;
7         
8          ODS _ALL_ CLOSE;
9          OPTIONS DEV=ACTIVEX;
10         GOPTIONS XPIXELS=0 YPIXELS=0;
11         ODS LISTING GPATH=&sasworklocation;
12         FILENAME EGHTML TEMP;
13         ODS HTML(ID=EGHTML) FILE=EGHTML ENCODING='utf-8' STYLE=HtmlBlue
13       ! STYLESHEET=(URL="file:///C:/Program%20Files/SASHome/SASEnterpriseGuide/5.1/Styles/HtmlBlue.
13       ! css")
13       ! ATTRIBUTES=("CODEBASE"="http://www2.sas.com/codebase/graph/v93/sasgraph.exe#version=9,3")
13       ! NOGTITLE NOGFOOTNOTE GPATH=&sasworklocation;
NOTE: Writing HTML(EGHTML) Body file: EGHTML
14         FILENAME EGRTF TEMP;
15         ODS RTF(ID=EGRTF) FILE=EGRTF ENCODING='utf-8' STYLE=Rtf NOGTITLE NOGFOOTNOTE;
NOTE: Writing RTF Body file: EGRTF
16        
17         GOPTIONS ACCESSIBLE;
18         DATA rdm2; *not working;
19         SET work.rdm;
20         by encounter_id medrec_no;
21         if first.encounter_id=0 or last.encounter_id=0
22         then duplicate_encounter_data_flag="Y";
23         if first.medrec_no then do;
24         if last.medrec_no=0 then seqno=1;
25         end;
26         else if first.encounter_id then seqno + 1;
27         run;

NOTE: There were 557758 observations read from the data set WORK.RDM.
NOTE: The data set WORK.RDM2 has 557758 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time           1.15 seconds
      cpu time            1.15 seconds
     

28        
29         GOPTIONS NOACCESSIBLE;
30         %LET _CLIENTTASKLABEL=;
31         %LET _CLIENTPROJECTPATH=;
32         %LET _CLIENTPROJECTNAME=;
33         %LET _SASPROGRAMFILE=;
34        
35         ;*';*";*/;quit;run;
36         ODS _ALL_ CLOSE;
37        
38        
39         QUIT; RUN;
2                                           The SAS System              07:45 Wednesday, March 4, 2015

40        

Astounding
PROC Star

It looks like your BY statement needs to be fixed.  It reads:

by encounter_id medrec_no;

It should read:

by medrec_no encounter_id;

That change should be made in two places, both PROC SORT and the DATA step.  Let's see if that clears it all up.

kparker
Quartz | Level 8

That fixed it!  But how can I make the other observations that are not in a sequence set to 0 or some other flag?  They are being populated with various numbers (i.e. 2, 4, 5, etc.)

naveen_srini
Quartz | Level 8

data testdata;

input e_no m_no st_dt date9. end_dt :date9.;

format st_dt end_dt date9.;

output;

if _n_=1 then output;

datalines;

111 123 23DEC2014 28DEC2014

122 123 28FEB2014 02MAR2014

133 123 23MAR2014 30MAR2014

144 123 20SEP2014 25SEP2014

;

data want_data;

set testdata;

by e_no m_no notsorted;

if e_no ne lag(e_no) then seq+1;

if first.e_no and last.e_no then considered="Noadmission";

else  considered="readmission";

run;

Astounding
PROC Star

Sorry, I should have noticed that.  Take this line:

if last.medrec_no=0 then seqno=1;

Add another line just after that:

else seqno=0;

That should assign 0 to all those that are unique.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 7712 views
  • 0 likes
  • 5 in conversation