BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NewUsrStat
Pyrite | Level 9

Hi guys, 

suppose to have the following dataset DB. Is there a way to get DB1? In other words where there are repeated (identical) dates the Index should be 1. NB: all IDs have only one record (row) where Index = 1. 

 

Thank you in advance

data DB;
  input ID :$20. Admission :date09. Discharge :date09. Index;
  format Admission date9. Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  0 
0001  13JAN2017 25JAN2017  1
0001  22FEB2018 03MAR2018  0   
0001  30JAN2019 04MAR2019  0  
0002  01DEC2018 14DEC2018  1 
0002  01DEC2018 14DEC2018  0 
0003  09JAN2016 25JAN2016  1 
0003  29JAN2018 12FEB2018  0; 
run;


data DB1;
  input ID :$20. Admission :date09. Discharge :date09. Index;
  format Admission date9. Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  1 
0001  13JAN2017 25JAN2017  1
0001  22FEB2018 03MAR2018  0   
0001  30JAN2019 04MAR2019  0  
0002  01DEC2018 14DEC2018  1 
0002  01DEC2018 14DEC2018  1 
0003  09JAN2016 25JAN2016  1 
0003  29JAN2018 12FEB2018  0; 
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@NewUsrStat wrote:
So if the dates are duplicated and at least for one record Index =1 then also index should be duplicated to be 1 otherwise nothing to do.

So if any of the observations in the group have a 1 then value should be one?

That is essentially a MAX() operation across all of the observations in the group.

proc sql ;
create table want as 
  select DB.ID
       , DB.Admission
       , DB.Discharge
       , DB.Index
       , max(DB.Index) as New_Index
  from DB
  group by DB.ID
         , DB.Admission
;
quit;

Result

Tom_0-1719427378124.png

 

Note: You might need to make new test data if you want to actually test this new logic.  For example a group of observations with the same ID and admission date where none of them currently have INDEX=1.

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

We have already shown you solutions on two very similar data sets in these threads: (thread 1 thread 2)

Can you show us what you have tried for this problem?

--
Paige Miller
Tom
Super User Tom
Super User

Are you saying you want to new variable to be TRUE when there are multiple observations for the same ID and admission date?

data DB;
  input ID :$20. Admission :date09. Discharge :date09. Index;
  format Admission Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  0 
0001  13JAN2017 25JAN2017  1
0001  22FEB2018 03MAR2018  0   
0001  30JAN2019 04MAR2019  0  
0002  01DEC2018 14DEC2018  1 
0002  01DEC2018 14DEC2018  0 
0003  09JAN2016 25JAN2016  1 
0003  29JAN2018 12FEB2018  0
; 

data want;
  set db ;
  by id admission ;
  new_flag = not (first.admission and last.admission);
run;

Result

Obs     ID     Admission    Discharge    Index    new_flag

 1     0001    13JAN2017    25JAN2017      0          1
 2     0001    13JAN2017    25JAN2017      1          1
 3     0001    22FEB2018    03MAR2018      0          0
 4     0001    30JAN2019    04MAR2019      0          0
 5     0002    01DEC2018    14DEC2018      1          1
 6     0002    01DEC2018    14DEC2018      0          1
 7     0003    09JAN2016    25JAN2016      1          0
 8     0003    29JAN2018    12FEB2018      0          0
NewUsrStat
Pyrite | Level 9
Hi Tom thank you for your help.
I tried to apply your code but there are cases where dates are duplicated and Index = 0 that with your code become flagged 1 while they will stay = 0.
Tom
Super User Tom
Super User

@NewUsrStat wrote:
Hi Tom thank you for your help.
I tried to apply your code but there are cases where dates are duplicated and Index = 0 that with your code become flagged 1 while they will stay = 0.

So you need to actually explain what the rule is for calculating this new variable.

NewUsrStat
Pyrite | Level 9
So if the dates are duplicated and at least for one record Index =1 then also index should be duplicated to be 1 otherwise nothing to do.
Tom
Super User Tom
Super User

@NewUsrStat wrote:
So if the dates are duplicated and at least for one record Index =1 then also index should be duplicated to be 1 otherwise nothing to do.

So if any of the observations in the group have a 1 then value should be one?

That is essentially a MAX() operation across all of the observations in the group.

proc sql ;
create table want as 
  select DB.ID
       , DB.Admission
       , DB.Discharge
       , DB.Index
       , max(DB.Index) as New_Index
  from DB
  group by DB.ID
         , DB.Admission
;
quit;

Result

Tom_0-1719427378124.png

 

Note: You might need to make new test data if you want to actually test this new logic.  For example a group of observations with the same ID and admission date where none of them currently have INDEX=1.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 405 views
  • 2 likes
  • 3 in conversation