Hi guys,
suppose to have the following dataset DB. Is there a way to get DB1? In other words where there are repeated (identical) dates the Index should be 1. NB: all IDs have only one record (row) where Index = 1.
Thank you in advance
data DB;
input ID :$20. Admission :date09. Discharge :date09. Index;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 0
0001 13JAN2017 25JAN2017 1
0001 22FEB2018 03MAR2018 0
0001 30JAN2019 04MAR2019 0
0002 01DEC2018 14DEC2018 1
0002 01DEC2018 14DEC2018 0
0003 09JAN2016 25JAN2016 1
0003 29JAN2018 12FEB2018 0;
run;
data DB1;
input ID :$20. Admission :date09. Discharge :date09. Index;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 1
0001 13JAN2017 25JAN2017 1
0001 22FEB2018 03MAR2018 0
0001 30JAN2019 04MAR2019 0
0002 01DEC2018 14DEC2018 1
0002 01DEC2018 14DEC2018 1
0003 09JAN2016 25JAN2016 1
0003 29JAN2018 12FEB2018 0;
run;
@NewUsrStat wrote:
So if the dates are duplicated and at least for one record Index =1 then also index should be duplicated to be 1 otherwise nothing to do.
So if any of the observations in the group have a 1 then value should be one?
That is essentially a MAX() operation across all of the observations in the group.
proc sql ;
create table want as
select DB.ID
, DB.Admission
, DB.Discharge
, DB.Index
, max(DB.Index) as New_Index
from DB
group by DB.ID
, DB.Admission
;
quit;
Result
Note: You might need to make new test data if you want to actually test this new logic. For example a group of observations with the same ID and admission date where none of them currently have INDEX=1.
Are you saying you want to new variable to be TRUE when there are multiple observations for the same ID and admission date?
data DB;
input ID :$20. Admission :date09. Discharge :date09. Index;
format Admission Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 0
0001 13JAN2017 25JAN2017 1
0001 22FEB2018 03MAR2018 0
0001 30JAN2019 04MAR2019 0
0002 01DEC2018 14DEC2018 1
0002 01DEC2018 14DEC2018 0
0003 09JAN2016 25JAN2016 1
0003 29JAN2018 12FEB2018 0
;
data want;
set db ;
by id admission ;
new_flag = not (first.admission and last.admission);
run;
Result
Obs ID Admission Discharge Index new_flag 1 0001 13JAN2017 25JAN2017 0 1 2 0001 13JAN2017 25JAN2017 1 1 3 0001 22FEB2018 03MAR2018 0 0 4 0001 30JAN2019 04MAR2019 0 0 5 0002 01DEC2018 14DEC2018 1 1 6 0002 01DEC2018 14DEC2018 0 1 7 0003 09JAN2016 25JAN2016 1 0 8 0003 29JAN2018 12FEB2018 0 0
@NewUsrStat wrote:
Hi Tom thank you for your help.
I tried to apply your code but there are cases where dates are duplicated and Index = 0 that with your code become flagged 1 while they will stay = 0.
So you need to actually explain what the rule is for calculating this new variable.
@NewUsrStat wrote:
So if the dates are duplicated and at least for one record Index =1 then also index should be duplicated to be 1 otherwise nothing to do.
So if any of the observations in the group have a 1 then value should be one?
That is essentially a MAX() operation across all of the observations in the group.
proc sql ;
create table want as
select DB.ID
, DB.Admission
, DB.Discharge
, DB.Index
, max(DB.Index) as New_Index
from DB
group by DB.ID
, DB.Admission
;
quit;
Result
Note: You might need to make new test data if you want to actually test this new logic. For example a group of observations with the same ID and admission date where none of them currently have INDEX=1.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.