Hi guys,
suppose to have the following dataset DB. Is there a way to get DB1? In other words where there are repeated (identical) dates the Index should be 1. NB: all IDs have only one record (row) where Index = 1.
Thank you in advance
data DB;
input ID :$20. Admission :date09. Discharge :date09. Index;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 0
0001 13JAN2017 25JAN2017 1
0001 22FEB2018 03MAR2018 0
0001 30JAN2019 04MAR2019 0
0002 01DEC2018 14DEC2018 1
0002 01DEC2018 14DEC2018 0
0003 09JAN2016 25JAN2016 1
0003 29JAN2018 12FEB2018 0;
run;
data DB1;
input ID :$20. Admission :date09. Discharge :date09. Index;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 1
0001 13JAN2017 25JAN2017 1
0001 22FEB2018 03MAR2018 0
0001 30JAN2019 04MAR2019 0
0002 01DEC2018 14DEC2018 1
0002 01DEC2018 14DEC2018 1
0003 09JAN2016 25JAN2016 1
0003 29JAN2018 12FEB2018 0;
run;
@NewUsrStat wrote:
So if the dates are duplicated and at least for one record Index =1 then also index should be duplicated to be 1 otherwise nothing to do.
So if any of the observations in the group have a 1 then value should be one?
That is essentially a MAX() operation across all of the observations in the group.
proc sql ;
create table want as
select DB.ID
, DB.Admission
, DB.Discharge
, DB.Index
, max(DB.Index) as New_Index
from DB
group by DB.ID
, DB.Admission
;
quit;
Result
Note: You might need to make new test data if you want to actually test this new logic. For example a group of observations with the same ID and admission date where none of them currently have INDEX=1.
Are you saying you want to new variable to be TRUE when there are multiple observations for the same ID and admission date?
data DB;
input ID :$20. Admission :date09. Discharge :date09. Index;
format Admission Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 0
0001 13JAN2017 25JAN2017 1
0001 22FEB2018 03MAR2018 0
0001 30JAN2019 04MAR2019 0
0002 01DEC2018 14DEC2018 1
0002 01DEC2018 14DEC2018 0
0003 09JAN2016 25JAN2016 1
0003 29JAN2018 12FEB2018 0
;
data want;
set db ;
by id admission ;
new_flag = not (first.admission and last.admission);
run;
Result
Obs ID Admission Discharge Index new_flag 1 0001 13JAN2017 25JAN2017 0 1 2 0001 13JAN2017 25JAN2017 1 1 3 0001 22FEB2018 03MAR2018 0 0 4 0001 30JAN2019 04MAR2019 0 0 5 0002 01DEC2018 14DEC2018 1 1 6 0002 01DEC2018 14DEC2018 0 1 7 0003 09JAN2016 25JAN2016 1 0 8 0003 29JAN2018 12FEB2018 0 0
@NewUsrStat wrote:
Hi Tom thank you for your help.
I tried to apply your code but there are cases where dates are duplicated and Index = 0 that with your code become flagged 1 while they will stay = 0.
So you need to actually explain what the rule is for calculating this new variable.
@NewUsrStat wrote:
So if the dates are duplicated and at least for one record Index =1 then also index should be duplicated to be 1 otherwise nothing to do.
So if any of the observations in the group have a 1 then value should be one?
That is essentially a MAX() operation across all of the observations in the group.
proc sql ;
create table want as
select DB.ID
, DB.Admission
, DB.Discharge
, DB.Index
, max(DB.Index) as New_Index
from DB
group by DB.ID
, DB.Admission
;
quit;
Result
Note: You might need to make new test data if you want to actually test this new logic. For example a group of observations with the same ID and admission date where none of them currently have INDEX=1.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.