Hi guys,
suppose to have the following dataset:
data DB;
input ID :$20. Admission :date09. Discharge :date09. Sex;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 M
0001 22FEB2018 03MAR2018
0001 22FEB2018 03MAR2018
0002 01DEC2016 14DEC2016 F
0002 01DEC2016 14DEC2016
0002 25DEC2017 02JAN2018
0002 06JAN2018 09JAN2018
;run;
is there a way to make consistent the variable sex, in other words to get the following dataset?
data DB1;
input ID :$20. Admission :date09. Discharge :date09. Sex;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 M
0001 22FEB2018 03MAR2018 M
0001 22FEB2018 03MAR2018 M
0002 01DEC2016 14DEC2016 F
0002 01DEC2016 14DEC2016 F
0002 25DEC2017 02JAN2018 F
0002 06JAN2018 09JAN2018 F
;run;
Note that replicated dates are there for a reason.
Thank you in advance!
Hi @NewUsrStat, assuming Sex is always non-missing for the first instance of ID, you'll want to create a new variable (New_Sex) that retains the Sex value for all observations for an ID.
data db1;
retain New_Sex;
set db;
by id;
if first.id then New_Sex=Sex;
run;
Hi @NewUsrStat, assuming Sex is always non-missing for the first instance of ID, you'll want to create a new variable (New_Sex) that retains the Sex value for all observations for an ID.
data db1;
retain New_Sex;
set db;
by id;
if first.id then New_Sex=Sex;
run;
Your first data set does not create a usable data set as 1) Sex is read as numeric, hence all missing and 2) read past the end of the line because of the missing values in the SEX column.
The $ after Sex on the input statement fixes the first problem, TRUNCOVER on the INFILE the other by stopping reading when running out of values.
data work.DB; infile datalines truncover; input ID :$20. Admission :date09. Discharge :date09. Sex $; format Admission date9. Discharge date9.; cards; 0001 13JAN2017 25JAN2017 M 0001 22FEB2018 03MAR2018 0001 22FEB2018 03MAR2018 0002 01DEC2016 14DEC2016 F 0002 01DEC2016 14DEC2016 0002 25DEC2017 02JAN2018 0002 06JAN2018 09JAN2018 ;run;
One way:
Proc sql; create table work.db1 as select a.id, a.admission, a.discharge ,b.sex from work.db as a left join (select distinct id,sex from work.db where not missing(sex) ) as b on a.id=b.id ; quit;
If you data is not in sorted order on input this might change that.
@NewUsrStat, I should also mention that running your DATA step to create DB didn't work. By default, when you reach the end of the line and there are missing values for the remainder of the variables, SAS will continue reading from the next line. Hence the message below: "SAS went to a new line when INPUT statement reached past the end of the line".
You'll want to add in the option TRUNCOVER like so:
data DB;
infile cards truncover;
input ID :$20. Admission :date09. Discharge :date09. Sex $;
format Admission date9. Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 M
0001 22FEB2018 03MAR2018
0001 22FEB2018 03MAR2018
0002 01DEC2016 14DEC2016 F
0002 01DEC2016 14DEC2016
0002 25DEC2017 02JAN2018
0002 06JAN2018 09JAN2018
;
run;
Not sure what you want to do when there are two different values.
But a simple merge of dataset with itself should work fine.
data DB;
input ID :$20. Admission :date. Discharge :date. Sex $;
format Admission Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 M
0001 22FEB2018 03MAR2018 .
0001 22FEB2018 03MAR2018 .
0002 01DEC2016 14DEC2016 F
0002 01DEC2016 14DEC2016 .
0002 25DEC2017 02JAN2018 .
0002 06JAN2018 09JAN2018 .
;
data want ;
merge db(where=(not missing(sex))) db(drop=sex);
by id;
run;
Result
Obs ID Admission Discharge Sex 1 0001 13JAN2017 25JAN2017 M 2 0001 22FEB2018 03MAR2018 M 3 0001 22FEB2018 03MAR2018 M 4 0002 01DEC2016 14DEC2016 F 5 0002 01DEC2016 14DEC2016 F 6 0002 25DEC2017 02JAN2018 F 7 0002 06JAN2018 09JAN2018 F
data DB;
input ID :$20. Admission :date. Discharge :date. Sex $;
format Admission Discharge date9.;
cards;
0001 13JAN2017 25JAN2017 M
0001 22FEB2018 03MAR2018 .
0001 22FEB2018 03MAR2018 .
0002 01DEC2016 14DEC2016 F
0002 01DEC2016 14DEC2016 .
0002 25DEC2017 02JAN2018 .
0002 06JAN2018 09JAN2018 .
;
data want ;
update db(obs=0) db;
by id;
output;
run;
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.