BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NewUsrStat
Lapis Lazuli | Level 10

Hi guys, 

suppose to have the following dataset: 

 

data DB;
  input ID :$20. Admission :date09. Discharge :date09. Sex;
  format Admission date9. Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  M
0001  22FEB2018 03MAR2018  
0001  22FEB2018 03MAR2018 
0002  01DEC2016 14DEC2016  F
0002  01DEC2016 14DEC2016  
0002  25DEC2017 02JAN2018 
0002  06JAN2018 09JAN2018  
;run;

is there a way to make consistent the variable sex, in other words to get the following dataset? 

 

 

 

data DB1;
  input ID :$20. Admission :date09. Discharge :date09. Sex;
  format Admission date9. Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  M
0001  22FEB2018 03MAR2018  M 
0001  22FEB2018 03MAR2018  M
0002  01DEC2016 14DEC2016  F
0002  01DEC2016 14DEC2016  F
0002  25DEC2017 02JAN2018  F
0002  06JAN2018 09JAN2018  F
;run;

Note that replicated dates are there for a reason. 

 

Thank you in advance!

 

1 ACCEPTED SOLUTION

Accepted Solutions
antonbcristina
SAS Super FREQ

Hi @NewUsrStat, assuming Sex is always non-missing for the first instance of ID, you'll want to create a new variable (New_Sex) that retains the Sex value for all observations for an ID.

 

data db1;
	retain New_Sex;
	set db;
	by id;

	if first.id then New_Sex=Sex;
run;

 

View solution in original post

5 REPLIES 5
antonbcristina
SAS Super FREQ

Hi @NewUsrStat, assuming Sex is always non-missing for the first instance of ID, you'll want to create a new variable (New_Sex) that retains the Sex value for all observations for an ID.

 

data db1;
	retain New_Sex;
	set db;
	by id;

	if first.id then New_Sex=Sex;
run;

 

ballardw
Super User

Your first data set does not create a usable data set as 1) Sex is read as numeric, hence all missing and 2) read past the end of the line because of the missing values in the SEX column.

The $ after Sex on the input statement fixes the first problem, TRUNCOVER on the INFILE the other by stopping reading when running out of values.

data work.DB;
  infile datalines truncover;
  input ID :$20. Admission :date09. Discharge :date09. Sex $;
  format Admission date9. Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  M
0001  22FEB2018 03MAR2018  
0001  22FEB2018 03MAR2018 
0002  01DEC2016 14DEC2016  F
0002  01DEC2016 14DEC2016  
0002  25DEC2017 02JAN2018 
0002  06JAN2018 09JAN2018  
;run;

One way:

Proc sql;
   create table work.db1 as
   select a.id, a.admission, a.discharge
          ,b.sex
   from   work.db as a
          left join
          (select distinct id,sex from work.db
           where not missing(sex) ) as b
           on a.id=b.id
   ;
quit;

If you data is not in sorted order on input this might change that.

antonbcristina
SAS Super FREQ

@NewUsrStat, I should also mention that running your DATA step to create DB didn't work. By default, when you reach the end of the line and there are missing values for the remainder of the variables, SAS will continue reading from the next line. Hence the message below: "SAS went to a new line when INPUT statement reached past the end of the line". 

 

antonbcristina_0-1728664388204.png

 

You'll want to add in the option TRUNCOVER like so:

 

data DB;
  infile cards truncover;
  input ID :$20. Admission :date09. Discharge :date09. Sex $;
  format Admission date9. Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  M
0001  22FEB2018 03MAR2018  
0001  22FEB2018 03MAR2018 
0002  01DEC2016 14DEC2016  F
0002  01DEC2016 14DEC2016  
0002  25DEC2017 02JAN2018 
0002  06JAN2018 09JAN2018  
;
run;

 

Tom
Super User Tom
Super User

Not sure what you want to do when there are two different values.

But a simple merge of dataset with itself should work fine.

data DB;
  input ID :$20. Admission :date. Discharge :date. Sex $;
  format Admission Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  M
0001  22FEB2018 03MAR2018  .
0001  22FEB2018 03MAR2018  .
0002  01DEC2016 14DEC2016  F
0002  01DEC2016 14DEC2016  .
0002  25DEC2017 02JAN2018  .
0002  06JAN2018 09JAN2018  .
;

data want ;
  merge db(where=(not missing(sex))) db(drop=sex);
  by id;
run;

Result

Obs     ID     Admission    Discharge    Sex

 1     0001    13JAN2017    25JAN2017     M
 2     0001    22FEB2018    03MAR2018     M
 3     0001    22FEB2018    03MAR2018     M
 4     0002    01DEC2016    14DEC2016     F
 5     0002    01DEC2016    14DEC2016     F
 6     0002    25DEC2017    02JAN2018     F
 7     0002    06JAN2018    09JAN2018     F
Ksharp
Super User
data DB;
  input ID :$20. Admission :date. Discharge :date. Sex $;
  format Admission Discharge date9.;
cards;
0001  13JAN2017 25JAN2017  M
0001  22FEB2018 03MAR2018  .
0001  22FEB2018 03MAR2018  .
0002  01DEC2016 14DEC2016  F
0002  01DEC2016 14DEC2016  .
0002  25DEC2017 02JAN2018  .
0002  06JAN2018 09JAN2018  .
;

data want ;
  update db(obs=0) db;
  by id;
  output;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1337 views
  • 1 like
  • 5 in conversation