BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NewUsrStat
Pyrite | Level 9

Hi guys, 

suppose to have the following dataset: 

 

data DB;
  input ID :$20. DATE_OF_BIRTH  DATE_OF_DEATH  MONTH  REASON :$20.;
cards;
0001  1998 2022   .     .
0001    .  2022   1  ILLNESS
0002  1945 2011   .     .
0002    .  2011   12 ACCIDENT
0003  1938 1990   .     .
0003    .  1990   10 ILLNESS
run;

DB results from DATASET1 and DATASET2 that have been "merged" with set function. 

data DATASET1; 
input ID :$20. DATE_OF_BIRTH  DATE_OF_DEATH ;
cards;
0001  1998 2022  
0002  1945 2011 
0003  1938 1990 
run;

data DATASET2; 
input ID :$20. DATE_OF_DEATH MONTH REASON :$20.;
cards;
0001  2022  ILLNESS
0002  2011 ACCIDENT
0003  1990  ILLNESS
run;

Is there a way to get the following? 

 


data DB1;
  input ID :$20. DATE_OF_BIRTH  DATE_OF_DEATH  Month Reason :$20.;
cards;
0001  1998 2022   1  ILLNESS
0001    .  2022   .     .
0002  1945 2011   12 ACCIDENT
0002    .  2011   .     .
0003  1938 1990   10 ILLNESS
0003    .  1990   .     .
run;

In other words I would like to move data of variables Month and Reason at the first row of the ID as shown in DB1. I tried using "first." but without success. 

 

Really thank you in advance

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Why not just MERGE to begin with? I am assuming each of the two datasets have only one observation per ID.

data DATASET1; 
  input ID :$20. DATE_OF_BIRTH  DATE_OF_DEATH ;
cards;
0001  1998 2022  
0002  1945 2011 
0003  1938 1990 
;

data DATASET2; 
  input ID :$20. DATE_OF_DEATH MONTH REASON :$20.;
cards;
0001  2022 1  ILLNESS
0002  2011 12 ACCIDENT
0003  1990 10 ILLNESS
;

data want;
  merge dataset1 dataset2;
  by id ;
run;

Result:

               DATE_OF_    DATE_OF_
Obs     ID       BIRTH       DEATH     MONTH     REASON

 1     0001      1998        2022         1     ILLNESS
 2     0002      1945        2011        12     ACCIDENT
 3     0003      1938        1990        10     ILLNESS

If you really do not have access to the original datasets so that all you have is:

data bad;
  set dataset1 dataset2;
  by id;
run;

Then to convert that dataset you can use the UPDATE statement.  Treating an empty version of the dataset as the starting point and the whole dataset as the transactions to be applied.

data want;
  update bad(obs=0) bad;
  by id;
run;

Which generates the same result as we got when merging.

 

View solution in original post

5 REPLIES 5
Quentin
Super User

Sound like a look-ahead problem, for which I would normally look for one of Mark Keintz's papers on lags and leads, e.g.:  https://support.sas.com/resources/papers/proceedings17/1277-2017.pdf.

 

Do you know you will always have exactly two rows per ID? 

Do you know Date_Of_Birth will always be missing for the row that has the value for Date_Of_Death?

Is there a reason you want to keep two records per ID in the output dataset?

How did you end up with this weird data? It looks like someone tried to merge data with the date of birth and date with the date of death, but accidentally concatenated them.  So it might be better to fix the code that created this data, if that's an option.

 

 

 

BASUG is hosting free webinars ! Check out recordings of our past webinars: https://www.basug.org/videos. Save the date for our in person SAS Blowout on Oct 18 in Cambridge, MA. Registration opens in September.
NewUsrStat
Pyrite | Level 9

Thank you very much for reply. The dataset is too big and I don't have the chance to check to answer to your questions. What I know is that two datasets have been merged using "set". Both have in common variables "ID" and "DATE_of_DEATH". I will edit the post showing the single datasets that have been "merged".

Quentin
Super User

Do you have dataset1 and dataset2?  The DATA step has a MERGE statement.  It looks like instead of using the SET statement to combine them (which 'stacks' them vertically), you should use the MERGE statement (which merges them, horizontally).  With MERGE your output dataset would have three rows, and more variables.

BASUG is hosting free webinars ! Check out recordings of our past webinars: https://www.basug.org/videos. Save the date for our in person SAS Blowout on Oct 18 in Cambridge, MA. Registration opens in September.
Tom
Super User Tom
Super User

Why not just MERGE to begin with? I am assuming each of the two datasets have only one observation per ID.

data DATASET1; 
  input ID :$20. DATE_OF_BIRTH  DATE_OF_DEATH ;
cards;
0001  1998 2022  
0002  1945 2011 
0003  1938 1990 
;

data DATASET2; 
  input ID :$20. DATE_OF_DEATH MONTH REASON :$20.;
cards;
0001  2022 1  ILLNESS
0002  2011 12 ACCIDENT
0003  1990 10 ILLNESS
;

data want;
  merge dataset1 dataset2;
  by id ;
run;

Result:

               DATE_OF_    DATE_OF_
Obs     ID       BIRTH       DEATH     MONTH     REASON

 1     0001      1998        2022         1     ILLNESS
 2     0002      1945        2011        12     ACCIDENT
 3     0003      1938        1990        10     ILLNESS

If you really do not have access to the original datasets so that all you have is:

data bad;
  set dataset1 dataset2;
  by id;
run;

Then to convert that dataset you can use the UPDATE statement.  Treating an empty version of the dataset as the starting point and the whole dataset as the transactions to be applied.

data want;
  update bad(obs=0) bad;
  by id;
run;

Which generates the same result as we got when merging.

 

NewUsrStat
Pyrite | Level 9
Hi Tom, unfortunately I have no access to single datasets, but your help unblocked the situation. Thank you very much!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 471 views
  • 1 like
  • 3 in conversation