Merge three datasets with different number of observations in each

Contributor
Posts: 25

Hi,

Is it possible to merge three datasets with different numbers of observations? Here SUBJECT is the unique identifier, and the only other variables expected in the output are AMT, ASTTIMEPT and PRI.

I tried the code below and see the following in the log.

 
NOTE: MERGE statement has more than one data set with repeats of BY values.
NOTE: There were 178 observations read from the data set WORK.AMT_.
NOTE: There were 720 observations read from the data set WORK.AST_.
NOTE: There were 88 observations read from the data set WORK.PRI_.
NOTE: The data set WORK.DER4 has 704 observations and 11 variables.

 

data amt_;
set amt;
run;
proc sort data=amt_ ;
by SUBJECT AMT;  /*There are two values for each subject, which are not duplicates*/
run;

 

data ast_;
set ast;
run;
proc sort data=ast_ ;
by SUBJECT ASTTIMEPT;   /*There are 8 time points for each subject, which are not duplicates- Say Day0-Day7*/
run;


data pri_;
set pri;
run;
proc sort data=pri_;
by SUBJECT PRI;
run;


data der4;
merge amt_ (in=a) ast_ (in=b) pri_ (in=c);
by UNOSID;
if a;
run;

 

Any suggestions?

Thank you,

 

Regards,

Nasya

Super User
Posts: 9,599

Re: Merge three datasets with different number of observations in each

You haven't asked any question?  This step:

data der4;
  merge amt_ (in=a) ast_ (in=b) pri_ (in=c);
  by unosid;
  if a;
run;

is telling SAS to merge the 3 datasets by unosid and to keep only records where data appears in amt_, i.e. if a subject appears only in ast_ then it will not appear in the output dataset.  That is what this tells you:

NOTE: MERGE statement has more than one data set with repeats of BY values.
NOTE: There were 178 observations read from the data set WORK.AMT_.
NOTE: There were 720 observations read from the data set WORK.AST_.
NOTE: There were 88 observations read from the data set WORK.PRI_.
NOTE: The data set WORK.DER4 has 704 observations and 11 variables.
 
So it looks like 16 observations in ast_ (720 read, 704 written; they could all belong to one or two subjects) have a subject value that does not appear in amt_.

Please also avoid shouting (all uppercase) in code, and use the code window - it's the {i} above the post area - to retain formatting.
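
One way to see which BY values are affected is to flag them with the IN= variables. This is only a minimal sketch: it assumes amt_ and ast_ are both sorted by unosid (the BY variable in the merge step quoted above), and not_in_amt is a hypothetical dataset name.

```sas
/* Sketch only: assumes amt_ and ast_ are sorted by unosid. */
/* not_in_amt is a hypothetical output dataset name.        */
data not_in_amt;
  merge amt_ (in=a keep=unosid)
        ast_ (in=b keep=unosid);
  by unosid;
  if b and not a;      /* BY value is in ast_ but absent from amt_ */
  if first.unosid;     /* keep one row per unmatched BY value      */
run;
```

The IN= flags hold the same value for every row of a BY group, so keeping the first row of each group is enough to list the unmatched values.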
Contributor
Posts: 25

Re: Merge three datasets with different number of observations in each

Thanks for your response. I have another question.
NOTE: MERGE statement has more than one data set with repeats of BY values.

Is this note a bad note and needs to be avoided?
Super User
Posts: 9,599

Re: Merge three datasets with different number of observations in each

It is telling you that more than one table has non-distinct BY groups, i.e. repeated BY values.  There is lots of information out there on this topic, for instance:

https://www.lexjansen.com/nesug/nesug08/cc/cc21.pdf

 

Personally, I find that knowing your data is the most important part of any programming activity (next to documentation, of course).  So your question is actually one for yourself: "Is this note a bad note and needs to be avoided?" Only you, who know your data, can answer this.
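
As a starting point for getting to know the data, a quick check like the one below lists the BY values that repeat in each input. It is a sketch, assuming unosid is the BY variable as in the merge step; the *_dups table names are hypothetical.

```sas
/* Sketch: list BY values that occur more than once in each input. */
/* The *_dups table names are hypothetical.                        */
proc sql;
  create table amt_dups as
    select unosid, count(*) as n_obs
    from amt_
    group by unosid
    having count(*) > 1;

  create table ast_dups as
    select unosid, count(*) as n_obs
    from ast_
    group by unosid
    having count(*) > 1;

  create table pri_dups as
    select unosid, count(*) as n_obs
    from pri_
    group by unosid
    having count(*) > 1;
quit;
```

If more than one of these tables comes back non-empty, the MERGE is many-to-many for those BY values, which is what the note is reporting.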

Super User
Posts: 6,785

Re: Merge three datasets with different number of observations in each

No, you can't ignore the situation.

 

The first step is forming a plan.  No computers allowed, since it's not a programming problem.

 

When you have two observations in one data set that should match with eight observations in another data set, what would you like the result to be?  SAS will match them, but it's probably not the match that you would choose.  (Actually SAS matches #1 with #1, then #2 with each of #2 through #8, because the last observation of the shorter group is carried forward.)

 

So plan.  What matches would create the proper result?  After that, we can talk about programming.
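
To make the choice concrete while planning, here is a toy sketch with invented data (one subject, 2 AMT rows, 3 time points; every dataset name and value is hypothetical). It contrasts the positional pairing of a DATA step MERGE with the all-combinations pairing of a PROC SQL join. Neither is presented as the right answer for the real data.

```sas
/* Toy data, entirely hypothetical: one subject, 2 AMT rows. */
data amt_demo;
  input subject amt;
  datalines;
1 10
1 20
;

/* Same subject with 3 time points. */
data ast_demo;
  input subject asttimept $;
  datalines;
1 Day0
1 Day1
1 Day2
;

/* DATA step MERGE pairs rows by position within the BY group: */
/* 3 rows, with the last AMT row reused for the extra time point. */
data merged_demo;
  merge amt_demo ast_demo;
  by subject;
run;

/* PROC SQL pairs every AMT row with every time point: 2 x 3 = 6 rows. */
proc sql;
  create table joined_demo as
    select a.subject, a.amt, b.asttimept
    from amt_demo as a
         inner join ast_demo as b
         on a.subject = b.subject
    order by a.subject, b.asttimept, a.amt;
quit;
```

Comparing merged_demo with joined_demo shows the two candidate results side by side, which may help decide which matches are actually wanted.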
