Nasya
Obsidian | Level 7

Hi,

Is it possible to merge three datasets with different numbers of observations? Here SUBJECT is the unique identifier, and the other variables mentioned (AMT, ASTTIMEPT, PRI) are the only ones expected in the output.

 

 

 

I tried the code below and see the following in the log:

 
NOTE: MERGE statement has more than one data set with repeats of BY values.
NOTE: There were 178 observations read from the data set WORK.AMT_.
NOTE: There were 720 observations read from the data set WORK.AST_.
NOTE: There were 88 observations read from the data set WORK.PRI_.
NOTE: The data set WORK.DER4 has 704 observations and 11 variables.

 

data amt_;
set amt;
run;
proc sort data=amt_ ;
by SUBJECT AMT;  /*There are two values for each subject, which are not duplicates*/
run;

 

data ast_;
set ast;
run;
proc sort data=ast_ ;
by SUBJECT ASTTIMEPT;   /*There are 8 time points for each subject, which are not duplicates- Say Day0-Day7*/
run;


data pri_;
set pri;
run;
proc sort data=pri_;
by SUBJECT PRI;
run;


data der4;
merge amt_ (in=a) ast_ (in=b) pri_ (in=c);
by UNOSID;
if a;
run;

 

Any suggestions?

Thank you,

 

Regards,

Nasya

4 REPLIES
RW9
Diamond | Level 26

You haven't really asked a specific question. This step:

data der4;
  merge amt_ (in=a) ast_ (in=b) pri_ (in=c);
  by unosid;
  if a;
run;

tells SAS to merge the three datasets by unosid and keep only the BY groups that have a record in amt_; i.e., a subject that appears only in ast_ or pri_ will not appear in the output dataset. That is what this tells you:

NOTE: MERGE statement has more than one data set with repeats of BY values.
NOTE: There were 178 observations read from the data set WORK.AMT_.
NOTE: There were 720 observations read from the data set WORK.AST_.
NOTE: There were 88 observations read from the data set WORK.PRI_.
NOTE: The data set WORK.DER4 has 704 observations and 11 variables.
 
So, assuming ast_ always has the most rows per subject, there are 16 observations in ast_ (720 read, 704 written; for example, two subjects with 8 time points each) whose subject value does not appear in amt_.
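
One way to confirm which subjects the `if a;` filter drops is to list the key values that exist in ast_ but not in amt_. This is only a minimal sketch: it assumes SUBJECT is the key variable in both datasets (the posted merge uses unosid, so substitute whichever name your data really has).

```sas
/* Sketch: list subjects present in ast_ but missing from amt_.        */
/* Assumes SUBJECT is the key; swap in UNOSID if that is the real key. */
proc sql;
  select distinct SUBJECT
    from ast_
    where SUBJECT not in (select SUBJECT from amt_);
quit;
```

Running the same query with pri_ in place of ast_ would show whether any subjects in pri_ are dropped as well.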
 
Please also avoid shouting (all caps) in code, and use the code window (it's the {i} above the post area) to retain formatting.
Nasya
Obsidian | Level 7
Thanks for your response. I have another question.
NOTE: MERGE statement has more than one data set with repeats of BY values. Is this note a bad note and needs to be avoided?
RW9
Diamond | Level 26

It is telling you that more than one of your tables has non-distinct BY groups, i.e. several rows for the same BY value.  There is lots of information out there on this topic, for instance:

https://www.lexjansen.com/nesug/nesug08/cc/cc21.pdf

 

Personally, I think knowing your data is the most important part of any programming activity (next to documentation, of course).  So your question, "Is this note a bad note and needs to be avoided?", is really one for yourself: only you, who know your data, can answer it.
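
As a starting point for getting to know the data, the rows per BY value in each input can be counted. The sketch below assumes SUBJECT is the BY variable; the macro name `count_by` is made up for illustration.

```sas
/* Sketch: show BY values that occur more than once in a dataset. */
/* Assumes SUBJECT is the BY variable; count_by is a made-up name. */
%macro count_by(ds);
  proc sql;
    title "BY values with more than one row in &ds";
    select SUBJECT, count(*) as n_rows
      from &ds
      group by SUBJECT
      having count(*) > 1;
  quit;
  title;
%mend count_by;

%count_by(amt_)
%count_by(ast_)
%count_by(pri_)
```

If more than one of these reports lists any subjects, the MERGE note will appear, and it is then up to you to decide whether the row-by-row pairing SAS performs is what you want.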

Astounding
PROC Star

No, you can't ignore the situation.

 

The first step is forming a plan.  No computers allowed, since it's not a programming problem.

 

When you have two observations in one data set that should match with eight observations in another data set, what would you like the result to be?  SAS will match them, but it's probably not the match that you would choose.  (Actually, SAS matches #1 with #1, then #2 with each of #2 through #8.)
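
The pairing described above can be seen with a small made-up example. The dataset names `two` and `eight` and all values below are invented for illustration; only the variable names come from the original post.

```sas
/* Made-up data: one subject with 2 AMT rows and 8 time-point rows. */
data two;
  input SUBJECT AMT;
  datalines;
1 10
1 20
;

data eight;
  input SUBJECT ASTTIMEPT $;
  datalines;
1 Day0
1 Day1
1 Day2
1 Day3
1 Day4
1 Day5
1 Day6
1 Day7
;

/* Match-merge: rows are paired by position within the BY group. */
data demo;
  merge two eight;
  by SUBJECT;
run;

proc print data=demo;
run;
```

The result has eight rows: AMT=10 is paired with Day0, and AMT=20 is carried along for Day1 through Day7, because the last values read from the shorter data set are retained until the BY group ends.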

 

So plan.  What matches would create the proper result?  After that, we can talk about programming.
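
Purely as an illustration of one possible plan, not a recommendation: if the desired result were every AMT row paired with every time point for the same subject, a PROC SQL join would produce it. The sketch assumes SUBJECT is the key in all three datasets; the output name `der4_all` is made up.

```sas
/* Sketch of ONE possible plan: all AMT x time-point combinations  */
/* within each subject, keeping only subjects present in amt_.     */
/* Assumes SUBJECT is the key; der4_all is a made-up dataset name. */
proc sql;
  create table der4_all as
  select a.SUBJECT, a.AMT, b.ASTTIMEPT, c.PRI
    from amt_ as a
         left join ast_ as b on a.SUBJECT = b.SUBJECT
         left join pri_ as c on a.SUBJECT = c.SUBJECT
    order by a.SUBJECT, b.ASTTIMEPT, a.AMT;
quit;
```

With 2 AMT rows and 8 time points, this gives 16 rows per subject, and more if pri_ also holds several rows per subject. Whether that is the right result depends entirely on what the data mean.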
