BookmarkSubscribeRSS Feed
mmaxiu
Calcite | Level 5

Hi, everyone. 

 

I'm trying to merge a smaller dataset into a larger dataset with repetition of the values of a specific variable from the small dataset. I want to take the smallclient variable from the first table below and add that variable with repetition to the second table below for each combination of au and fyear. I tried to code this with the following, but it failed. How would I code this?

 

*merge with repeated values - first example failed testgroup2 is the larger dataset;

data Practice.merged_testgroup;
merge Practice.testgroup2(in=a) Practice.testgroup5(in=b) / repeated;
by au fyear;
run;

mmaxiu_0-1714505936383.png

 

mmaxiu_1-1714506149355.png

 

 

6 REPLIES 6
Tom
Super User Tom
Super User

No idea what "with repetition" means.  And the photographs of data don't help explain it.

 

Are you just doing a simple 1 to Many merge?  Does the combination of AU and YEAR uniquely identify the observations in at least one of the datasets?  If so data step merge is what you want.

 

Do you want to do a many to many merge?  If so then you probably want to use PROC SQL.

 

If it is something else then explain, with examples. And share your example data as text, not photographs.

 

mmaxiu
Calcite | Level 5

Hi, Tom. 

 

Thanks for your reply and the suggestions! Here's an example to clarify. Based on the tables above, I want to take the value of smallclient in row 1 of the smaller table and show its value for all combinations of au = 0 and fyear = 2011 in the larger table. So, the output I'm looking for in a merged table is for rows 1 through 15 from the larger table to have the variable smallclient show as zero for every single row. This would then be repeated for the whole larger table of 6841 rows. 

Tom
Super User Tom
Super User

@mmaxiu wrote:

Hi, Tom. 

 

Thanks for your reply and the suggestions! Here's an example to clarify. Based on the tables above, I want to take the value of smallclient in row 1 of the smaller table and show its value for all combinations of au = 0 and fyear = 2011 in the larger table. So, the output I'm looking for in a merged table is for rows 1 through 15 from the larger table to have the variable smallclient show as zero for every single row. This would then be repeated for the whole larger table of 6841 rows. 


That is exactly what a simple MERGE will do.  As long as SMALLCLIENT does NOT already exist as a variable in the second table.

data one;
  input id lookup $;
cards;
1 A
2 B
3 C
;

data many ;
  input id other $;
cards;
1 X
1 Y
1 Z
2 W
4 V
;

data want;
  merge many one ;
  by id;
run;

Result

Obs    id    other    lookup

 1      1      X        A
 2      1      Y        A
 3      1      Z        A
 4      2      W        B
 5      3               C
 6      4      V
sbxkoenk
SAS Super FREQ

Pictures are extremely poor ways to share data. We cannot write code against pictures. At least paste text. I am too lazy to retype pictures. Ideally, provide us with the data in data step format. 

 

The MERGE statement has no options behind a forward slash. Where did you get that REPEATED option from?
SAS® 9.4 and SAS® Viya® 3.5 Programming Documentation
DATA Step Statements -- MERGE Statement

https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1i8w2bwu1fn5kn1gpxj18xttbb0.ht...

 

But I don't see the problem.
The requested repetition is there automatically as the by-variables (merge keys) match.

 

Try this and check the result (!) :

data Practice.merged_testgroup;
  merge Practice.testgroup2 Practice.testgroup5;
  by au fyear;
run;

Koen

mmaxiu
Calcite | Level 5

Thanks, Koen! I tried the code you suggested, but what happens is that the merged table only contains the same number of rows as the smaller table, not the larger one. 

Quentin
Super User

@mmaxiu wrote:

Thanks, Koen! I tried the code you suggested, but what happens is that the merged table only contains the same number of rows as the smaller table, not the larger one. 


Koen's code as posted could not subset the rows to be only the rows from the smaller table.  Please show the full log from running that step.  The log will show the actual code ran, the number of records read from each dataset, and the number of records written.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 361 views
  • 2 likes
  • 4 in conversation