Beartato
Fluorite | Level 6

Good evening, everyone! I hope you can forgive me for the somewhat newbie question, but my Googling abilities have failed to confirm the exact finding for which I'm searching. 

 

I have two datasets which need to be merged specifically by ResponseID. However, they have different variables (the second dataset was recently generated with a new and different set of variables). I had originally hoped to simply paste the new variables into the older dataset, but as it happens, the ResponseIDs came through in a different order than in the original dataset. Is there a dataset merging function in SAS which would merge the datasets *specifically on ResponseID*? I am thinking of something that would add the new variables to the existing dataset and match all values for those variables to the order in which the ResponseIDs appear in the old dataset. As a note, the datasets have the same ResponseIDs (e.g., 1, 2, 3, 4, 5 all exist in both datasets), but in different orders (i.e., the second dataset was produced in the order 4, 2, 1, 5, 3). See the examples of the old dataset, the new dataset, and the desired dataset.

 

Old dataset example

 

ResponseID  Variable 1  Variable 2  Variable 3  Variable 4  Variable 5
1           50          50          50          45          0
2           100         90          80          70          60
3           20          0           0           0           0
4           0           0           0           0           0
5           100         100         100         100         100

 

New dataset example:

 

ResponseID  Variable 6  Variable 7
4           10          10
2           50          25
1           60          30
5           100         100
3           0           0

 

Desired dataset example (the actual, non-example dataset includes 7000 datapoints)

 

ResponseID  Variable 1  Variable 2  Variable 3  Variable 4  Variable 5  Variable 6  Variable 7
1           50          50          50          45          0           60          30
2           100         90          80          70          60          50          25
3           20          0           0           0           0           0           0
4           0           0           0           0           0           10          10
5           100         100         100         100         100         100         100

 

Given the tremendous size of the dataset (as well as a slight time crunch), it is not feasible to execute this task manually (as in, finding each unique ResponseID and matching them to the original dataset's order, then importing the data for the new variables in that matched order).

 

Thanks so much for your time!

1 ACCEPTED SOLUTION

JerryLeBreton
Pyrite | Level 9

This is all you need.

 

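/* MERGE with a BY statement needs every input sorted by the BY variable. */
/* OLD is already in ResponseID order in the example; if yours is not,   */
/* sort it the same way before merging.                                  */
proc sort data=new;
   by responseid;
run;

/* Match-merge: rows sharing a responseid are combined into one row. */
data want;
   merge old new;
   by responseid;
run;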

 

But it's important to know how it works. The documentation is pretty good.

 

Editor's Note: The above code works by sorting the data first; JBailey's PROC SQL code works without requiring a sort first.
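One thing a match-merge will not tell you on its own is whether every ResponseID was actually found in both datasets. The sketch below uses the IN= dataset option to split matched and unmatched rows; the names in_old, in_new, and unmatched are made up for illustration, and it assumes both datasets are already sorted by ResponseID.

```sas
/* Assumes OLD and NEW are both sorted by responseid.            */
/* in_old / in_new are temporary flags (1 if the row came from   */
/* that dataset); they are not written to the output datasets.   */
data want unmatched;
   merge old(in=in_old) new(in=in_new);
   by responseid;
   if in_old and in_new then output want;   /* ID found in both */
   else output unmatched;                   /* ID found in only one */
run;
```

If the two datasets really do contain the same ResponseIDs, as described in the question, unmatched should come out with zero observations.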


fritzgerald
Calcite | Level 5

What if none of the variables are the same, meaning ResponseID exists in only one of the tables rather than in both?

Reeza
Super User

Just a quick note on how SAS handles variables that have the same name: they get overwritten by the values from the last dataset in the MERGE statement.

 

If you want to keep all variables, rename your variables in each dataset so they're unique. 

 

Otherwise, as indicated, the BY statement will match the observations to merge by ResponseID.
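As a rough illustration of the renaming idea, here is a minimal sketch using the RENAME= dataset option. It assumes a hypothetical variable called Score that exists in both datasets (the example data in this thread has no such overlap), and that both datasets are already sorted by ResponseID.

```sas
/* Score is a hypothetical variable present in both OLD and NEW. */
/* Renaming it on the way in keeps both copies in the result.    */
data want;
   merge old(rename=(Score=Score_old))
         new(rename=(Score=Score_new));
   by responseid;
run;
```

The BY variable itself is not renamed, since it has to carry the same name in both datasets for the match to work.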

Beartato
Fluorite | Level 6

Thanks to you both for your prompt and comprehensive replies. 

 

Reeza, you make a good point about ensuring that all the variable names are unique. If I execute the code that JerryLeBreton supplied on, for example, the example dataset I put in the original post, will it preserve all of Variables 1, 2, 3, 4, 5, 6, and 7?

 

I'm hoping it can just add the variables to the end of the dataset, such that Variables 6 and 7 appear after 5. 

JBailey
Barite | Level 11

Hi @Beartato

 

Just for fun...

 

/* Recreate the example datasets from the question. */
data old;
   input ResponseID Variable1 Variable2 Variable3 Variable4 Variable5;
datalines;
1 50 50 50 45 0
2 100 90 80 70 60
3 20 0 0 0 0
4 0 0 0 0 0
5 100 100 100 100 100
;
run;

data new;
   input ResponseID Variable6 Variable7;
datalines;
4 10 10
2 50 25
1 60 30
5 100 100
3 0 0
;
run;

/* Inner join on ResponseID; PROC SQL does not need the inputs sorted. */
proc sql;
   create table merged_data as
      select o.ResponseID, o.Variable1, o.Variable2, o.Variable3,
             o.Variable4, o.Variable5, n.Variable6, n.Variable7
      from old o, new n
      where o.ResponseID = n.ResponseID
      order by o.ResponseID;
quit;
