Good evening, everyone! I hope you can forgive me for the somewhat newbie question, but my Googling abilities have failed to confirm the exact finding for which I'm searching.
I have two datasets which need to be merged specifically by ResponseID. However, they have different variables (the second dataset was recently generated with a new and different set of variables). I had originally hoped to simply paste the new variables into the older dataset, but as it happens, the ResponseIDs came through in a different order than in the original dataset. Is there a dataset merging function in SAS that would merge the datasets *specifically on ResponseID*? I am thinking of a function that would add the new variables to the existing dataset and match all values for those variables to the order in which the ResponseIDs appear in the old dataset. As a note, the datasets have the same ResponseIDs (e.g., 1, 2, 3, 4, 5 all exist in both datasets), but in different orders (i.e., the second dataset was produced in the order 4, 2, 1, 5, 3). See the examples of the old dataset, new dataset, and desired dataset below.
Old dataset example:
ResponseID | Variable 1 | Variable 2 | Variable 3 | Variable 4 | Variable 5 |
1 | 50 | 50 | 50 | 45 | 0 |
2 | 100 | 90 | 80 | 70 | 60 |
3 | 20 | 0 | 0 | 0 | 0 |
4 | 0 | 0 | 0 | 0 | 0 |
5 | 100 | 100 | 100 | 100 | 100 |
New dataset example:
ResponseID | Variable 6 | Variable 7 |
4 | 10 | 10 |
2 | 50 | 25 |
1 | 60 | 30 |
5 | 100 | 100 |
3 | 0 | 0 |
Desired dataset example (the actual, non-example dataset includes 7000 data points):
ResponseID | Variable 1 | Variable 2 | Variable 3 | Variable 4 | Variable 5 | Variable 6 | Variable 7 |
1 | 50 | 50 | 50 | 45 | 0 | 60 | 30 |
2 | 100 | 90 | 80 | 70 | 60 | 50 | 25 |
3 | 20 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 0 | 0 | 0 | 0 | 0 | 10 | 10 |
5 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
Given the tremendous size of the dataset (as well as a slight time crunch), it is not feasible to execute this task manually (as in, finding each unique ResponseID and matching them to the original dataset's order, then importing the data for the new variables in that matched order).
Thanks so much for your time!
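This is all you need. (OLD is sorted here too, in case it is not already in ResponseID order.)
/* MERGE with a BY statement requires both inputs to be sorted by the BY variable */
proc sort data=old;
   by responseid;
run;
proc sort data=new;
   by responseid;
run;
/* Match-merge: rows with the same ResponseID are combined into one observation */
data want;
   merge old new;
   by responseid;
run;
But it's important to know how it works. The doco's pretty good.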
Editor's Note: The above code will work by sorting the data; JBailey's code works without requiring a sort first.
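With 7000 rows it can be worth confirming that every ResponseID really does appear in both datasets. The following is a minimal sketch of the same match-merge using the IN= dataset option; the flag names in_old and in_new and the output datasets only_old and only_new are made up for illustration, and it assumes both inputs have already been sorted by ResponseID.

```sas
/* Assumes OLD and NEW are both sorted by ResponseID.            */
/* in_old / in_new are temporary flags (hypothetical names):     */
/* each is 1 when that dataset contributed to the current row.   */
data want only_old only_new;
   merge old(in=in_old) new(in=in_new);
   by responseid;
   if in_old and in_new then output want;   /* ID found in both      */
   else if in_old then output only_old;     /* ID missing from NEW   */
   else output only_new;                    /* ID missing from OLD   */
run;
```

If only_old and only_new both come back with zero observations, every ResponseID matched and WANT holds the full merged data.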
What if none of the variables are the same, meaning ResponseID exists in only one of the tables rather than in both?
Just a quick note on how SAS handles variables that have the same name: their values get overwritten by the last dataset in the MERGE statement.
If you want to keep all variables, rename your variables in each dataset so they're unique.
Otherwise, as indicated, the BY statement will match the observations to merge by ResponseID.
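As a rough illustration of the renaming advice, the RENAME= dataset option can be applied directly in the MERGE statement. The variable Score is hypothetical (the example datasets in this thread have no overlapping names other than ResponseID), and the sketch assumes both datasets are already sorted by ResponseID.

```sas
/* Score is a hypothetical variable assumed to exist in both datasets. */
/* Renaming it on the way in keeps both copies in the merged output.   */
data want;
   merge old(rename=(Score=Score_old))
         new(rename=(Score=Score_new));
   by responseid;
run;
```

Without the renames, WANT would contain a single Score column holding the value from NEW wherever a matching row exists.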
Thanks to you both for your prompt and comprehensive replies.
Reeza, you make a good point about ensuring that all the variable names are unique. If I execute the code that JerryLeBreton supplied on, for example, the example dataset I put in the original post, will it preserve all of Variables 1, 2, 3, 4, 5, 6, and 7?
I'm hoping it can just add the variables to the end of the dataset, such that Variables 6 and 7 appear after 5.
Yes, it would.
Hi @Beartato
Just for fun...
/* Recreate the example datasets from the original post */
data old;
   input ResponseID Variable1 Variable2 Variable3 Variable4 Variable5;
   cards;
1 50 50 50 45 0
2 100 90 80 70 60
3 20 0 0 0 0
4 0 0 0 0 0
5 100 100 100 100 100
;
run;
data new;
   input ResponseID Variable6 Variable7;
   cards;
4 10 10
2 50 25
1 60 30
5 100 100
3 0 0
;
run;
/* Join on ResponseID; PROC SQL does not need the inputs pre-sorted */
proc sql;
   create table merged_data as
   select o.ResponseID, o.Variable1, o.Variable2, o.Variable3, o.Variable4, o.Variable5
        , n.Variable6, n.Variable7
   from old o,
        new n
   where o.ResponseID = n.ResponseID
   order by o.ResponseID;
quit;
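One thing to be aware of: the query above is an inner join, so a ResponseID that appears in only one of the two tables is dropped from the result. If the old dataset should keep all of its rows regardless, a left join is one option. This is a sketch only; the output name merged_left is made up for illustration.

```sas
/* Keeps every row of OLD; Variable6 and Variable7 are missing   */
/* for any ResponseID that has no match in NEW.                  */
proc sql;
   create table merged_left as
   select o.*, n.Variable6, n.Variable7
   from old as o
        left join new as n
        on o.ResponseID = n.ResponseID
   order by o.ResponseID;
quit;
```

For the example data, where all five ResponseIDs exist in both tables, this gives the same rows as the inner join.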