Hi!
So I have this problem (using sas universtiy):
I am supposed to merge datasets that have two common variables (id and weight) and look for values that are different for one variable (weight).
Say:
Data sheet1;
input id weight1;
datalines;
1 33
2 44
3 55
4 66
5 77
;
run;
Data sheet2;
input id weight2;
datalines;
1 11
2 44
3 55
4 66
5 88
;
run;
To identify observations 1 and 5 I would run;
Data differences;
merge sheet1 sheet2;
by id;
run;
Data differences;
set differences;
where weight1 ne weight2;
run;
Provided they're propperly sorted.
But I am supposed to run this code:
Data differences;
merge sheet1 sheet2 (in =a);
by id weight;
if a;
run;
And I just don't understand how that's supposed to work. The way I understand it is that the datasets are merged by id and weight for all observations in a and overwriting values that are different with those from sheet2.
Maybe I'm not thinking of the right data structure. My supervisor tried to explain it to me, saying that the merge statement would create missings for values that aren't equal on both datasets.
Can someone fill me in? Thanks's in advance!
Given what you now want, your code is far too complex. To get the desired result you only need:
data desired; merge A B; by id ref_var; run;
Art, CEO, AnalystFinder.com
Since you don't have a variable called weight, that code wouldn't run. Your supervisor may have been suggesting something like:
data differences; merge sheet1 (rename=(weight1=weight) in=a) sheet2 (rename=(weight2=weight) in=b); by id weight; if a and not b; run;
Art, CEO, AnalystFinder.com
If you have other variables exclusively in sheet1 (say x1 y1), and you performed the merge as you presented, then for cases where the BY-variables ID and WEIGHT appear only in sheet2, you would indeed have missing values assigned to X1 and Y1 in the output. For other cases, in which ID and WEIGHT appear in both sheets, you would get the original values found in sheet1. Perhaps that is what your supervisor is referring to.
Your question and the 'supervisor' answer don't appear to align. So either you're misunderstanding the question or her answer.
@Iona wrote:
Hi!
So I have this problem (using sas universtiy):
I am supposed to merge datasets that have two common variables (id and weight) and look for values that are different for one variable (weight).
Say:
Data sheet1; input id weight1; datalines; 1 33 2 44 3 55 4 66 5 77 ; run; Data sheet2; input id weight2; datalines; 1 11 2 44 3 55 4 66 5 88 ; run;
To identify observations 1 and 5 I would run;
Data differences; merge sheet1 sheet2; by id; run; Data differences; set differences; where weight1 ne weight2; run;
Provided they're propperly sorted.
But I am supposed to run this code:
Data differences; merge sheet1 sheet2 (in =a); by id weight; if a; run;
And I just don't understand how that's supposed to work. The way I understand it is that the datasets are merged by id and weight for all observations in a and overwriting values that are different with those from sheet2.
Maybe I'm not thinking of the right data structure. My supervisor tried to explain it to me, saying that the merge statement would create missings for values that aren't equal on both datasets.
Can someone fill me in? Thanks's in advance!
Thanks for your replies!
I think I'm halfway understanding what she wants me to do. So we have two data sets and three variables. Id has equal values in both datasets, but ref_var and value_A (value_B respectively) are different. What she wants me to do is to merge datasets A and B in a way that creates missings vor value_A and value_B where they are different.
The given datasets:
data A;
input id ref_var value_A;
datalines;
1 1 1
1 2 2
2 3 3
3 1 1
;
run;
data B;
input id ref_var value_B;
datalines;
1 1 1
1 3 3
2 1 1
3 1 1
;
run;
What I should have in the end:
data desired;
input id ref_var value_A value_B;
datalines;
1 1 1 1
1 2 2 .
1 3 . 3
2 1 . 1
2 3 3 .
3 1 1 1
;
run;
So this is the template that she gave me but I must be doing it wrong as I am only getting the missings vor value_B but not for value_A.
data differences;
merge A (in =a) B;
by id ref_var;
if a;
if ref_var eq . then delete;
if value_B eq . then output;
run;
Adding "if value_A eq . then output" doesn't help or even change the output. Can someone spot my mistake?
Given what you now want, your code is far too complex. To get the desired result you only need:
data desired; merge A B; by id ref_var; run;
Art, CEO, AnalystFinder.com
That did it, thank you!
Data sheet1;
input id weight1;
datalines;
1 33
2 44
3 55
4 66
5 77
;
run;
Data sheet2;
input id weight2;
datalines;
1 11
2 44
3 55
4 66
5 88
;
run;
proc sort data=sheet1;by id;
proc sort data=sheet2;by id;
run;
data merge_12;
merge sheet1 sheet2;by id;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.