I have datasets like
Obs | Num1 | Num2 | Num3 |
1 | 1 | . | 1 |
2 | 2 | 5 | 2 |
3 | 3 | 5 | . |
4 | 4 | 5 | 4 |
5 | 6 | 7 | 8 |
| Num6 | Num7 |
| 3 | 4 |
| 4 | 2 |
| 6 | . |
| 2 | 6 |
Output should be
1 | . | 1 | 3 | 4 |
2 | 5 | 2 | 4 | 2 |
3 | 5 | . | 6 | . |
4 | 5 | 4 | 2 | 6 |
6 | 7 | 8 |
@HB This is incorrect you can merge without a BY statement if you're doing a side by side merge.
data want;
merge file1 file2;
run;
proc print data=want;run;
SAS Output
The SAS System |
Obs | obs | var1 | var2 | var3 | var4 | var5 |
---|---|---|---|---|---|---|
1 | 1 | 1 | . | 1 | 3 | 4 |
2 | 2 | 2 | 5 | 2 | 4 | 2 |
3 | 3 | 3 | 5 | . | 6 | . |
4 | 4 | 4 | 5 | 4 | 2 | 6 |
5 | 5 | 6 | 7 | 8 | . | . |
just do:
data want;
merge table1 table2;
by;
run;
If I do merge data will be appended after the first dataset, but I want it along with the first dataset
@ujwala wrote:
If I do merge data will be appended after the first dataset, but I want it along with the first dataset
Try it. If it doesn't work, post the code and log. The premise is correct, so if it doesn't work, either the situation is not as described or your code is incorrect.
Your data is going to need a key to be merged with.
I'll assume a variable called obs (which is a bad variable name!)
Doing this:
data file1;
input obs var1 var2 var3;
datalines;
1 1 . 1
2 2 5 2
3 3 5 .
4 4 5 4
5 6 7 8
;
data file2;
input obs var4 var5;
datalines;
1 3 4
2 4 2
3 6 .
4 2 6
;
proc sort data=file1 out=file1_sorted;
by obs;
run;
proc sort data=file2 out= file2_sorted;
by obs;
run;
* the (in = a) and (if a) function like a left join;
data combined;
merge file1_sorted (in = a) file2_sorted;
by obs;
if a;
run;
gives us this:
obs var1 var2 var3 var4 var5 1 1 1 3 4 2 2 5 2 4 2 3 3 5 6 4 4 5 4 2 6 5 6 7 8
HTH.
@HB This is incorrect you can merge without a BY statement if you're doing a side by side merge.
data want;
merge file1 file2;
run;
proc print data=want;run;
SAS Output
The SAS System |
Obs | obs | var1 | var2 | var3 | var4 | var5 |
---|---|---|---|---|---|---|
1 | 1 | 1 | . | 1 | 3 | 4 |
2 | 2 | 2 | 5 | 2 | 4 | 2 |
3 | 3 | 3 | 5 | . | 6 | . |
4 | 4 | 4 | 5 | 4 | 2 | 6 |
5 | 5 | 6 | 7 | 8 | . | . |
@Reeza I don't consider it incorrect. It may be less terse, but the verbosity helps me remember all the components to the statement.
It's also working code.
If you know that your first data set is the one with more observations, you could also code it this way:
data want;
set have1;
if done=0 then set have2 end=done;
output;
call missing (of _all_);
run;
Just merge them, without any by statement.
data want;
merge data1 data2;
run;
May be after all the discussion in this tread you also want to have a read of the docu to get a bit more insight of the options.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.