I have datasets like
Obs | Num1 | Num2 | Num3 |
1 | 1 | . | 1 |
2 | 2 | 5 | 2 |
3 | 3 | 5 | . |
4 | 4 | 5 | 4 |
5 | 6 | 7 | 8 |
| Num6 | Num7 |
| 3 | 4 |
| 4 | 2 |
| 6 | . |
| 2 | 6 |
Output should be
1 | . | 1 | 3 | 4 |
2 | 5 | 2 | 4 | 2 |
3 | 5 | . | 6 | . |
4 | 5 | 4 | 2 | 6 |
6 | 7 | 8 |
@HB This is incorrect you can merge without a BY statement if you're doing a side by side merge.
data want;
merge file1 file2;
run;
proc print data=want;run;
SAS Output
The SAS System |
Obs | obs | var1 | var2 | var3 | var4 | var5 |
---|---|---|---|---|---|---|
1 | 1 | 1 | . | 1 | 3 | 4 |
2 | 2 | 2 | 5 | 2 | 4 | 2 |
3 | 3 | 3 | 5 | . | 6 | . |
4 | 4 | 4 | 5 | 4 | 2 | 6 |
5 | 5 | 6 | 7 | 8 | . | . |
just do:
data want;
merge table1 table2;
by;
run;
If I do merge data will be appended after the first dataset, but I want it along with the first dataset
@ujwala wrote:
If I do merge data will be appended after the first dataset, but I want it along with the first dataset
Try it. If it doesn't work, post the code and log. The premise is correct, so if it doesn't work, either the situation is not as described or your code is incorrect.
Your data is going to need a key to be merged with.
I'll assume a variable called obs (which is a bad variable name!)
Doing this:
data file1;
input obs var1 var2 var3;
datalines;
1 1 . 1
2 2 5 2
3 3 5 .
4 4 5 4
5 6 7 8
;
data file2;
input obs var4 var5;
datalines;
1 3 4
2 4 2
3 6 .
4 2 6
;
proc sort data=file1 out=file1_sorted;
by obs;
run;
proc sort data=file2 out= file2_sorted;
by obs;
run;
* the (in = a) and (if a) function like a left join;
data combined;
merge file1_sorted (in = a) file2_sorted;
by obs;
if a;
run;
gives us this:
obs var1 var2 var3 var4 var5 1 1 1 3 4 2 2 5 2 4 2 3 3 5 6 4 4 5 4 2 6 5 6 7 8
HTH.
@HB This is incorrect you can merge without a BY statement if you're doing a side by side merge.
data want;
merge file1 file2;
run;
proc print data=want;run;
SAS Output
The SAS System |
Obs | obs | var1 | var2 | var3 | var4 | var5 |
---|---|---|---|---|---|---|
1 | 1 | 1 | . | 1 | 3 | 4 |
2 | 2 | 2 | 5 | 2 | 4 | 2 |
3 | 3 | 3 | 5 | . | 6 | . |
4 | 4 | 4 | 5 | 4 | 2 | 6 |
5 | 5 | 6 | 7 | 8 | . | . |
@Reeza I don't consider it incorrect. It may be less terse, but the verbosity helps me remember all the components to the statement.
It's also working code.
If you know that your first data set is the one with more observations, you could also code it this way:
data want;
set have1;
if done=0 then set have2 end=done;
output;
call missing (of _all_);
run;
Just merge them, without any by statement.
data want;
merge data1 data2;
run;
May be after all the discussion in this tread you also want to have a read of the docu to get a bit more insight of the options.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.