- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
input x z $;
datalines;
1 A
2 B
3 C
;
data test2;
input x r $;
datalines;
1 R
2 S
3 T
4 W
5 G
;
For the two datasets above, I am trying to concatenate them to produce a third which will only have x variables found on test1. For instance, the kind of thing you would easily do for a merge by using the subsetting if statement. I can't seem to find a way to do the same with set statement or anything else.
For instance, I'll like to obtain an output like this:
x z r
1 A .
1 . R
2 B .
2 . S
3 C .
3 . T
Thanks for any help. Message was edited by: Elkridge_SAS
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Data result ;
Set test1( in= in1) test2;
By X;
If first.X then flag = in1;
Retain flag;
Drop flag;
If in1;
Run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
sorry for posting untested code.A small change provides what you wanted.
I created FLAG and failed to test it, testing in1 instead
The only change
if flag;
Data result ;
Set test1( in= in1) test2;
By X;
If first.X then flag = in1;
Retain flag;
Drop flag;
If flag;
Run;
option ls=64 nocenter dtreset pageno=1 ;
title 'mea culpa';
proc print;
run;[pre]mea culpa 23:40 Friday, February 4, 2011 1
Obs x z r
1 1 A
2 1 R
3 2 B
4 2 S
5 3 C
6 3 T[/pre]
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
/* test datasets */
data one(drop=r) two(drop=z);
do until(x = 3);
input x z $ @@;
output one;
end;
input;
do until(x = 5);
input x r $ @@;
output two;
end;
stop;
cards;
1 A 2 B 3 C
1 R 2 S 3 T 4 W 5 G
;
run;
/* sort, merge, then output twice missing out either z or r */
proc sort data=one out=ones nodupkey;
by x;
run;
proc sort data=two out=twos nodupkey;
by x;
run;
data three;
merge ones(in=one) twos(in=two);
by x;
if one and two;
array temp(1) $1 _temporary_;
temp(1) = r;
call missing(r);
output;
r = temp(1);
call missing(z);
output;
run;
/* check */
proc print data=three noobs;
run;
/* on log
x z r
1 A
1 R
2 B
2 S
3 C
3 T
*/
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data result(drop=x_keep) ;
set test1(in=in1) test2;
by X;
if first.X and in1 then x_keep = X;
retain x_keep;
if X=x_keep;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
create table test3 as
select * from test1
outer union corr
select b.*
from (select distinct x from test1) as a left join test2 as b on a.x=b.x
order by x;
quit;
May execute faster on larger datasets.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here is another solution which also accounts (like chang's) for different values for the key variable in both data sets not just in one:
[pre]
data test1;
input x z $;
datalines;
1 A
2 B
3 C
7 G
9 U
;
data test2;
input x r $;
datalines;
1 R
2 S
3 T
4 W
5 G
8 P
9 Q
;
data test3;
retain _x a b;
set test1 (in=t1) test2 (in=t2);
by x;
if t1 then do;
_x=x;c+1;
a=c;
return;
end;
if t2 then do;
d+1;
b=d;
end;
if x ne _x then return;
set test1 point=a ;
call missing(r);
output;
set test2 point=b;
call missing(z);
output;
drop _x c d;
run;
[/pre]
Marius