Re: conditional set statement

Elkridge_SAS · Posted 02-04-2011 02:01 PM

data test1;
input x z $;
datalines;
1 A
2 B
3 C
;

data test2;
input x r $;
datalines;
1 R
2 S
3 T
4 W
5 G
;

For the two datasets above, I am trying to concatenate them to produce a third which will only have x variables found on test1. For instance, the kind of thing you would easily do for a merge by using the subsetting if statement. I can't seem to find a way to do the same with set statement or anything else.

For instance, I'll like to obtain an output like this:

x z r
1 A .
1 . R
2 B .
2 . S
3 C .
3 . T

Thanks for any help. Message was edited by: Elkridge_SAS

Peter_C · Posted 02-04-2011 03:00 PM

SET with a BY statement and an IN= to detect the test1 and from that create and retain a flag when test1 is the first.id
Data result ;
Set test1( in= in1) test2;
By X;
If first.X then flag = in1;
Retain flag;
Drop flag;
If in1;
Run;

Elkridge_SAS · Posted 02-04-2011 03:06 PM

No this gives me only 3 records. I want 5. This might help though. I'll try and tweak it. If you notice any corrections that should be made, I'll be glad to receive your help.

Peter_C · Posted 02-04-2011 06:42 PM

Elkridge
sorry for posting untested code.A small change provides what you wanted.
I created FLAG and failed to test it, testing in1 instead
The only change
if flag;
Data result ;
Set test1( in= in1) test2;
By X;
If first.X then flag = in1;
Retain flag;
Drop flag;
If flag;
Run;
option ls=64 nocenter dtreset pageno=1 ;
title 'mea culpa';
proc print;
run;[pre]mea culpa 23:40 Friday, February 4, 2011 1

Obs x z r

1 1 A
2 1 R
3 2 B
4 2 S
5 3 C
6 3 T[/pre]

chang_y_chung_hotmail_com · Posted 02-04-2011 03:12 PM

Then one way is to merge them first, then output twice, missing out one variable at a time.

/* test datasets */

data one(drop=r) two(drop=z);

do until(x = 3);

input x z $ @@;

output one;

end;

input;

do until(x = 5);

input x r $ @@;

output two;

end;

stop;

cards;

1 A 2 B 3 C

1 R 2 S 3 T 4 W 5 G

;

run;

/* sort, merge, then output twice missing out either z or r */

proc sort data=one out=ones nodupkey;

by x;

run;

proc sort data=two out=twos nodupkey;

by x;

run;

data three;

merge ones(in=one) twos(in=two);

by x;

if one and two;

array temp(1) $1 _temporary_;

temp(1) = r;

call missing(r);

output;

r = temp(1);

call missing(z);

output;

run;

/* check */

proc print data=three noobs;

run;

/* on log

x z r

1 A

1 R

2 B

2 S

3 C

3 T

*/

Elkridge_SAS · Posted 02-04-2011 03:15 PM

This does exactly what I wanted. Thanks to all those who helped.

data result(drop=x_keep) ;
set test1(in=in1) test2;
by X;
if first.X and in1 then x_keep = X;
retain x_keep;
if X=x_keep;
run;

ballardw · Posted 02-04-2011 04:47 PM

proc sql;
create table test3 as
select * from test1
outer union corr
select b.*
from (select distinct x from test1) as a left join test2 as b on a.x=b.x

order by x;
quit;

May execute faster on larger datasets.

deleted_user · Posted 02-04-2011 06:07 PM

Hello,

Here is another solution which also accounts (like chang's) for different values for the key variable in both data sets not just in one:

[pre]

data test1;
input x z $;
datalines;
1 A
2 B
3 C
7 G
9 U
;

data test2;
input x r $;
datalines;
1 R
2 S
3 T
4 W
5 G
8 P
9 Q
;

data test3;

retain _x a b;

set test1 (in=t1) test2 (in=t2);
by x;

if t1 then do;
_x=x;c+1;
a=c;
return;
end;

if t2 then do;
d+1;
b=d;
end;

if x ne _x then return;

set test1 point=a ;
call missing(r);
output;
set test2 point=b;
call missing(z);
output;

drop _x c d;

run;

[/pre]

Marius

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away