Dear,
I need to merge datasets one and fail to get dataset three. With my code I am getting only id=2 and id=4 in dataset three.
But some of these subjects were screened before under a different subject number (preid). I need to exclude a subject if both id and preid have sf='Y'.
If preid has sf='N' and id has sf='Y', then set id=preid and output to dataset three.
Please help. Thanks.
data one;
input id term $ preid;
datalines;
1 a 11
2 b 22
3 c 33
4 d
;
data fail;
input id sf$ ;
datalines;
1 Y
2 N
3 Y
4 N
11 N
22 N
33 Y
;
data three;
merge one(in=a) fail(in=b);
by id;
if a;
if SF='Y' then delete;
run;
In my output dataset I need all subjects with sf='Y'.
Output needed:
id term
2 b
4 d
11 a
Output I am getting:
id term
2 b
4 d
Sounds like you want something like:
data three;
merge one(in=a) fail(in=b);
by id;
if a;
run;
proc sort data=three;
by preid;
run;
data three;
merge three(in=a) fail(rename=(id=preid sf=sf2) in=b);
by preid;
if a;
if sf eq 'Y' and sf2 eq 'N' then id=preid;
if not (sf eq 'Y' and sf2 eq 'Y');
run;
Art, CEO, AnalystFinder.com
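For comparison, the same rules can probably be written as a single PROC SQL step that looks up sf twice, once for id and once for preid. This is only a sketch: the table name three_sql and the aliases o, f1 and f2 are made up for illustration, and it assumes (as the data-step answer does) that a subject with sf='Y' and no preid is kept.

```sas
proc sql;
  create table three_sql as
  select case
           when f1.sf = 'Y' and f2.sf = 'N' then o.preid  /* fall back to the earlier number */
           else o.id
         end as id,
         o.term
  from one as o
       left join fail as f1 on o.id    = f1.id   /* sf for the current id */
       left join fail as f2 on o.preid = f2.id   /* sf for the earlier id */
  where not (f1.sf = 'Y' and f2.sf = 'Y')        /* drop subjects that failed both times */
  order by 1;                                    /* sort by the recomputed id */
quit;
```

With the sample data from the question this should leave ids 2, 4 and 11, which is the output asked for.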
You could make a dataset, call it EXCLUDES, consisting of all the ID/PREID pairs that have double SF='Y':
proc sql;
create table excludes
as select a.id, b.id as preid
from fail as a full join fail as b
on a.sf=b.sf
where a.sf='Y' and a.id<b.id
order by a.id,b.id;
quit;
Then just merge it with dataset ONE and eliminate all obs from ONE that are also found in EXCLUDES:
proc sort data=one;
by id preid;
run;
data three;
merge one excludes (in=inex);
by id preid;
if inex=0;
run;
This code has the advantage, IMO, that it makes the purpose of the program more self-evident.
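Note that the merge above only handles the exclusion rule. The remaining rule from the question (set id=preid when id has sf='Y' and preid has sf='N') still needs a lookup against FAIL. One possible way to do that is a hash lookup, sketched below; the names three_swapped, sf_id and sf_pre are hypothetical and the step assumes dataset three was created by the merge above.

```sas
data three_swapped;
  if 0 then set fail(keep=sf);        /* define sf in the PDV without reading any data */
  if _n_ = 1 then do;
    declare hash h(dataset: 'fail');  /* load id -> sf pairs from FAIL */
    h.defineKey('id');
    h.defineData('sf');
    h.defineDone();
  end;
  set three;
  call missing(sf);
  rc = h.find();                      /* look up sf for the current id */
  sf_id = sf;
  call missing(sf);
  rc = h.find(key: preid);            /* look up sf for the earlier id */
  sf_pre = sf;
  if sf_id = 'Y' and sf_pre = 'N' then id = preid;
  keep id term;
run;
```

The result keeps the row order of three, so a PROC SORT by id may still be wanted afterwards.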
@knveraraju91 wrote:
Dear,
I need to merge datasets one and fail to get dataset three. With my code I am getting only id=2 and id=4 in dataset three.
But some of these subjects were screened before under a different subject number (preid). I need to exclude a subject if both id and preid have sf='Y'.
If preid has sf='N' and id has sf='Y', then set id=preid and output to dataset three.
Please help. Thanks.
data one;
input id term $ preid;
datalines;
1 a 11
2 b 22
3 c 33
4 d
;
data fail;
input id sf$ ;
datalines;
1 Y
2 N
3 Y
4 N
11 N
22 N
33 Y
;
data three;
merge one(in=a) fail(in=b);
by id;
if a;
if SF='Y' then delete;
run;
In my output dataset I need all subjects with sf='Y'.
Note that the statement if SF='Y' then delete; in your code deletes exactly the records with the value you say you want.
Sorry. That statement should be: In my output dataset I need all subjects with sf='N'.
Subsetting if:
if sf='N';
keeps only those records where the IF condition is true. Note that the location of the IF statement in your code can be very important if you manipulate any of the variables used. If you might have a lowercase n and want to keep those records, either upcase the variable earlier in the code or in the comparison:
if upcase(sf)='N';
Also, if the SF variable might have or acquire leading blanks, then you want to strip them, as " N" is not equal to "N":
if upcase(strip(sf)) = 'N';
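A small made-up example may help show the difference. The dataset names demo and keep_n and the four test values are invented purely for illustration; they are not from the question.

```sas
data demo;
  length sf $3;
  id = 1; sf = 'N';  output;   /* plain uppercase */
  id = 2; sf = 'n';  output;   /* lowercase */
  id = 3; sf = ' N'; output;   /* leading blank */
  id = 4; sf = 'Y';  output;   /* should be dropped */
run;

data keep_n;
  set demo;
  if upcase(strip(sf)) = 'N';  /* subsetting IF: only rows where this is true are output */
run;
```

Here keep_n should contain ids 1, 2 and 3. A plain if sf='N'; would keep only id 1.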