How to keep required OBS while merging datasets

Super Contributor
Posts: 272

Dear all,

I need to merge datasets ONE and FAIL to get dataset THREE. With my code I am getting only id=2 and id=4 in dataset THREE.

But some of these subjects were screened earlier under a different subject number (preid). I need to exclude a subject if both id and preid have sf='Y'.

If preid has sf='N' and id has sf='Y', then set id=preid and output the record to dataset THREE.

Please help. Thanks.

 

data one;
  infile datalines truncover;  /* keep the last row even though preid is absent */
  input id term $ preid;
  datalines;
1 a 11
2 b 22
3 c 33
4 d
;
run;

data fail;
  input id sf $;
  datalines;
1 Y
2 N
3 Y
4 N
11 N
22 N
33 Y
;
run;

data three;
merge one(in=a) fail(in=b);
by id;
if a;
if SF='Y' then delete;
run;

 

In my output dataset I need all subjects with sf='Y'.

Output needed:

id  term
2   b
4   d
11  a

Output I am getting:

id  term
2   b
4   d

PROC Star
Posts: 7,363

Re: How to keep required OBS while merging datasets

Sounds like you want something like:

 

data three;
  merge one(in=a) fail(in=b);
  by id;
  if a;
run;

proc sort data=three;
  by preid;
run;

data three;
  merge three(in=a) fail (rename=(id=preid sf=sf2) in=b);
  by preid;
  if a;
  if sf eq 'Y' and sf2 eq 'N' then id=preid;
  if not (sf eq 'Y' and sf2 eq 'Y');
run;
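
A quick way to compare the result with the requested output, as a sketch only: it assumes the two steps above have run and that ONE and FAIL contain exactly the rows listed in the question. Sorting by id and printing just id and term should then show ids 2, 4 and 11.

```sas
/* Sketch: put THREE back in id order and show only the requested columns. */
proc sort data=three;
  by id;
run;

proc print data=three noobs;
  var id term;   /* preid, sf and sf2 are still in THREE but not printed */
run;
```

If the helper variables are not wanted in the final dataset, a drop=preid sf sf2 dataset option on the second DATA step would remove them.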

Art, CEO, AnalystFinder.com

Valued Guide
Posts: 797

Re: How to keep required OBS while merging datasets

You could make a dataset, call it EXCLUDES, consisting of all the ID/PREID pairs that have double SF='Y':

 

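proc sql;
  /* Pair every ID that has sf='Y' with every higher ID that also has sf='Y'. */
  /* Pairs that do not occur in ONE are harmless; they drop out in the merge. */
  create table excludes
  as select a.id, b.id as preid
  from fail as a full join fail as b
  on a.sf=b.sf
  where a.sf='Y' and a.id<b.id
  order by a.id, b.id;
quit;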

 

Then just merge it with dataset ONE and eliminate all obs from ONE that are also found in EXCLUDES:

 

proc sort data=one;
  by id preid;
run;

data three;
  merge one excludes (in=inex);
  by id preid;
  if inex=0;
run;

 

 

This code has the advantage, IMO, that it makes the purpose of the program more self-evident.
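
Note that the EXCLUDES approach covers only the exclusion rule, not the renumbering (id=preid). A one-step alternative that handles both rules could look like the sketch below. It is not taken from any reply in this thread; the table name three_sql and the aliases o, f1 and f2 are made up for illustration, and it assumes FAIL has at most one row per id.

```sas
proc sql;
  create table three_sql as
  select case
           when f1.sf='Y' and f2.sf='N' then o.preid  /* id failed, preid did not: take preid */
           else o.id
         end as id,
         o.term
  from one as o
       left join fail as f1 on o.id    = f1.id   /* screening result for id    */
       left join fail as f2 on o.preid = f2.id   /* screening result for preid */
  where not (f1.sf='Y' and f2.sf='Y')            /* drop subjects that failed twice */
  order by 1;                                    /* sort by the computed id column */
quit;
```

A subject with no preid finds no match in the second join, so f2.sf is blank and the row is judged on f1.sf alone.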

Super User
Posts: 10,500

Re: How to keep required OBS while merging datasets


knveraraju91 wrote:

[...]

data three;
merge one(in=a) fail(in=b);
by id;
if a;
if SF='Y' then delete;
run;

In my output dataset I need all subject with sf='Y'.


Note that the statement "if SF='Y' then delete;" in your code, which I highlighted in red, says to delete a record whenever SF has the very value you say you want.

Super Contributor
Posts: 272

Re: How to keep required OBS while merging datasets

Sorry. That statement should be: In my output dataset I need all subjects with sf='N'.

Super User
Posts: 10,500

Re: How to keep required OBS while merging datasets

Subsetting if:

 

if sf='N';

 

keeps only those records where the IF condition is true. Note that the location of the IF statement in your code can be very important if you manipulate any of the variables used. If you might have a lowercase n and want to keep those records, either upcase the variable earlier in the code or upcase it in the comparison:

if upcase(sf)='N';

Also, if the SF variable might have or acquire leading blanks, you want to strip them, as " N" is not equal to "N":

if upcase(strip(sf)) = 'N';
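
To see the difference between the three comparisons, here is a small self-contained sketch. The dataset names demo and keep_n and the test values are invented for illustration; the $char3. informat is used only so that the leading blank in the third row is preserved on input.

```sas
data demo;
  input sf $char3.;   /* $char. keeps leading blanks; the default $ informat would trim them */
  datalines;
N
n
 N
Y
;
run;

data keep_n;
  set demo;
  /* sf='N' alone would keep only the first row;            */
  /* upcase(sf)='N' would keep the first two rows;          */
  /* upcase(strip(sf))='N' keeps the first three rows.      */
  if upcase(strip(sf)) = 'N';
run;
```

Because this subsetting IF comes right after the SET statement, it tests the values as they were read, before any later statement can change SF.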
