DATA Step, Macro, Functions and more

Find if a group contains a value to create new variable (SAS 9.4)

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

Find if a group contains a value to create new variable (SAS 9.4)

I want to find if a group has a certain value and delete the group if it does.

I have: 

WBabc
WBasdf
WBcwe
APsdf
APxxxxxx
APter
TNasdf
TNad
TNsdr
TNwer
ODsser
ODwry
ODgyu
ODxxxxxx

 

I want to find if a group has 'xxxxxx' vaue and delete the entire group. I want: 

WBabc
WBasdf
WBcwe
TNasdf
TNad
TNsdr
TNwer

 

Please advice. 


Accepted Solutions
Solution
‎12-27-2017 03:49 PM
PROC Star
Posts: 8,149

Re: Find if a group contains a value to create new variable (SAS 9.4)

Posted in reply to putteringpluie

Two passes are needed in both the datastep and with proc sql, first to identify whether any record in a by group meets the condition, then a second to actually select the records.

 

The solution I suggested uses what is called a DOW loop. It takes over the normal record by record processing normally accomplished by SAS in a datastep.

 

The first part:

data want;
  do until (last.group);
    set have;
    by group notsorted;
    if value='xxxxxx' then drop=1;
  end;

goes through each by group, identifies if any of its records contain 'xxxxxx' for the variable value and, if at least one does, it sets the value of drop to equal 1.

 

The second part:

  do until (last.group);
    set have;
    by group notsorted;
    if not drop then output;
  end;
run;

is run as soon as the last record in a by group is read and only works on that by group. The value of drop is available throughout the processing of the by group. When it get to the last record of a by group it then passes control back to the first part of the DOW loop to begin processing the next group. When it does so, the value of drop is initially set to missing for the next by group.

 

A lot more is going on and can be done using a DOW loop. You can read about it at: support.sas.com/resources/papers/proceedings12/052-2012.pdf and support.sas.com/resources/papers/proceedings12/156-2012.pdf

 

I find the DOW loop to be more versatile than trying to do the same thing in SQL and it runs more than twice as fast as its SQL alternative.

 

Art, CEO, AnalystFinder.com

 

View solution in original post


All Replies
PROC Star
Posts: 1,449

Re: Find if a group contains a value to create new variable (SAS 9.4)

Posted in reply to putteringpluie
What code have you tried? Are you looking for a DATA step solution or also open to PROC SQL solution?
Occasional Contributor
Posts: 9

Re: Find if a group contains a value to create new variable (SAS 9.4)

I'm new to SAS as I'm coming from the world of stata. My logic in stata would be generate a tag of 1 if a group has the certain value. Use "egen" to create another variable which is the max value in a group. And delete if the group max value is 1. I don't know which function would generate the max value by group in SAS. Proc sql solution will also be helpful. 
PROC Star
Posts: 1,584

Re: Find if a group contains a value to create new variable (SAS 9.4)

Posted in reply to putteringpluie

data have;
input grp $ value $;
datalines;
WB abc
WB asdf
WB cwe
AP sdf
AP xxxxxx
AP ter
TN asdf
TN ad
TN sdr
TN wer
OD sser
OD wry
OD gyu
OD xxxxxx
;

proc sql;
create table want(drop=v) as
select *, max(value = 'xxxxxx') as v
from have
group by grp
having v = 0;
quit;

PROC Star
Posts: 1,584

Re: Find if a group contains a value to create new variable (SAS 9.4)

Posted in reply to novinosrin

This is cleaner than previous:

 

proc sql;
create table want as
select *
from have
group by grp
having max(value = 'xxxxxx')=0;
quit;

PROC Star
Posts: 1,449

Re: Find if a group contains a value to create new variable (SAS 9.4)

Posted in reply to putteringpluie

This can be done with DATA step, but the PROC SQL solution is probably more straight forward:

 

data have;
  input group $2. value $10.;
  cards;
WB abc 
WB asdf 
WB cwe 
AP sdf 
AP xxxxxx 
AP ter 
TN asdf 
TN ad 
TN sdr 
TN wer 
OD sser 
OD wry 
OD gyu 
OD xxxxxx 
;

proc sql;
  create table want as 
  select *
  from have 
  group by group
  having max(value='xxxxxx')=0
  ;
quit;

On the having clause, SAS will evaluate value='xxxxxx' as either 1 (true) or 0 (false).  You can use the aggregate MAX() function to find those records where the maximum value within the group is 0 (i.e. no records have a value of 'xxxxxx').

 

 

This works in PROC SQL (but would not work in most SQL implementations) because in SAS you are allowed to include non-aggregated variables on the select clause even though there is a  group by clause.  You will see a note in the log that SAS "remerged" the records.

PROC Star
Posts: 8,149

Re: Find if a group contains a value to create new variable (SAS 9.4)

Posted in reply to putteringpluie

Here is one way to do it using a datastep:

data have;
  infile cards dlm='09'x;
  input (group value) ($);
  cards;
WB	abc
WB	asdf
WB	cwe
AP	sdf
AP	xxxxxx
AP	ter
TN	asdf
TN	ad
TN	sdr
TN	wer
OD	sser
OD	wry
OD	gyu
OD	xxxxxx
;

data want;
  do until (last.group);
    set have;
    by group notsorted;
    if value='xxxxxx' then drop=1;
  end;
  do until (last.group);
    set have;
    by group notsorted;
    if not drop then output;
  end;
run;

Art, CEO, AnalystFinder.com

 

Occasional Contributor
Posts: 9

Re: Find if a group contains a value to create new variable (SAS 9.4)

You solution is interesting. Just so I better understand, can you please explain, how this is working and why we need two passes at data instead of one. 

Solution
‎12-27-2017 03:49 PM
PROC Star
Posts: 8,149

Re: Find if a group contains a value to create new variable (SAS 9.4)

Posted in reply to putteringpluie

Two passes are needed in both the datastep and with proc sql, first to identify whether any record in a by group meets the condition, then a second to actually select the records.

 

The solution I suggested uses what is called a DOW loop. It takes over the normal record by record processing normally accomplished by SAS in a datastep.

 

The first part:

data want;
  do until (last.group);
    set have;
    by group notsorted;
    if value='xxxxxx' then drop=1;
  end;

goes through each by group, identifies if any of its records contain 'xxxxxx' for the variable value and, if at least one does, it sets the value of drop to equal 1.

 

The second part:

  do until (last.group);
    set have;
    by group notsorted;
    if not drop then output;
  end;
run;

is run as soon as the last record in a by group is read and only works on that by group. The value of drop is available throughout the processing of the by group. When it get to the last record of a by group it then passes control back to the first part of the DOW loop to begin processing the next group. When it does so, the value of drop is initially set to missing for the next by group.

 

A lot more is going on and can be done using a DOW loop. You can read about it at: support.sas.com/resources/papers/proceedings12/052-2012.pdf and support.sas.com/resources/papers/proceedings12/156-2012.pdf

 

I find the DOW loop to be more versatile than trying to do the same thing in SQL and it runs more than twice as fast as its SQL alternative.

 

Art, CEO, AnalystFinder.com

 

PROC Star
Posts: 1,215

Re: Find if a group contains a value to create new variable (SAS 9.4)

Posted in reply to putteringpluie
data have;
  input group $2. value $10.;
  cards;
WB abc 
WB asdf 
WB cwe 
AP sdf 
AP xxxxxx 
AP ter 
TN asdf 
TN ad 
TN sdr 
TN wer 
OD sser 
OD wry 
OD gyu 
OD xxxxxx 
;

proc sql;
   create table want as
   select *
   from have
   where group not in
   (select distinct group from have where value="xxxxxx");
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 344 views
  • 4 likes
  • 5 in conversation