BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
genemroz
Quartz | Level 8

Esteemed Advisers

 

I need some help with selectively deleting observations by groups.  Below are exemplar “have”and “want” datasets.  The goal is to retain only one Member that contains “A” in both the Alpha and Beta Groups (which one is retained is irrelevant as they are redundant).  

 

Thanks for taking a look and any advice you have to offer,.

Gene

 

data have;
input Group $ Member $;
Alpha A1
Alpha A2
Alpha B1
Alpha B2
Alpha A3
Alpha C1
Alpha C2
Beta A1
Beta A2
Beta D1
Beta D2
Beta E1
Beta E2
Beta A3
;

data want;
input Group $ Member $;
Alpha A1
Alpha B1
Alpha B2
Alpha C1
Alpha C2
Beta A1
Beta D1
Beta D2
Beta E1
Beta E2
;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Assuming the data is sorted by GROUP you just to count the number of names MEMBER starts with A per group.

data want;
  set have;
  by group ;
  if first.group then n_a=0;
  if member=:'A' then do;
    n_a+1;
    if n_a>1 then delete;
  end;
run;

View solution in original post

4 REPLIES 4
Ksharp
Super User
data have;
input Group $ Member $;
cards;
Alpha A1
Alpha A2
Alpha B1
Alpha B2
Alpha A3
Alpha C1
Alpha C2
Beta A1
Beta A2
Beta D1
Beta D2
Beta E1
Beta E2
Beta A3
;
proc sql;
create table temp as
select distinct member from have group by member having count(distinct group)>1;

create table want as
select * from have where member in (select member from temp(obs=1))
union
select * from have where member not in (select member from temp);
quit;
Patrick
Opal | Level 21

Here another option. Not as smart as what @Ksharp proposes but should perform better (less sorting and lookup) in case you've got a lot of data and performance is relevant to you.

data have;
  input Group $ Member $;
cards;
Alpha A1
Alpha A2
Alpha B1
Alpha B2
Alpha A3
Alpha C1
Alpha C2
Beta A1
Beta A2
Beta D1
Beta D2
Beta E1
Beta E2
Beta A3
;

proc sort 
  data=have(keep=group member where=(group in ('Alpha','Beta') and member =:'A')) 
  out=inter 
  presorted;
  by group member;
run;

%let memA_sel=; data _null_; merge inter(in=ina where=(group='Alpha')) inter(in=inb where=(group='Beta' )) ; by member; if ina and inb then do; call symputx('memA_sel',member); stop; end; run; data want; set have; if group in ('Alpha','Beta') and member=:'A' and member ne "&memA_sel" then delete; run; proc print data=want; run;

 If you can be sure that your data have is already sorted by group and member (or by member as first variable) then use below code version.

%let memA_sel=;
data _null_;
  merge 
    have(in=ina where=(group='Alpha' and member =:'A'))
    have(in=inb where=(group='Beta'  and member =:'A'))
    ;
  by member;
  if ina and inb then
    do;
      call symputx('memA_sel',member);
      stop;
    end;
run;

data want;
  set have;
  if group in ('Alpha','Beta') and member=:'A' and member ne "&memA_sel" then delete;
run;

 

Tom
Super User Tom
Super User

Assuming the data is sorted by GROUP you just to count the number of names MEMBER starts with A per group.

data want;
  set have;
  by group ;
  if first.group then n_a=0;
  if member=:'A' then do;
    n_a+1;
    if n_a>1 then delete;
  end;
run;
genemroz
Quartz | Level 8

Thanks to all who offered solutions.  All were valid and I chose to accept Tom's only because of its parsimonious use of code.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 721 views
  • 1 like
  • 4 in conversation