DATA Step, Macro, Functions and more

Selecting rows with only one unique value per subject

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Selecting rows with only one unique value per subject

The dataset looks something like this 

 

VENUE  SPORT

A           Hockey

B           Basketball

C           Softball 

B           Hockey 

C           Softball

A           Football 

 

I need to select only venues with one unique sport played there, so in this case Venue C would get selected because only one sport (softball) is played there. Any help would be greatly appreciated. Thanks.  


Accepted Solutions
Solution
‎01-03-2017 11:56 AM
Super User
Posts: 10,044

Re: Selecting rows with only one unique value per subject


data have;
input VENUE $ SPORT : $20.;
cards;
A           Hockey
B           Basketball
C           Softball 
B           Hockey 
C           Softball
A           Football 
;
run;
proc sql;
create table want as
 select *
  from have
   group by venue
    having count(distinct sport)=1;
quit;

View solution in original post


All Replies
Trusted Advisor
Posts: 1,137

Re: Selecting rows with only one unique value per subject

[ Edited ]

proc sort data=have dupout=want nodup;

by venue sport;

run;

 

want dataset will have your expected output.

Thanks,
Jag
New Contributor
Posts: 3

Re: Selecting rows with only one unique value per subject

Posted in reply to Jagadishkatam

Nodup deletes duplicated observations. I'm after retaining records with a single unique sport per venue and removing those with more than than one unique observation. Nodup retains venues with multiple unique observations. 

Trusted Advisor
Posts: 1,137

Re: Selecting rows with only one unique value per subject

Posted in reply to Jagadishkatam

by data step

 

proc sort data=have nodupkey;
by venue sport;
run;

data want;
set have;
by venue sport;
retain cnt;
if first.venue then cnt=1;
else cnt=cnt+1;
if last.venue and cnt<=1;
run;
Thanks,
Jag
Super User
Posts: 5,516

Re: Selecting rows with only one unique value per subject

Another approach:

 

proc freq data=have;

tables venue * sport / noprint out=combinations;

run;

 

proc freq data=combinations;

tables venue / noprint out=want (where=(count=1));

run;

 

proc print data=want;

run;

Super User
Posts: 19,860

Re: Selecting rows with only one unique value per subject

SQL again offers a one step solution.

 

proc sql;

create table want as

select venue,  count(distinct sport) as num_sports

from have

group by venue

where calculated num_sports=1;

quit;

Solution
‎01-03-2017 11:56 AM
Super User
Posts: 10,044

Re: Selecting rows with only one unique value per subject


data have;
input VENUE $ SPORT : $20.;
cards;
A           Hockey
B           Basketball
C           Softball 
B           Hockey 
C           Softball
A           Football 
;
run;
proc sql;
create table want as
 select *
  from have
   group by venue
    having count(distinct sport)=1;
quit;

New Contributor
Posts: 3

Re: Selecting rows with only one unique value per subject

This solution worked great. It's particularly useful if duplicate records need to be retain (like dollar amounts for each row). Thank you!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 299 views
  • 3 likes
  • 5 in conversation