The dataset looks something like this
VENUE SPORT
A Hockey
B Basketball
C Softball
B Hockey
C Softball
A Football
I need to select only venues with one unique sport played there, so in this case Venue C would get selected because only one sport (softball) is played there. Any help would be greatly appreciated. Thanks.
data have; input VENUE $ SPORT : $20.; cards; A Hockey B Basketball C Softball B Hockey C Softball A Football ; run; proc sql; create table want as select * from have group by venue having count(distinct sport)=1; quit;
proc sort data=have dupout=want nodup;
by venue sport;
run;
want dataset will have your expected output.
Nodup deletes duplicated observations. I'm after retaining records with a single unique sport per venue and removing those with more than than one unique observation. Nodup retains venues with multiple unique observations.
by data step
proc sort data=have nodupkey;
by venue sport;
run;
data want;
set have;
by venue sport;
retain cnt;
if first.venue then cnt=1;
else cnt=cnt+1;
if last.venue and cnt<=1;
run;
Another approach:
proc freq data=have;
tables venue * sport / noprint out=combinations;
run;
proc freq data=combinations;
tables venue / noprint out=want (where=(count=1));
run;
proc print data=want;
run;
SQL again offers a one step solution.
proc sql;
create table want as
select venue, count(distinct sport) as num_sports
from have
group by venue
where calculated num_sports=1;
quit;
data have; input VENUE $ SPORT : $20.; cards; A Hockey B Basketball C Softball B Hockey C Softball A Football ; run; proc sql; create table want as select * from have group by venue having count(distinct sport)=1; quit;
This solution worked great. It's particularly useful if duplicate records need to be retain (like dollar amounts for each row). Thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.