I have an input dataset with two columns(system and class). I want to create a dataset from that containing only those systems that only have a class of x. The same system can have multiple entries. If one entry has a class of X, and another entry a different class, then I don't want the system returned. There are 36 potential values of class(a-z and 0-9).
My input looks like this:
data have;
input system $ class $ ;
datalines;
fred a
fred x
wilma a
wilma b
wilma b
betty x
betty x
;
run;
From this I would only want Betty returned. As Fred has a class of both A & X, while Wilma doesn't have X, I don't want either of them. Betty is the only one that only has a class of X.
I'm assuming PROC SQL would be the way to go here. Though I'm struggling to get a query that can return what I'm after.
HI @serge68 A way to approach is to check
1. Distinct values of class within BY GROUP system
2. We need the distinct value as 1 and that value should be 'x'
So in SQL syntax
data have;
input system $ class $ ;
datalines;
fred a
fred x
wilma a
wilma b
wilma b
betty x
betty x
;
run;
proc sql;
create table want as
select *
from have
group by system
having count(distinct class)=1 and class='x';
quit;
Try this
data have;
input system $ class $ ;
datalines;
fred a
fred x
wilma a
wilma b
wilma b
betty x
betty x
;
run;
proc sql;
create table want as
select * from have
group by system
having sum(class='x')=count(system);
quit;
Assuming you want to select rows for any case where there is only one class per system and not only when it's 'x'
proc sql;
create table want as
select *
from have
group by system
having count(distinct class)=1
;
quit;
HI @serge68 A way to approach is to check
1. Distinct values of class within BY GROUP system
2. We need the distinct value as 1 and that value should be 'x'
So in SQL syntax
data have;
input system $ class $ ;
datalines;
fred a
fred x
wilma a
wilma b
wilma b
betty x
betty x
;
run;
proc sql;
create table want as
select *
from have
group by system
having count(distinct class)=1 and class='x';
quit;
Data step solution:
data want;
ind_x = 0;
do until (last.system);
set have;
by system notsorted;
if ind_x ne . and class = 'x' then ind_x = 1;
if class ne 'x' then ind_x = .;
end;
do until (last.system);
set have;
by system notsorted;
if ind_x then output;
end;
drop ind_x;
run;
A DATA step can process pre-grouped data and keep one row for each system having only class='x' cases
DATA want(keep=system);
do until (last.system);
set have;
by system notsorted;
if not not_x then not_x = class ne 'x';
end;
if not not_x;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.