I have an input dataset with two columns(system and class). I want to create a dataset from that containing only those systems that only have a class of x. The same system can have multiple entries. If one entry has a class of X, and another entry a different class, then I don't want the system returned. There are 36 potential values of class(a-z and 0-9).
My input looks like this:
data have;
input system $ class $ ;
datalines;
fred a
fred x
wilma a
wilma b
wilma b
betty x
betty x
;
run;
From this I would only want Betty returned. As Fred has a class of both A & X, while Wilma doesn't have X, I don't want either of them. Betty is the only one that only has a class of X.
I'm assuming PROC SQL would be the way to go here. Though I'm struggling to get a query that can return what I'm after.
HI @serge68 A way to approach is to check
1. Distinct values of class within BY GROUP system
2. We need the distinct value as 1 and that value should be 'x'
So in SQL syntax
data have;
input system $ class $ ;
datalines;
fred a
fred x
wilma a
wilma b
wilma b
betty x
betty x
;
run;
proc sql;
create table want as
select *
from have
group by system
having count(distinct class)=1 and class='x';
quit;
Try this
data have;
input system $ class $ ;
datalines;
fred a
fred x
wilma a
wilma b
wilma b
betty x
betty x
;
run;
proc sql;
create table want as
select * from have
group by system
having sum(class='x')=count(system);
quit;
Assuming you want to select rows for any case where there is only one class per system and not only when it's 'x'
proc sql;
create table want as
select *
from have
group by system
having count(distinct class)=1
;
quit;
HI @serge68 A way to approach is to check
1. Distinct values of class within BY GROUP system
2. We need the distinct value as 1 and that value should be 'x'
So in SQL syntax
data have;
input system $ class $ ;
datalines;
fred a
fred x
wilma a
wilma b
wilma b
betty x
betty x
;
run;
proc sql;
create table want as
select *
from have
group by system
having count(distinct class)=1 and class='x';
quit;
Data step solution:
data want;
ind_x = 0;
do until (last.system);
set have;
by system notsorted;
if ind_x ne . and class = 'x' then ind_x = 1;
if class ne 'x' then ind_x = .;
end;
do until (last.system);
set have;
by system notsorted;
if ind_x then output;
end;
drop ind_x;
run;
A DATA step can process pre-grouped data and keep one row for each system having only class='x' cases
DATA want(keep=system);
do until (last.system);
set have;
by system notsorted;
if not not_x then not_x = class ne 'x';
end;
if not not_x;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.