I want to group by name and ID, and want to delete the entire group of observations whenever both num_1 and num_2 = 1 or both = 0.
What I have:
name ID num_1 num_2
A 10 1 0
A 10 0 1
A 34 1 1
B 9 1 0
B 9 0 0
B 9 0 1
C 28 1 1
C 28 1 0
C 28 0 0
C 30 1 0
C 30 0 1
What I want:
name ID num_1 num_2
A 10 1 0
A 10 0 1
C 30 1 0
C 30 0 1
This is a good case for merging the dataset against a subset of itself. If the subset is defined by NUM_1=NUM_2, then the merge should be excluded. For any NAME/ID combination that has no such subset, keep all the observations:
data have;
input name : $1. id num_1 num_2 : ;
datalines;
A 10 1 0
A 10 0 1
A 34 1 1
B 9 1 0
B 9 0 0
B 9 0 1
C 28 1 1
C 28 1 0
C 28 0 0
C 30 1 0
C 30 0 1
;
data want;
merge have
have (in=exclude where=(num_1=num_2));
by name id;
if not exclude;
run;
Even though the subset ("where=(num_1=num_2)") is commonly smaller than the set of observations for a given ID/NAME, the MERGE statement will keep the EXCLUDE dummy=1 for the entire group. This provides a trivial indicator for excluding every observation in the group.
Hi,
I have assumed num_1 and num_2 only hold values of 1 or 0:
data have;
input
name : $1.
id : 8.
num_1 : 8.
num_2 : 8.
;
datalines;
A 10 1 0
A 10 0 1
A 34 1 1
B 9 1 0
B 9 0 0
B 9 0 1
C 28 1 1
C 28 1 0
C 28 0 0
C 30 1 0
C 30 0 1
;
data want;
do until(last.id);
set have;
by name id;
if num_1 eq num_2 then
remove = 1;
end;
do until(last.id);
set have;
by name id;
if not remove then
output;
end;
run;
Kind regards,
Amir.
This is a good case for merging the dataset against a subset of itself. If the subset is defined by NUM_1=NUM_2, then the merge should be excluded. For any NAME/ID combination that has no such subset, keep all the observations:
data have;
input name : $1. id num_1 num_2 : ;
datalines;
A 10 1 0
A 10 0 1
A 34 1 1
B 9 1 0
B 9 0 0
B 9 0 1
C 28 1 1
C 28 1 0
C 28 0 0
C 30 1 0
C 30 0 1
;
data want;
merge have
have (in=exclude where=(num_1=num_2));
by name id;
if not exclude;
run;
Even though the subset ("where=(num_1=num_2)") is commonly smaller than the set of observations for a given ID/NAME, the MERGE statement will keep the EXCLUDE dummy=1 for the entire group. This provides a trivial indicator for excluding every observation in the group.
data have;
input name : $1. id num_1 num_2 : ;
datalines;
A 10 1 0
A 10 0 1
A 34 1 1
B 9 1 0
B 9 0 0
B 9 0 1
C 28 1 1
C 28 1 0
C 28 0 0
C 30 1 0
C 30 0 1
;
proc sql;
create table want as
select * from have group by name,id
having not( sum(num_1=0 and num_2=0) or sum(num_1=1 and num_2=1) ) ;
quit;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.