How to delete a group of observations based on below conditions
/*if birthnum 2 and 3 or 2 and 6*/
/*How to delete a group of observations based on below conditions*/ /*if birthnum 2 and 3 or 2 and 6*/ data have; input id case_id birthnum $; datalines; 1 695 1 1 698 2 1 699 3 2 695 B 2 698 2 2 699 5 3 91 B 3 698 2 3 695 B 3 697 6 4 695 2 4 698 5 4 699 B ; data want; if _n_=1 then do; dcl hash h1(dataset:'have(where= (birthnum in ("2" "3" "6")))'); h1.defineKey('id'); h1.defineDone(); end; set have; if h1.check()=0 then delete; run; proc print data=want; run; The output I want: However I am not getting this. It's zero Records am getting. 2 695 B 2 698 2 2 699 5 4 695 2 4 698 5 4 699 B
Given the data are already sorted by ID, this is a single data step task. Merge the subset with birthnum='2' with the subset having birtnum='3' or '6' with the entire set. Keep those observations that don't have at least one member in both of the subsets:
data have;
input id case_id birthnum $;
datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91 B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
merge have (where=(birthnum='2') in=in2)
have (where=(birthnum='3' or birthnum='6') in=in3or6)
have ;
by id;
if not (in2=1 and in3or6=1);
run;
have you tested your code? I reran it making check=h1.check(). check=0 for all of them. that's why you are not returning any observations.
If I am writing check=h1.check() . It's creating a new variable check having missing value. I am not getting my expected output.
I suggested doing that so you could see the values are all 0 and that's why they got deleted. you should always open up datasets and see if you got desired results. it's a good way to check your work.
I think you are going to need to explain this more clearly
/*if birthnum 2 and 3 or 2 and 6*/
How can a single observation (row) have a variable (birthnum) with values 2 and 3
if you look at the data ID 1 and 3 both have birthnum=2 and 3 or birthnum=2 and 6. I think the OP wishes to delete ones that ever have birthnum=2 and birthnum=3 or birthnum=2 and birthnum=6. that's the way I interpreted it from OP's post and desired output.
Given the data are already sorted by ID, this is a single data step task. Merge the subset with birthnum='2' with the subset having birtnum='3' or '6' with the entire set. Keep those observations that don't have at least one member in both of the subsets:
data have;
input id case_id birthnum $;
datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91 B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
merge have (where=(birthnum='2') in=in2)
have (where=(birthnum='3' or birthnum='6') in=in3or6)
have ;
by id;
if not (in2=1 and in3or6=1);
run;
This is an elegant solution with not too many lines of code!
data have; input id case_id birthnum $; datalines; 1 695 1 1 698 2 1 699 3 2 695 B 2 698 2 2 699 5 3 91 B 3 698 2 3 695 B 3 697 6 4 695 2 4 698 5 4 699 B ; proc sql; create table want as select * from have group by id having not (sum(birthnum='2') and sum(birthnum in ('3' '6'))); quit;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.