How to delete a group of observations based on below conditions
/*if birthnum 2 and 3 or 2 and 6*/
/*How to delete a group of observations based on below conditions*/
/*if birthnum 2 and 3 or 2 and 6*/
data have;
input id case_id birthnum $;
datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91 B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
if _n_=1 then
do;
dcl hash h1(dataset:'have(where= (birthnum in ("2" "3" "6")))');
h1.defineKey('id');
h1.defineDone();
end;
set have;
if h1.check()=0 then delete;
run;
proc print data=want;
run;
The output I want: However I am not getting this. It's zero Records am getting.
2 695 B
2 698 2
2 699 5
4 695 2
4 698 5
4 699 B
Given the data are already sorted by ID, this is a single data step task. Merge the subset with birthnum='2' with the subset having birtnum='3' or '6' with the entire set. Keep those observations that don't have at least one member in both of the subsets:
data have;
input id case_id birthnum $;
datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91 B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
merge have (where=(birthnum='2') in=in2)
have (where=(birthnum='3' or birthnum='6') in=in3or6)
have ;
by id;
if not (in2=1 and in3or6=1);
run;
have you tested your code? I reran it making check=h1.check(). check=0 for all of them. that's why you are not returning any observations.
If I am writing check=h1.check() . It's creating a new variable check having missing value. I am not getting my expected output.
I suggested doing that so you could see the values are all 0 and that's why they got deleted. you should always open up datasets and see if you got desired results. it's a good way to check your work.
I think you are going to need to explain this more clearly
/*if birthnum 2 and 3 or 2 and 6*/
How can a single observation (row) have a variable (birthnum) with values 2 and 3
if you look at the data ID 1 and 3 both have birthnum=2 and 3 or birthnum=2 and 6. I think the OP wishes to delete ones that ever have birthnum=2 and birthnum=3 or birthnum=2 and birthnum=6. that's the way I interpreted it from OP's post and desired output.
Given the data are already sorted by ID, this is a single data step task. Merge the subset with birthnum='2' with the subset having birtnum='3' or '6' with the entire set. Keep those observations that don't have at least one member in both of the subsets:
data have;
input id case_id birthnum $;
datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91 B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
merge have (where=(birthnum='2') in=in2)
have (where=(birthnum='3' or birthnum='6') in=in3or6)
have ;
by id;
if not (in2=1 and in3or6=1);
run;
This is an elegant solution with not too many lines of code!
data have;
input id case_id birthnum $;
datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91 B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
proc sql;
create table want as
select * from have
group by id
having not (sum(birthnum='2') and sum(birthnum in ('3' '6')));
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.