- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How to delete a group of observations based on below conditions
/*if birthnum 2 and 3 or 2 and 6*/
/*How to delete a group of observations based on below conditions*/ /*if birthnum 2 and 3 or 2 and 6*/ data have; input id case_id birthnum $; datalines; 1 695 1 1 698 2 1 699 3 2 695 B 2 698 2 2 699 5 3 91 B 3 698 2 3 695 B 3 697 6 4 695 2 4 698 5 4 699 B ; data want; if _n_=1 then do; dcl hash h1(dataset:'have(where= (birthnum in ("2" "3" "6")))'); h1.defineKey('id'); h1.defineDone(); end; set have; if h1.check()=0 then delete; run; proc print data=want; run; The output I want: However I am not getting this. It's zero Records am getting. 2 695 B 2 698 2 2 699 5 4 695 2 4 698 5 4 699 B
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Given the data are already sorted by ID, this is a single data step task. Merge the subset with birthnum='2' with the subset having birtnum='3' or '6' with the entire set. Keep those observations that don't have at least one member in both of the subsets:
data have;
input id case_id birthnum $;
datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91 B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
merge have (where=(birthnum='2') in=in2)
have (where=(birthnum='3' or birthnum='6') in=in3or6)
have ;
by id;
if not (in2=1 and in3or6=1);
run;
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
have you tested your code? I reran it making check=h1.check(). check=0 for all of them. that's why you are not returning any observations.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If I am writing check=h1.check() . It's creating a new variable check having missing value. I am not getting my expected output.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I suggested doing that so you could see the values are all 0 and that's why they got deleted. you should always open up datasets and see if you got desired results. it's a good way to check your work.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think you are going to need to explain this more clearly
/*if birthnum 2 and 3 or 2 and 6*/
How can a single observation (row) have a variable (birthnum) with values 2 and 3
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
if you look at the data ID 1 and 3 both have birthnum=2 and 3 or birthnum=2 and 6. I think the OP wishes to delete ones that ever have birthnum=2 and birthnum=3 or birthnum=2 and birthnum=6. that's the way I interpreted it from OP's post and desired output.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Given the data are already sorted by ID, this is a single data step task. Merge the subset with birthnum='2' with the subset having birtnum='3' or '6' with the entire set. Keep those observations that don't have at least one member in both of the subsets:
data have;
input id case_id birthnum $;
datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91 B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
merge have (where=(birthnum='2') in=in2)
have (where=(birthnum='3' or birthnum='6') in=in3or6)
have ;
by id;
if not (in2=1 and in3or6=1);
run;
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This is an elegant solution with not too many lines of code!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have; input id case_id birthnum $; datalines; 1 695 1 1 698 2 1 699 3 2 695 B 2 698 2 2 699 5 3 91 B 3 698 2 3 695 B 3 697 6 4 695 2 4 698 5 4 699 B ; proc sql; create table want as select * from have group by id having not (sum(birthnum='2') and sum(birthnum in ('3' '6'))); quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content