BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Spintu
Quartz | Level 8

How to delete a group of observations based on below conditions
/*if birthnum 2 and 3 or 2 and 6*/

/*How to delete a group of observations based on below conditions*/
/*if birthnum 2 and 3 or 2 and 6*/
 
data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;

data want;
  if _n_=1 then
    do;
      dcl hash h1(dataset:'have(where= (birthnum in ("2" "3" "6")))');
      h1.defineKey('id');
      h1.defineDone();
    end;
  set have;
  if h1.check()=0 then delete;
run;

proc print data=want;
run;


The output I want: However I am not getting this. It's zero Records am getting. 
2 695 B
2 698 2
2 699 5
4 695 2
4 698 5
4 699 B
1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Given the data are already sorted by ID, this is a single data step task.  Merge the subset with birthnum='2' with the subset having birtnum='3' or '6' with the entire set.  Keep those observations that don't have at least one member in both of the subsets:

 

data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
  merge have (where=(birthnum='2') in=in2)
        have (where=(birthnum='3' or birthnum='6') in=in3or6)
        have ;
  by id;
  if not (in2=1 and in3or6=1);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

9 REPLIES 9
tarheel13
Rhodochrosite | Level 12

have you tested your code? I reran it making check=h1.check(). check=0 for all of them. that's why you are not returning any observations. 

Spintu
Quartz | Level 8

If I am writing check=h1.check() . It's creating a new variable check having missing value. I am not getting my expected output.

tarheel13
Rhodochrosite | Level 12

I suggested doing that so you could see the values are all 0 and that's why they got deleted. you should always open up datasets and see if you got desired results. it's a good way to check your work. 

AMSAS
SAS Super FREQ

I think you are going to need to explain this more clearly 

 

/*if birthnum 2 and 3 or 2 and 6*/


How can a single observation (row) have a variable (birthnum) with values 2 and 3 


 

tarheel13
Rhodochrosite | Level 12

if you look at the data ID 1 and 3 both have birthnum=2 and 3 or birthnum=2 and 6. I think the OP wishes to delete ones that ever have birthnum=2 and birthnum=3 or birthnum=2 and birthnum=6. that's the way I interpreted it from OP's post and desired output.

mkeintz
PROC Star

Given the data are already sorted by ID, this is a single data step task.  Merge the subset with birthnum='2' with the subset having birtnum='3' or '6' with the entire set.  Keep those observations that don't have at least one member in both of the subsets:

 

data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
data want;
  merge have (where=(birthnum='2') in=in2)
        have (where=(birthnum='3' or birthnum='6') in=in3or6)
        have ;
  by id;
  if not (in2=1 and in3or6=1);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tarheel13
Rhodochrosite | Level 12

This is an elegant solution with not too many lines of code!

Ksharp
Super User
 
data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;


proc sql;
create table want as
select * from have
 group by id
  having not (sum(birthnum='2') and sum(birthnum in ('3' '6')));
quit;
Spintu
Quartz | Level 8
Thank you! Good to know that.

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2156 views
  • 4 likes
  • 5 in conversation