SAS Programming

Spintu · Posted 03-31-2022 10:21 AM

How to delete a group of observations based on below conditions

/*How to delete a group of observations based on below conditions*/
/*if birthnum 2 and 3 or 2 and 6*/
 
data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;

data want;
  if _n_=1 then
    do;
      dcl hash h1(dataset:'have(where= (birthnum in ("2" "3" "6")))');
      h1.defineKey('id');
      h1.defineDone();
    end;
  set have;
  if h1.check()=0 then delete;
run;

proc print data=want;
run;


The output I want is below. However, I am not getting it; I get zero records instead.
2 695 B
2 698 2
2 699 5
4 695 2
4 698 5
4 699 B

tarheel13 · Posted 03-31-2022 10:27 AM

Have you tested your code? I reran it with check=h1.check() added, and check=0 for all of them. Every id has a birthnum='2' row, so every id gets loaded into the hash. That's why you are not returning any observations.

Spintu · Posted 03-31-2022 10:49 AM

If I write check=h1.check(), it creates a new variable check with a missing value. I am not getting my expected output.

tarheel13 · Posted 03-31-2022 10:51 AM

I suggested that so you could see the values are all 0, which is why every observation got deleted. You should always open your data sets and see whether you got the desired results. It's a good way to check your work.
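
For anyone following along, here is a minimal sketch of that check, assuming the have data set from the original post (repeated so the example runs on its own). The output name check_have is made up for illustration. The assignment sits in the same DATA step that declares h1, and there is no DELETE statement, so every row is kept and the return code can be inspected.

```sas
data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;

/* check_have is a hypothetical name; the hash is the one from the original post */
data check_have;
  if _n_=1 then
    do;
      dcl hash h1(dataset:'have(where= (birthnum in ("2" "3" "6")))');
      h1.defineKey('id');
      h1.defineDone();
    end;
  set have;
  /* CHECK returns 0 when the current id is found in the hash */
  check = h1.check();
run;

proc print data=check_have;
run;
```

Because ids 1 through 4 each have a row with birthnum='2', all four ids are in the hash, so check should be 0 on every row.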

AMSAS · Posted 03-31-2022 10:27 AM

I think you are going to need to explain this more clearly

/*if birthnum 2 and 3 or 2 and 6*/

How can a single observation (row) have a variable (birthnum) with both values 2 and 3?

tarheel13 · Posted 03-31-2022 10:32 AM

If you look at the data, ID 1 has birthnum=2 and birthnum=3, and ID 3 has birthnum=2 and birthnum=6. I think the OP wishes to delete every ID that ever has both birthnum=2 and birthnum=3, or both birthnum=2 and birthnum=6. That's how I interpreted it from the OP's post and desired output.
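
Under that reading, one way the original hash approach might be adapted is to load two lookups instead of one: the ids that have a '2' row, and the ids that have a '3' or '6' row. This is only a sketch; the names h2, h36, and want_hash are made up for illustration, and the have data set is repeated from the original post so the step runs on its own.

```sas
data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;

data want_hash;
  if _n_=1 then
    do;
      /* h2 (hypothetical name): ids with at least one birthnum='2' row */
      dcl hash h2(dataset:'have(where=(birthnum="2"))');
      h2.defineKey('id');
      h2.defineDone();
      /* h36 (hypothetical name): ids with at least one '3' or '6' row */
      dcl hash h36(dataset:'have(where=(birthnum in ("3" "6")))');
      h36.defineKey('id');
      h36.defineDone();
    end;
  set have;
  /* drop the row only when its id is found in both lookups */
  if h2.check()=0 and h36.check()=0 then delete;
run;

proc print data=want_hash;
run;
```

With the sample data, ids 1 and 3 are found in both lookups and are dropped, which leaves the six rows for ids 2 and 4. Unlike a BY-group approach, this does not rely on have being sorted by id.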

mkeintz · Posted 03-31-2022 10:32 AM

Given the data are already sorted by ID, this is a single DATA step task. Merge the subset with birthnum='2', the subset with birthnum='3' or '6', and the entire data set. Keep those observations whose ID does not have at least one member in both of the subsets:

data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;
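data want;
  /* in2: this id has a birthnum='2' row; in3or6: it has a '3' or '6' row */
  merge have (where=(birthnum='2') in=in2)
        have (where=(birthnum='3' or birthnum='6') in=in3or6)
        have ;
  by id;
  /* keep the id unless it appears in both subsets */
  if not (in2=1 and in3or6=1);
run;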

tarheel13 · Posted 03-31-2022 10:47 AM

This is an elegant solution with not too many lines of code!

Ksharp · Posted 04-01-2022 09:28 AM

 
data have;
  input id case_id birthnum $;
  datalines;
1 695 1
1 698 2
1 699 3
2 695 B
2 698 2
2 699 5
3 91  B
3 698 2
3 695 B
3 697 6
4 695 2
4 698 5
4 699 B
;


proc sql;
/* sum() of a true/false comparison counts the matching rows within each id */
create table want as
select * from have
 group by id
  having not (sum(birthnum='2') and sum(birthnum in ('3' '6')));
quit;

Spintu · Posted 04-01-2022 01:15 PM

Thank you! Good to know that.
