BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Xusheng
Obsidian | Level 7

Hi, I'd like to remove all the observations of one date whenever the If condition meets. For example, if 'OilCount >= 8' then all the observations of that date are removed (rather than the observations that meet 'OilCount >= 24' within that date.) I appreciate any advice. Thank you.

data have;
input OilCount date :ddmmyy10. ;
datalines;
0	03/09/2012
6	03/09/2012
0	04/09/2012
5	04/09/2012
1	04/09/2012
4	05/09/2012
5	06/09/2012
3	06/09/2012
6	06/09/2012
8	06/09/2012
11	07/09/2012
3	08/09/2012
2	08/09/2012
4	10/09/2012


run;
data want;
input OilCount date :ddmmyy10. ;
datalines;
0	03/09/2012
6	03/09/2012
0	04/09/2012
5	04/09/2012
1	04/09/2012
4	05/09/2012
3	08/09/2012
2	08/09/2012
4	10/09/2012

run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@Xusheng 

Not sure why below row is not in your want table. Is this a typo?

11	07/09/2012

 

Assuming your actual selection conditions could be of higher complexity than what you've posted:

- You could first create an intermediary table with rows meeting your condition.

- You then use this intermediary table as a lookup table to only select rows in your source data which don't have a match to the lookup table.

data have;
 input OilCount date ddmmyy10.;
 format date date9.;
 datalines;
0 03/09/2012
6 03/09/2012
0 04/09/2012
5 04/09/2012
1 04/09/2012
4 05/09/2012
5 06/09/2012
3 06/09/2012
6 06/09/2012
8 06/09/2012
11 07/09/2012
3 08/09/2012
2 08/09/2012
4 10/09/2012
;
run;

data DelObs/view=DelObs;
  set have;
  keep date;
  if OilCount>=8;
run;

data want;
  if _n_=1 then
    do;
      dcl hash h1(dataset:'DelObs');
      h1.defineKey('date');
      h1.defineDone();  
    end;

  set have;
  if h1.check() ne 0 then output;
run;

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

You could make a simpler/smaller sample for dumb me to understand your requirement. Also post an output sample i.e expected requirement for the smaller input sample you may while explaining the logic

Xusheng
Obsidian | Level 7
Ok, I will update with a smaller sample. Thank you. ##- Please type your
reply above this line. No attachments. -##
novinosrin
Tourmaline | Level 20

Thank you @Xusheng , kind of you. Just waking up. Will have a coffee and take a look shortly

Xusheng
Obsidian | Level 7
Thank you. Take your time##- Please type your reply above this line. No
attachments. -##
novinosrin
Tourmaline | Level 20

Hi @Xusheng   The condition isn't clear. 

condition:  , if 'OilCount >= 8' 

Please take a look at the below that has the cumulative count on the right. do you mean this?

Oil
Count date cumulative_count

0 03/09/2012 0
6 03/09/2012 6
0 04/09/2012 0
5 04/09/2012 5
1 04/09/2012 6
4 05/09/2012 4
5 06/09/2012 5
3 06/09/2012 8     Here, the cumulative oil count which is 5+3 satisfies >=8 
6 06/09/2012 14
8 06/09/2012 22
11 07/09/2012 11  and likewise the same here 11>=8 satisfies the condition
3 08/09/2012 3
2 08/09/2012 5
4 10/09/2012 4

 

Also, what do you mean by-

rather than the observations that meet 'OilCount >= 24' within that date?

Xusheng
Obsidian | Level 7

Hi Novinosrin, I have updated the question with smaller sample of 'have' and 'want'. Please kindly let me know whether this sample makes sense or not.

Patrick
Opal | Level 21

@Xusheng 

Not sure why below row is not in your want table. Is this a typo?

11	07/09/2012

 

Assuming your actual selection conditions could be of higher complexity than what you've posted:

- You could first create an intermediary table with rows meeting your condition.

- You then use this intermediary table as a lookup table to only select rows in your source data which don't have a match to the lookup table.

data have;
 input OilCount date ddmmyy10.;
 format date date9.;
 datalines;
0 03/09/2012
6 03/09/2012
0 04/09/2012
5 04/09/2012
1 04/09/2012
4 05/09/2012
5 06/09/2012
3 06/09/2012
6 06/09/2012
8 06/09/2012
11 07/09/2012
3 08/09/2012
2 08/09/2012
4 10/09/2012
;
run;

data DelObs/view=DelObs;
  set have;
  keep date;
  if OilCount>=8;
run;

data want;
  if _n_=1 then
    do;
      dcl hash h1(dataset:'DelObs');
      h1.defineKey('date');
      h1.defineDone();  
    end;

  set have;
  if h1.check() ne 0 then output;
run;

Astounding
PROC Star

What does it mean to say OilCount > 8 for a given date?  Is that OilCount on any observation for that date?  Is it the sum of all OilCounts for that date?  

 

Here is a framework you can use ... you can modify the conditions and add more if you would like.  Assuming you want to delete observations for the date when the total OilCount for that date is 8 or more:

 

data want;

total_oil_count = 0;

do until (last.date);

   set have;

   by date;

   total_oil_count + oil_count;

end;

do until (last.date);

   set have;

   by date;

   if total_oil_count < 8 then output;

end;

drop total_oil_count;

run;

 

Some of the tricky aspects:

 

  • Use an OUTPUT statement to output the observations you want (not a DELETE statement to remove observations).
  • The top loop reads all observations for a date and computes the conditions needed to determine output vs. delete.
  • The bottom loop then re-reads the exact same observations, and uses the results of the top loop to output just the desired observations.
  • If you have computed multiple conditions in the top loop, the bottom loop can still use the results ... but must still use an OUTPUT statement, not a DELETE statement.  Adjust the logic accordingly.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1408 views
  • 1 like
  • 4 in conversation