BookmarkSubscribeRSS Feed
Babloo
Rhodochrosite | Level 12

I've a data as follows as I want to delete the last record of the reporting date 10/09/2018.

 

Input data:

 

Reporting_date Error_id
10/09/2018 12
10/09/2018 24
11/09/2018 11
12/09/2018 13
13/09/2018 16

 

Program:

 

 

proc sql;
create table test as select * from dex.FUNCTIONAL_ERRORS;
quit;

proc sort data=test;
by reporting_date error_id;
run;

data temp;
set test;
by reporting_date error_id;
if last.reporting_date=10/09/2018 then delete;
/*else new_reporting_date;*/
run;

Extract of log:

 

33         
34         data temp;
35         set test;
36         by reporting_date error_id;
37         if last.reporting_date=10/09/2018 then delete;
38         /*else new_reporting_date;*/
39         run;

NOTE: There were 5 observations read from the data set WORK.NEW_TEST.
NOTE: The data set WORK.TEMP has 5 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

May I know why the last record of 10/09/2018 is not getting deleted? Reporting _date is numeric variable with format ddmmyy10.

 

Desired Output:

 

Reporting_date Error_id
10/09/2018 12
11/09/2018 11
12/09/2018 13
13/09/2018 16

 

4 REPLIES 4
gamotte
Rhodochrosite | Level 12

Hello last.var is an automatic variable created by the "by var;" instruction.

last.var=1 if the observation is the last one for the current value of var, 0 otherwise.

 

It makes no sense to compare it to a date.

 

if last.reporting_date=10/09/2018

 

should be  (assuming reporting_date has a date format)

 

if last.reporting_date and reporting_date='10Sep2018'd

 

Babloo
Rhodochrosite | Level 12

It is not solving the issue.

 

6         data temp;
27         set test;
28         by reporting_date error_id;
29         if reporting_date and last.reporting_date='10sep2018'd then delete;
30         /*else new_reporting_date;*/
31         run;

NOTE: There were 7851 observations read from the data set WORK.TEST.
NOTE: The data set WORK.TEMP has 7851 observations and 12 variables.
gamotte
Rhodochrosite | Level 12

You are still comparing last.reporting_date, which is 0/1, to a date.

 

Try :

 

if last.reporting_date and reporting_date='10sep2018'd then delete;

 

Astounding
PROC Star

It seems you are describing only a small part of the problem.

 

  • What should happen if there are three observations for 10/09/2018 instead of two?
  • What should happen if some other date has three observations?

Given that you sort your data, here are a few ways to subset that you might want to use in the final DATA step:

 

if first.reporting_date;

Keeps just one observation per reporting date, whichever one is first.

 

if last.reporting_date and 0=first.reporting_date then delete;

Deletes the last observation per reporting date, as long as there is more than one.

 

The right solution depends on accurately describing the problem.  You may have done that, but I suspect you did it for just a small selection of the data.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1052 views
  • 0 likes
  • 3 in conversation