BookmarkSubscribeRSS Feed
Babloo
Rhodochrosite | Level 12

I've a data as follows as I want to delete the last record of the reporting date 10/09/2018.

 

Input data:

 

Reporting_date Error_id
10/09/2018 12
10/09/2018 24
11/09/2018 11
12/09/2018 13
13/09/2018 16

 

Program:

 

 

proc sql;
create table test as select * from dex.FUNCTIONAL_ERRORS;
quit;

proc sort data=test;
by reporting_date error_id;
run;

data temp;
set test;
by reporting_date error_id;
if last.reporting_date=10/09/2018 then delete;
/*else new_reporting_date;*/
run;

Extract of log:

 

33         
34         data temp;
35         set test;
36         by reporting_date error_id;
37         if last.reporting_date=10/09/2018 then delete;
38         /*else new_reporting_date;*/
39         run;

NOTE: There were 5 observations read from the data set WORK.NEW_TEST.
NOTE: The data set WORK.TEMP has 5 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

May I know why the last record of 10/09/2018 is not getting deleted? Reporting _date is numeric variable with format ddmmyy10.

 

Desired Output:

 

Reporting_date Error_id
10/09/2018 12
11/09/2018 11
12/09/2018 13
13/09/2018 16

 

4 REPLIES 4
gamotte
Rhodochrosite | Level 12

Hello last.var is an automatic variable created by the "by var;" instruction.

last.var=1 if the observation is the last one for the current value of var, 0 otherwise.

 

It makes no sense to compare it to a date.

 

if last.reporting_date=10/09/2018

 

should be  (assuming reporting_date has a date format)

 

if last.reporting_date and reporting_date='10Sep2018'd

 

Babloo
Rhodochrosite | Level 12

It is not solving the issue.

 

6         data temp;
27         set test;
28         by reporting_date error_id;
29         if reporting_date and last.reporting_date='10sep2018'd then delete;
30         /*else new_reporting_date;*/
31         run;

NOTE: There were 7851 observations read from the data set WORK.TEST.
NOTE: The data set WORK.TEMP has 7851 observations and 12 variables.
gamotte
Rhodochrosite | Level 12

You are still comparing last.reporting_date, which is 0/1, to a date.

 

Try :

 

if last.reporting_date and reporting_date='10sep2018'd then delete;

 

Astounding
PROC Star

It seems you are describing only a small part of the problem.

 

  • What should happen if there are three observations for 10/09/2018 instead of two?
  • What should happen if some other date has three observations?

Given that you sort your data, here are a few ways to subset that you might want to use in the final DATA step:

 

if first.reporting_date;

Keeps just one observation per reporting date, whichever one is first.

 

if last.reporting_date and 0=first.reporting_date then delete;

Deletes the last observation per reporting date, as long as there is more than one.

 

The right solution depends on accurately describing the problem.  You may have done that, but I suspect you did it for just a small selection of the data.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 762 views
  • 0 likes
  • 3 in conversation