I've a data as follows as I want to delete the last record of the reporting date 10/09/2018.
Input data:
Reporting_date | Error_id |
10/09/2018 | 12 |
10/09/2018 | 24 |
11/09/2018 | 11 |
12/09/2018 | 13 |
13/09/2018 | 16 |
Program:
proc sql;
create table test as select * from dex.FUNCTIONAL_ERRORS;
quit;
proc sort data=test;
by reporting_date error_id;
run;
data temp;
set test;
by reporting_date error_id;
if last.reporting_date=10/09/2018 then delete;
/*else new_reporting_date;*/
run;
Extract of log:
33
34 data temp;
35 set test;
36 by reporting_date error_id;
37 if last.reporting_date=10/09/2018 then delete;
38 /*else new_reporting_date;*/
39 run;
NOTE: There were 5 observations read from the data set WORK.NEW_TEST.
NOTE: The data set WORK.TEMP has 5 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
May I know why the last record of 10/09/2018 is not getting deleted? Reporting _date is numeric variable with format ddmmyy10.
Desired Output:
Reporting_date | Error_id |
10/09/2018 | 12 |
11/09/2018 | 11 |
12/09/2018 | 13 |
13/09/2018 | 16 |
Hello last.var is an automatic variable created by the "by var;" instruction.
last.var=1 if the observation is the last one for the current value of var, 0 otherwise.
It makes no sense to compare it to a date.
if last.reporting_date=10/09/2018
should be (assuming reporting_date has a date format)
if last.reporting_date and reporting_date='10Sep2018'd
It is not solving the issue.
6 data temp;
27 set test;
28 by reporting_date error_id;
29 if reporting_date and last.reporting_date='10sep2018'd then delete;
30 /*else new_reporting_date;*/
31 run;
NOTE: There were 7851 observations read from the data set WORK.TEST.
NOTE: The data set WORK.TEMP has 7851 observations and 12 variables.
You are still comparing last.reporting_date, which is 0/1, to a date.
Try :
if last.reporting_date and reporting_date='10sep2018'd then delete;
It seems you are describing only a small part of the problem.
Given that you sort your data, here are a few ways to subset that you might want to use in the final DATA step:
if first.reporting_date;
Keeps just one observation per reporting date, whichever one is first.
if last.reporting_date and 0=first.reporting_date then delete;
Deletes the last observation per reporting date, as long as there is more than one.
The right solution depends on accurately describing the problem. You may have done that, but I suspect you did it for just a small selection of the data.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.