Comparing date within rows
I wish to check and compare dates within rows for a group of data. Also, it is possible to only have 1 row of record for 1 group.
data have;
POLICE='POL1';
END=0;
POLICE_EFFECTIVE_DATE='01Jan2024'd;
POLICE_END_DATE='31Dec2024'd;
output;
POLICE='POL1';
END=1;
POLICE_EFFECTIVE_DATE='01Jan2024'd;
POLICE_END_DATE='31Dec2024'd;
output;
POLICE='POL1';
END=2;
POLICE_EFFECTIVE_DATE='01Jan2024'd;
POLICE_END_DATE='31Jan2025'd;
output;
run;
POLICE='POL2';
END=0;
POLICE_EFFECTIVE_DATE='01Jan2024'd;
POLICE_END_DATE='31Dec2024'd;
output;
POLICE='POL3;
END=0;
POLICE_EFFECTIVE_DATE='01Jan2024'd;
POLICE_END_DATE='31Dec2024'd;
output;
POLICE='POL3;
END=1;
POLICE_EFFECTIVE_DATE='01Jan2024'd;
POLICE_END_DATE='31Dec2024'd;
output;
POLICE='POL3;
END=2;
POLICE_EFFECTIVE_DATE='01Feb2024'd;
POLICE_END_DATE='31Dec2025'd;
output;
It will look like this:
POLICE | END | POLICE_EFFECTIVE_DATE | POLICE_END_DATE |
POL1 | 0 | 1/1/2024 | 31/12/2024 |
POL1 | 1 | 1/1/2024 | 31/12/2024 |
POL1 | 2 | 1/1/2024 | 31/1/2025 |
POL2 | 0 | 1/1/2024 | 31/12/2024 |
POL3 | 0 | 1/1/2024 | 31/12/2024 |
POL3 | 1 | 1/1/2024 | 31/12/2024 |
POL3 | 2 | 1/2/2024 | 31/1/2025 |
What i want to achieve is this:
POLICE | END | POLICE_EFFECTIVE_DATE | POLICE_END_DATE | FLAG |
POL1 | 0 | 1/1/2024 | 31/12/2024 | |
POL1 | 1 | 1/1/2024 | 31/12/2024 | |
POL1 | 2 | 1/1/2024 | 31/1/2025 | EXTENDED |
POL2 | 0 | 1/1/2024 | 31/12/2024 | |
POL3 | 0 | 1/1/2024 | 31/12/2024 | |
POL3 | 1 | 1/1/2024 | 31/12/2024 | |
POL3 | 2 | 1/2/2024 | 31/1/2025 | SHIFTED |
From what i can tell, ""FLAG" can be defined by looking at the last.END POLICE_EFFECTIVE_DATE and POLICE_END_DATE.
If last.END POLICE_EFFECTIVE_DATE is the same as first.POLICE_EFFECTIVE_DATE
BUT
last.POLICE_END_DATE is greater than first.POLICE_END_DATE then last.flag = 'EXTENDED'.
If last.END POLICE_EFFECTIVE_DATE is greater than first.POLICE_EFFECTIVE_DATE
AND
last.POLICE_END_DATE is greater than first.POLICE_END_DATE then last.flag = 'SHIFTED'.
i can think of first sorting the dataset by POLICE and END.
Then, without a complex loop, is there a way for me to compare the date based on POLICE and END and check if it is EXTENDED OR SHIFTED or not.
From what I understand your pseudo code was already very close to actual SAS syntax. You just need to retain the values of the first row per group so you can compare it when you've reached the last row of the group.
data have;
input POLICE $ END POLICE_EFFECTIVE_DATE :date9. POLICE_END_DATE :date9.;
format POLICE_EFFECTIVE_DATE POLICE_END_DATE date9.;
datalines;
POL1 0 01JAN2024 31DEC2024
POL1 1 01JAN2024 31DEC2024
POL1 2 01JAN2024 31JAN2025
POL2 0 01JAN2024 31DEC2024
POL3 0 01JAN2024 31DEC2024
POL3 1 01JAN2024 31DEC2024
POL3 2 01FEB2024 31DEC2025
;
run;
/* proc sort data=have; */
/* by POLICE POLICE_EFFECTIVE_DATE POLICE_END_DATE; */
/* run; */
data want(drop=first_:);
set have;
by POLICE POLICE_EFFECTIVE_DATE POLICE_END_DATE;
retain first_POLICE_EFFECTIVE_DATE first_POLICE_END_DATE;
length type $8;
if first.POLICE then
do;
first_POLICE_EFFECTIVE_DATE =POLICE_EFFECTIVE_DATE;
first_POLICE_END_DATE =POLICE_END_DATE;
end;
if last.POLICE then
do;
if POLICE_EFFECTIVE_DATE = first_POLICE_EFFECTIVE_DATE
and
POLICE_END_DATE > first_POLICE_END_DATE
then type='EXTENDED';
else
If POLICE_EFFECTIVE_DATE > first_POLICE_EFFECTIVE_DATE
and
POLICE_END_DATE > first_POLICE_END_DATE
then type = 'SHIFTED';
end;
run;
proc print data=want;
run;
The data step creating have has errors: unclosed quotes and a run in the middle of the step. The format statement is missing, too.
Saving the first values of police_effective_date and police_end_date in retained variables and using them to set flag when the data step reached to last obs for police seem to create what you want:
data want;
set have;
by police;
length
flag $ 10
first_eff_date first_end_date 8
;
retain first_:;
if first.police then do;
first_eff_date = police_effective_date;
first_end_date = police_end_date;
end;
if last.police then do;
if police_end_date > first_end_date then do;
if police_effective_date = first_eff_date then flag = "EXTENDED";
if police_effective_date > first_eff_date then flag = "SHIFTED";
end;
end;
drop first_:;
run;
Assuming the data is sorted by police and end.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.