I need to do some clean-up on a few dim tables in our warehouse. It looks like our two columns that have the 'valid from date' and 'valide to date', which allow us to find the current record in a set of historical records, were not loaded correctly some time ago.
For example, we may have a record like follows:
ID Valid from Valid to
1 01JUN2007 01JAN5999
1 01JAN2004 01JAN5999
or
4 01JUN2012 01JAN5999
4 16AUG2014 01JAN5999
4 01JAN2010 30JUL2012
4 01DEC2013 0!JAN5999
Clearly in each of these cases, only one record should have a 'Valid To' value of 01JAN5999....know we've got to fix it. I hope I made it clear in the example that it varies the number of records an ID may have.
I was thinking if I could sort the the columns and retain the 'Valid From' record of the previous row, subtract 1 from the date, then I could have the correct 'Valid To' date.
Can this be done? Am I on the right path, or is there another approach?
Thanks!
I think it sounds like a feasible approach.
Unless your data is huge. Then you might want o filter out the rows with high date, and possible whole id groups, fix the necessary records separately, and do a table update/modify by.
Just be aware that processing your data this way, indexes & constraints will vanish, so you need to apply them to your fixed data.
Thanks, but how do I retain the value from one row to then do a -1 day expression on the next to get the new close date?
Assuming the valid from dates are correct and you don't have the same value more than once per ID then below code gives you the logic for re-alignment.
data have;
infile datalines truncover;
input id (Valid_from_dttm Valid_to_dttm) (:date9.);
format Valid_from_dttm Valid_to_dttm date9.;
datalines;
1 01JUN2007 01JAN5999
1 01JAN2004 01JAN5999
4 01JUN2012 01JAN5999
4 16AUG2014 01JAN5999
4 01JAN2010 30JUL2012
4 01DEC2013 01JAN5999
;
run;
proc sort data=have out=have_sorted;
by id Valid_from_dttm;
run;
data updates;
set have_sorted;
by id;
format Valid_to_dttm_REALIGNED date9.;
if not last.id then
do;
_ind=_n_+1;
set want
(
keep=Valid_from_dttm
rename=(Valid_from_dttm=_Valid_from_dttm)
)
point=_ind
;
if Valid_to_dttm>_Valid_from_dttm then
do;
Valid_to_dttm_REALIGNED=_Valid_from_dttm-1;
output;
end;
end;
run;
If your data is stored in a data base and you'r dealing with large data volumes then I would implement the logic directly in-database (eg. using analytics functions).
What you can't re-construct are cases where a record had been expired but then a month later got re-activated (if that's possible with your data). So something like below couldn't get reconstructed:
1 01JUN2004 01DEC2006
1 01JAN2007 01JAN5999
Sounds like a "hack". So using this approach you didn't create new records? And it really re-aligned all values also historical ones?
It did not create any new records, but did realign the open and close dates for a given ID.
We also had a few cases where there would only be a singular record (ID) but was closed. This should be open so that each ID has an open record. It opened those cases with the '01JAN5999' date.
I feared it may create new records as well, but was pleased to see it didn't.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.