BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jwhite
Quartz | Level 8

I need to do some clean-up on a few dim tables in our warehouse. It looks like our two columns that have the 'valid from date' and 'valide to date', which allow us to find the current record in a set of historical records, were not loaded correctly some time ago.

 

For example, we may have a record like follows:

 

ID      Valid from          Valid to

1        01JUN2007        01JAN5999

1        01JAN2004        01JAN5999

 

or

 

4        01JUN2012        01JAN5999

4        16AUG2014       01JAN5999

4        01JAN2010        30JUL2012

4        01DEC2013       0!JAN5999

 

Clearly in each of these cases, only one record should have a 'Valid To' value of 01JAN5999....know we've got to fix it. I hope I made it clear in the example that it varies the number of records an ID may have.

 

I was thinking if I could sort the the columns and retain the 'Valid From' record of the previous row, subtract 1 from the date, then I could have the correct 'Valid To' date.

 

Can this be done? Am I on the right path, or is there another approach?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
LinusH
Tourmaline | Level 20
Resort your data by Id descending valid_from_dt.
Use retain for bot from ant to dt.
And then assign a new to dt with "previous" from dt-1 when appropriate.
Another crazy idea is the set all high dates to missing and then process the data through the SCD Type 2 Loader.
Data never sleeps

View solution in original post

7 REPLIES 7
LinusH
Tourmaline | Level 20

I think it sounds like a feasible approach.

Unless your data is huge. Then you might want o filter out the rows with high date, and possible whole id groups, fix the necessary records separately, and do a table update/modify by.

Just be aware that processing your data this way, indexes & constraints will vanish, so you need to apply them to your fixed data.

Data never sleeps
jwhite
Quartz | Level 8

Thanks, but how do I retain the value from one row to then do a -1 day expression on the next to get the new close date?

Patrick
Opal | Level 21

Assuming the valid from dates are correct and you don't have the same value more than once per ID then below code gives you the logic for re-alignment.

data have;
  infile datalines truncover;
  input id (Valid_from_dttm Valid_to_dttm) (:date9.);
  format Valid_from_dttm Valid_to_dttm date9.;
  datalines;
1 01JUN2007 01JAN5999
1 01JAN2004 01JAN5999
4 01JUN2012 01JAN5999
4 16AUG2014 01JAN5999
4 01JAN2010 30JUL2012
4 01DEC2013 01JAN5999
;
run;

proc sort data=have out=have_sorted;
  by id Valid_from_dttm;
run;

data updates;
  set have_sorted;
  by id;
  format Valid_to_dttm_REALIGNED date9.;
  if not last.id then
    do;
      _ind=_n_+1;
      set want 
        ( 
        keep=Valid_from_dttm 
        rename=(Valid_from_dttm=_Valid_from_dttm)
        )
        point=_ind
        ;

      if Valid_to_dttm>_Valid_from_dttm then 
        do;
          Valid_to_dttm_REALIGNED=_Valid_from_dttm-1;
          output;
        end;
    end;
run;

If your data is stored in a data base and you'r dealing with large data volumes then I would implement the logic directly in-database (eg. using analytics functions).

 

What you can't re-construct are cases where a record had been expired but then a month later got re-activated (if that's possible with your data). So something like below couldn't get reconstructed:

1 01JUN2004 01DEC2006
1 01JAN2007 01JAN5999

 

LinusH
Tourmaline | Level 20
Resort your data by Id descending valid_from_dt.
Use retain for bot from ant to dt.
And then assign a new to dt with "previous" from dt-1 when appropriate.
Another crazy idea is the set all high dates to missing and then process the data through the SCD Type 2 Loader.
Data never sleeps
jwhite
Quartz | Level 8

Thanks @Patrick and @LinusH!

 

Actually blanking out all the values in that value_to_date column, then running through the SCD Type 2 loader did the trick.

 

Cheers!

Patrick
Opal | Level 21

Sounds like a "hack". So using this approach you didn't create new records? And it really re-aligned all values also historical ones?

jwhite
Quartz | Level 8

It did not create any new records, but did realign the open and close dates for a given ID.

 

We also had a few cases where there would only be a singular record (ID) but was closed. This should be open so that each ID has an open record. It opened those cases with the '01JAN5999' date.

 

I feared it may create new records as well, but was pleased to see it didn't.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 2188 views
  • 1 like
  • 3 in conversation