BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Nasser_DRMCP
Lapis Lazuli | Level 10
Data have_contract ;
input contract_number contract_version version_begin_date ;
informat version_begin_date ddmmyy10. ;
format version_begin_date ddmmyy10. ;
datalines ;
123456789 1 01/09/2021
123456789 2 02/10/2021
123456789 3 03/11/2021
run ;

data have_event ;
input contract_number event_date ;
informat event_date ddmmyy10. ;
format event_date ddmmyy10. ;
datalines;
123456789 01/09/2021
123456789 08/09/2021
123456789 03/10/2021
123456789 30/10/2021
123456789 03/11/2021
123456789 13/11/2021
123456789 23/11/2021
run ;


Data want ;
input contract_number contract_version version_begin_date event_date ;
informat version_begin_date event_date ddmmyy10. ;
format version_begin_date event_date ddmmyy10. ;
datalines ;
123456789 1 01/09/2021 01/09/2021
123456789 1 01/09/2021 08/09/2021
123456789 2 02/10/2021 03/10/2021
123456789 2 02/10/2021 30/10/2021
123456789 3 03/11/2021 03/11/2021
123456789 3 03/11/2021 13/11/2021
123456789 3 03/11/2021 23/11/2021
run ;

Hello,

I would like to join this 2 tables "have".

But I only want to join the events wich have an event date greater or equal the begin_date of the version and less than the the begin date of the next version.

thanks in advance for your help

kind regards

Nasser

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

This is a case of "bringing history forward".  I.e. you want to bring forward ("retain" in SAS language) the contract record through all subsequent event records, until the next contract record.  And you want one record per event.

 

For sorted data it's easy:

 

data want;
  set have_contract (in=inc keep=contract_number version_begin_date rename=(version_begin_date=event_date))
      have_event    (in=ine); 
  by contract_number event_date;
  if inc then set have_contract;
  if ine;
run;

Note this program depends a little on the data being relatively "clean".  That is, it assumes that no contract has an (erroneous) event date that precedes the first contract date.  If such a condition exists, then that event would inherit data from the preceding contract.

 

The "trick" here, is that all the variables that are only read in by the "if inc then set have_contract" are retained through all subsequent event records until the next have_contract date.   This in turn is why the first SET statement only keeps the two BY-variables from have_contract, thereby avoiding overwriting all the other have_contract variables.

 

Editted note: Yes, this can also be done via PROC SQL, but would likely require a good deal more code to satisfy an atavistic impulse.  And this technique is easily expanded to apply to more than two input datasets.  See History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies.  

 

 

 

@Nasser_DRMCP asked:

     is it possible to get all contracts even thoses without events ?

Yes.  Drop the 

if ine:

statement.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

4 REPLIES 4
Nasser_DRMCP
Lapis Lazuli | Level 10

hello

I succeeded to get the result I expected . first I sort my data by descending date .and then with LAG I could determine the version_begin_date_next.

and then with the filter below

and event_date>= a.version_begin_date
and b.event_date < a.version_bvegin_date_next

 

Nasser

mkeintz
PROC Star

This is a case of "bringing history forward".  I.e. you want to bring forward ("retain" in SAS language) the contract record through all subsequent event records, until the next contract record.  And you want one record per event.

 

For sorted data it's easy:

 

data want;
  set have_contract (in=inc keep=contract_number version_begin_date rename=(version_begin_date=event_date))
      have_event    (in=ine); 
  by contract_number event_date;
  if inc then set have_contract;
  if ine;
run;

Note this program depends a little on the data being relatively "clean".  That is, it assumes that no contract has an (erroneous) event date that precedes the first contract date.  If such a condition exists, then that event would inherit data from the preceding contract.

 

The "trick" here, is that all the variables that are only read in by the "if inc then set have_contract" are retained through all subsequent event records until the next have_contract date.   This in turn is why the first SET statement only keeps the two BY-variables from have_contract, thereby avoiding overwriting all the other have_contract variables.

 

Editted note: Yes, this can also be done via PROC SQL, but would likely require a good deal more code to satisfy an atavistic impulse.  And this technique is easily expanded to apply to more than two input datasets.  See History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies.  

 

 

 

@Nasser_DRMCP asked:

     is it possible to get all contracts even thoses without events ?

Yes.  Drop the 

if ine:

statement.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Nasser_DRMCP
Lapis Lazuli | Level 10

Hi mkeintz

many thanks for your help. I have an additional question.

I notice that if a contratc exists in the in have_contract table but not in the have_event table then this contract does not exist in the "want" table.

is it possible to get all contracts even thoses without events ?

thanks in advance

Nasser

Tommer
Obsidian | Level 7

/*First sort the have_contract and create a lag variable:*/

 

proc sort data=have_Contract; by descending version_begin_date; run;

data have_contract2;
set have_contract;
version_lag = lag(version_begin_date);
format version_lag ddmmyy10.;
run;

 

/*Then create a Cartesian product followed by filtering for what you need:*/


proc sql;
create table all1 as select a.*, b.event_date from have_contract2 a, have_event b where a.contract_number=b.contract_number
order by version_begin_date;
quit;

 

data all2;
set all1;
if version_lag ne . and version_begin_date=<event_date<version_lag then inc = 1;
else if version_lag eq . and version_begin_date=<event_date then inc =1;
if inc=1;
drop version_lag inc;
run;
proc sort data=all2; by contract_number contract_version version_begin_date event_date;

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 833 views
  • 2 likes
  • 3 in conversation