BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PatrykSAS
Obsidian | Level 7

Hi. I would like to calculate how many days on average pass between contact with the client in terms of snapshots. I will use the following example tables (The DIFF_DAYS column means how many days have passed from the previous date in the given id)

 

Table 1

ID      DATE                 DIFF_DAYS

1       2019-05-17         .

1       2019-10-06        142

1       2020-01-15        101

1       2020-02-23         39

2       2019-05-07         .

2       2019-08-20        105

2       2020-03-15        208

2       2020-04-10        26

 

And snapshot table (snapshots for each ID can be different)

Table 2

ID  SNAPSHOT_DATE

1    2019-12-31

1    2020-06-30

2    2020-03-31

2    2020-06-30

 

All I want to achieve is a table simiar to this (AVG_DIFF_DAYS is always the average from the first date to the last included in the snapshot date)

ID  SNAPSHOT_DATE    AVG_DIFF_DAYS

1    2019-12-31                71

1    2020-06-30                70,5

2    2020-03-31                104,33

2    2020-06-30                 84,75

 

The task is not very easy. How can I solve it?

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @PatrykSAS  Can i assume all ID's in table1 are present in table2 ? if yes, it's pretty straight forward-

 

 


data table1;
 input ID      DATE :yymmdd10.   DIFF_DAYS;
 format date yymmdd10.;
 cards;
1       2019-05-17         .

1       2019-10-06        142

1       2020-01-15        101

1       2020-02-23         39

2       2019-05-07         .

2       2019-08-20        105

2       2020-03-15        208

2       2020-04-10        26
;

data table2;
 input ID  SNAPSHOT_DATE :yymmdd10.;
 format snapshot_Date yymmdd10.;
 cards; 
1    2019-12-31
1    2020-06-30
2    2020-03-31
2    2020-06-30
;


data want;
 if _n_=1 then do;
  dcl hash H (multidata:'y') ;
   h.definekey  ("id") ;
   h.definedata ("date", "DIFF_DAYS") ;
   h.definedone () ;
 end;
 do _n_=h.clear() by 0 until(last.id);
  set table1;
  by id;
  h.add();
 end;
 do until(last.id);
  set table2;
  by id;
  call missing(_s,_n);
  do _n_=h.find() by 0 while(_n_=0);
   if date<=SNAPSHOT_DATE  then do;
	 _n=sum(_n,1);
	 _s=sum(_s,diff_days,0);
   end;
   _n_=h.find_next();
  end;
  AVG_DIFF_DAYS=divide(_s,_n);
  output;
 end;
 format AVG_DIFF_DAYS 8.2;
 keep id SNAPSHOT_DATE AVG_DIFF_DAYS;
run;

proc print noobs;run;
ID SNAPSHOT_DATE AVG_DIFF_DAYS
1 2019-12-31 71.00
1 2020-06-30 70.50
2 2020-03-31 104.33
2 2020-06-30 84.75

 

View solution in original post

3 REPLIES 3
novinosrin
Tourmaline | Level 20

Hi @PatrykSAS  Can i assume all ID's in table1 are present in table2 ? if yes, it's pretty straight forward-

 

 


data table1;
 input ID      DATE :yymmdd10.   DIFF_DAYS;
 format date yymmdd10.;
 cards;
1       2019-05-17         .

1       2019-10-06        142

1       2020-01-15        101

1       2020-02-23         39

2       2019-05-07         .

2       2019-08-20        105

2       2020-03-15        208

2       2020-04-10        26
;

data table2;
 input ID  SNAPSHOT_DATE :yymmdd10.;
 format snapshot_Date yymmdd10.;
 cards; 
1    2019-12-31
1    2020-06-30
2    2020-03-31
2    2020-06-30
;


data want;
 if _n_=1 then do;
  dcl hash H (multidata:'y') ;
   h.definekey  ("id") ;
   h.definedata ("date", "DIFF_DAYS") ;
   h.definedone () ;
 end;
 do _n_=h.clear() by 0 until(last.id);
  set table1;
  by id;
  h.add();
 end;
 do until(last.id);
  set table2;
  by id;
  call missing(_s,_n);
  do _n_=h.find() by 0 while(_n_=0);
   if date<=SNAPSHOT_DATE  then do;
	 _n=sum(_n,1);
	 _s=sum(_s,diff_days,0);
   end;
   _n_=h.find_next();
  end;
  AVG_DIFF_DAYS=divide(_s,_n);
  output;
 end;
 format AVG_DIFF_DAYS 8.2;
 keep id SNAPSHOT_DATE AVG_DIFF_DAYS;
run;

proc print noobs;run;
ID SNAPSHOT_DATE AVG_DIFF_DAYS
1 2019-12-31 71.00
1 2020-06-30 70.50
2 2020-03-31 104.33
2 2020-06-30 84.75

 

novinosrin
Tourmaline | Level 20

@PatrykSAS  Easy and boring SQL-

data table1;
 input ID      DATE :yymmdd10.   DIFF_DAYS;
 format date yymmdd10.;
 cards;
1       2019-05-17         .

1       2019-10-06        142

1       2020-01-15        101

1       2020-02-23         39

2       2019-05-07         .

2       2019-08-20        105

2       2020-03-15        208

2       2020-04-10        26
;

data table2;
 input ID  SNAPSHOT_DATE :yymmdd10.;
 format snapshot_Date yymmdd10.;
 cards; 
1    2019-12-31
1    2020-06-30
2    2020-03-31
2    2020-06-30
;

proc sql;
 create table want as
 select a.*, divide(sum(DIFF_DAYS),count(SNAPSHOT_DATE)) as AVG_DIFF_DAYS format=8.2
 from table2 a left join table1 b
 on a.id=b.id and b.date<=SNAPSHOT_DATE
 group by a.id,SNAPSHOT_DATE;
quit;
PatrykSAS
Obsidian | Level 7
Nice and easy. Thank you a lot!

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 838 views
  • 1 like
  • 2 in conversation