BookmarkSubscribeRSS Feed
Hitesh
Fluorite | Level 6

Hi Experts,

I have dataset loaded to CAS (Job runs every morning) which has structure (Sample data given below). I want to calculate days to first Activity for each ID and then aggregate these days (Average days) for each manager (Not in the sample data). To do that I need to create  a new variable(Column) which keeps only First Activity_Date (First date) without creating new or aggregating dataset as I am aggregating later. Please advise or provide steps for Prepare-data level or Explore & Visualize level, either is fine.

 

Sample data:

ID  Created_Date  Activity_Date
A001 29-Jun-20 5-Jul-20
A001 29-Jun-20 8-Aug-20
A001 29-Jun-20 1-Jul-20
A001 29-Jun-20 15-Jan-21
A005 1-Jan-21  
AD11 5-Aug-19 9-Feb-20
AD11 5-Aug-19 10-Feb-20
AD11 5-Aug-19 11-Feb-20
AD11 5-Aug-19 12-Feb-20
AD11 5-Aug-19 9-Jan-21
AD11 5-Aug-19 6-Sep-20
AD11 5-Aug-19 29-Feb-20
AD11 5-Aug-19 15-Sep-19
AD11 5-Aug-19 15-Sep-19
AD11 5-Aug-19 9-Jan-21
BC09 1-Jan-21 10-Apr-21
BC09 1-Jan-21 5-Feb-21
FDC1 10-May-21  
XY01 3-Apr-20 23-May-21

 

 

Output Desired:

ID  Created_Date  Activity_Date FirstActivityDate DaysToFirstActivity
A001 29-Jun-20 1-Jul-20 1-Jul-20 2
A001 29-Jun-20 5-Jul-20    
A001 29-Jun-20 8-Aug-20    
A001 29-Jun-20 15-Jan-21    
A005 1-Jan-21      
AD11 5-Aug-19 15-Sep-19 15-Sep-19 41
AD11 5-Aug-19 15-Sep-19    
AD11 5-Aug-19 9-Feb-20    
AD11 5-Aug-19 10-Feb-20    
AD11 5-Aug-19 11-Feb-20    
AD11 5-Aug-19 12-Feb-20    
AD11 5-Aug-19 29-Feb-20    
AD11 5-Aug-19 6-Sep-20    
AD11 5-Aug-19 9-Jan-21    
AD11 5-Aug-19 9-Jan-21    
BC09 1-Jan-21 5-Feb-21 5-Feb-21 35
BC09 1-Jan-21 10-Apr-21    
FDC1 10-May-21      
XY01 3-Apr-20 23-May-21 23-May-21 415

 

Thank you very much in advance.

-Hitesh

2 REPLIES 2
CarmineVerrell
SAS Employee

Only way that i think you will be able to do this is via code. I would use SAS STudio and connect to a cas session.

 

With datastep syntax below. There might be a different way but here is what i came up with.

 

cas mySession sessopts=(caslib=casuser timeout=1800 locale="en_US");

 

libname caspub cas caslib=public; /*libname to the CASLIB that contains data*/

 

proc sort data=caspub.book1 out=work.book1;
by ID Created_Date  Activity_Date;
run;

 

data work.book1;
    set work.book1;
    by ID Created_Date  Activity_Date;

if first.ID=1 then do; diff=Activity_Date-Created_Date; end;

run;

 

Hope this helps.

 

Carmine

Hitesh
Fluorite | Level 6
Thanks for the quick reply,
I check this out before posting my question. It is a kind of solution but it aggregates to ID level and creates new dataset. I dont want to aggregate to ID level but create new variable which retains only first activity date.. My dataset (Which I'm using) is also dependent on Report-level controls.
May be wait for some other solution/work around..