Hi Experts,
I have dataset loaded to CAS (Job runs every morning) which has structure (Sample data given below). I want to calculate days to first Activity for each ID and then aggregate these days (Average days) for each manager (Not in the sample data). To do that I need to create a new variable(Column) which keeps only First Activity_Date (First date) without creating new or aggregating dataset as I am aggregating later. Please advise or provide steps for Prepare-data level or Explore & Visualize level, either is fine.
Sample data:
| ID | Created_Date | Activity_Date |
| A001 | 29-Jun-20 | 5-Jul-20 |
| A001 | 29-Jun-20 | 8-Aug-20 |
| A001 | 29-Jun-20 | 1-Jul-20 |
| A001 | 29-Jun-20 | 15-Jan-21 |
| A005 | 1-Jan-21 | |
| AD11 | 5-Aug-19 | 9-Feb-20 |
| AD11 | 5-Aug-19 | 10-Feb-20 |
| AD11 | 5-Aug-19 | 11-Feb-20 |
| AD11 | 5-Aug-19 | 12-Feb-20 |
| AD11 | 5-Aug-19 | 9-Jan-21 |
| AD11 | 5-Aug-19 | 6-Sep-20 |
| AD11 | 5-Aug-19 | 29-Feb-20 |
| AD11 | 5-Aug-19 | 15-Sep-19 |
| AD11 | 5-Aug-19 | 15-Sep-19 |
| AD11 | 5-Aug-19 | 9-Jan-21 |
| BC09 | 1-Jan-21 | 10-Apr-21 |
| BC09 | 1-Jan-21 | 5-Feb-21 |
| FDC1 | 10-May-21 | |
| XY01 | 3-Apr-20 | 23-May-21 |
Output Desired:
| ID | Created_Date | Activity_Date | FirstActivityDate | DaysToFirstActivity |
| A001 | 29-Jun-20 | 1-Jul-20 | 1-Jul-20 | 2 |
| A001 | 29-Jun-20 | 5-Jul-20 | ||
| A001 | 29-Jun-20 | 8-Aug-20 | ||
| A001 | 29-Jun-20 | 15-Jan-21 | ||
| A005 | 1-Jan-21 | |||
| AD11 | 5-Aug-19 | 15-Sep-19 | 15-Sep-19 | 41 |
| AD11 | 5-Aug-19 | 15-Sep-19 | ||
| AD11 | 5-Aug-19 | 9-Feb-20 | ||
| AD11 | 5-Aug-19 | 10-Feb-20 | ||
| AD11 | 5-Aug-19 | 11-Feb-20 | ||
| AD11 | 5-Aug-19 | 12-Feb-20 | ||
| AD11 | 5-Aug-19 | 29-Feb-20 | ||
| AD11 | 5-Aug-19 | 6-Sep-20 | ||
| AD11 | 5-Aug-19 | 9-Jan-21 | ||
| AD11 | 5-Aug-19 | 9-Jan-21 | ||
| BC09 | 1-Jan-21 | 5-Feb-21 | 5-Feb-21 | 35 |
| BC09 | 1-Jan-21 | 10-Apr-21 | ||
| FDC1 | 10-May-21 | |||
| XY01 | 3-Apr-20 | 23-May-21 | 23-May-21 | 415 |
Thank you very much in advance.
-Hitesh
Only way that i think you will be able to do this is via code. I would use SAS STudio and connect to a cas session.
With datastep syntax below. There might be a different way but here is what i came up with.
cas mySession sessopts=(caslib=casuser timeout=1800 locale="en_US");
libname caspub cas caslib=public; /*libname to the CASLIB that contains data*/
proc sort data=caspub.book1 out=work.book1;
by ID Created_Date Activity_Date;
run;
data work.book1;
set work.book1;
by ID Created_Date Activity_Date;
if first.ID=1 then do; diff=Activity_Date-Created_Date; end;
run;
Hope this helps.
Carmine
Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.
Explore Now →