Hi Experts,
I have dataset loaded to CAS (Job runs every morning) which has structure (Sample data given below). I want to calculate days to first Activity for each ID and then aggregate these days (Average days) for each manager (Not in the sample data). To do that I need to create a new variable(Column) which keeps only First Activity_Date (First date) without creating new or aggregating dataset as I am aggregating later. Please advise or provide steps for Prepare-data level or Explore & Visualize level, either is fine.
Sample data:
ID | Created_Date | Activity_Date |
A001 | 29-Jun-20 | 5-Jul-20 |
A001 | 29-Jun-20 | 8-Aug-20 |
A001 | 29-Jun-20 | 1-Jul-20 |
A001 | 29-Jun-20 | 15-Jan-21 |
A005 | 1-Jan-21 | |
AD11 | 5-Aug-19 | 9-Feb-20 |
AD11 | 5-Aug-19 | 10-Feb-20 |
AD11 | 5-Aug-19 | 11-Feb-20 |
AD11 | 5-Aug-19 | 12-Feb-20 |
AD11 | 5-Aug-19 | 9-Jan-21 |
AD11 | 5-Aug-19 | 6-Sep-20 |
AD11 | 5-Aug-19 | 29-Feb-20 |
AD11 | 5-Aug-19 | 15-Sep-19 |
AD11 | 5-Aug-19 | 15-Sep-19 |
AD11 | 5-Aug-19 | 9-Jan-21 |
BC09 | 1-Jan-21 | 10-Apr-21 |
BC09 | 1-Jan-21 | 5-Feb-21 |
FDC1 | 10-May-21 | |
XY01 | 3-Apr-20 | 23-May-21 |
Output Desired:
ID | Created_Date | Activity_Date | FirstActivityDate | DaysToFirstActivity |
A001 | 29-Jun-20 | 1-Jul-20 | 1-Jul-20 | 2 |
A001 | 29-Jun-20 | 5-Jul-20 | ||
A001 | 29-Jun-20 | 8-Aug-20 | ||
A001 | 29-Jun-20 | 15-Jan-21 | ||
A005 | 1-Jan-21 | |||
AD11 | 5-Aug-19 | 15-Sep-19 | 15-Sep-19 | 41 |
AD11 | 5-Aug-19 | 15-Sep-19 | ||
AD11 | 5-Aug-19 | 9-Feb-20 | ||
AD11 | 5-Aug-19 | 10-Feb-20 | ||
AD11 | 5-Aug-19 | 11-Feb-20 | ||
AD11 | 5-Aug-19 | 12-Feb-20 | ||
AD11 | 5-Aug-19 | 29-Feb-20 | ||
AD11 | 5-Aug-19 | 6-Sep-20 | ||
AD11 | 5-Aug-19 | 9-Jan-21 | ||
AD11 | 5-Aug-19 | 9-Jan-21 | ||
BC09 | 1-Jan-21 | 5-Feb-21 | 5-Feb-21 | 35 |
BC09 | 1-Jan-21 | 10-Apr-21 | ||
FDC1 | 10-May-21 | |||
XY01 | 3-Apr-20 | 23-May-21 | 23-May-21 | 415 |
Thank you very much in advance.
-Hitesh
In a data step, this is simple. Assuming your data are grouped by id, then:
data want;
set have;
by id notsorted;
if first.id then do;
firstactivitydate=activitydate;
daystofirstactivity=firstactivitydate-created_date;
end;
format firstactivitydate date9. ;
run;
The BY statement tells SAS to expect the data to be physically grouped by ID, and creates dummy variable (first.id) indicating whether the record in hand is the first for a given ID. That's the only time the DO statements kick in. All the other records will have missing values for the new variables.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.
Find more tutorials on the SAS Users YouTube channel.