Hi Experts,
I have dataset loaded to CAS (Job runs every morning) which has structure (Sample data given below). I want to calculate days to first Activity for each ID and then aggregate these days (Average days) for each manager (Not in the sample data). To do that I need to create a new variable(Column) which keeps only First Activity_Date (First date) without creating new or aggregating dataset as I am aggregating later. Please advise or provide steps for Prepare-data level or Explore & Visualize level, either is fine.
Sample data:
ID | Created_Date | Activity_Date |
A001 | 29-Jun-20 | 5-Jul-20 |
A001 | 29-Jun-20 | 8-Aug-20 |
A001 | 29-Jun-20 | 1-Jul-20 |
A001 | 29-Jun-20 | 15-Jan-21 |
A005 | 1-Jan-21 | |
AD11 | 5-Aug-19 | 9-Feb-20 |
AD11 | 5-Aug-19 | 10-Feb-20 |
AD11 | 5-Aug-19 | 11-Feb-20 |
AD11 | 5-Aug-19 | 12-Feb-20 |
AD11 | 5-Aug-19 | 9-Jan-21 |
AD11 | 5-Aug-19 | 6-Sep-20 |
AD11 | 5-Aug-19 | 29-Feb-20 |
AD11 | 5-Aug-19 | 15-Sep-19 |
AD11 | 5-Aug-19 | 15-Sep-19 |
AD11 | 5-Aug-19 | 9-Jan-21 |
BC09 | 1-Jan-21 | 10-Apr-21 |
BC09 | 1-Jan-21 | 5-Feb-21 |
FDC1 | 10-May-21 | |
XY01 | 3-Apr-20 | 23-May-21 |
Output Desired:
ID | Created_Date | Activity_Date | FirstActivityDate | DaysToFirstActivity |
A001 | 29-Jun-20 | 1-Jul-20 | 1-Jul-20 | 2 |
A001 | 29-Jun-20 | 5-Jul-20 | ||
A001 | 29-Jun-20 | 8-Aug-20 | ||
A001 | 29-Jun-20 | 15-Jan-21 | ||
A005 | 1-Jan-21 | |||
AD11 | 5-Aug-19 | 15-Sep-19 | 15-Sep-19 | 41 |
AD11 | 5-Aug-19 | 15-Sep-19 | ||
AD11 | 5-Aug-19 | 9-Feb-20 | ||
AD11 | 5-Aug-19 | 10-Feb-20 | ||
AD11 | 5-Aug-19 | 11-Feb-20 | ||
AD11 | 5-Aug-19 | 12-Feb-20 | ||
AD11 | 5-Aug-19 | 29-Feb-20 | ||
AD11 | 5-Aug-19 | 6-Sep-20 | ||
AD11 | 5-Aug-19 | 9-Jan-21 | ||
AD11 | 5-Aug-19 | 9-Jan-21 | ||
BC09 | 1-Jan-21 | 5-Feb-21 | 5-Feb-21 | 35 |
BC09 | 1-Jan-21 | 10-Apr-21 | ||
FDC1 | 10-May-21 | |||
XY01 | 3-Apr-20 | 23-May-21 | 23-May-21 | 415 |
Thank you very much in advance.
-Hitesh
In a data step, this is simple. Assuming your data are grouped by id, then:
data want;
set have;
by id notsorted;
if first.id then do;
firstactivitydate=activitydate;
daystofirstactivity=firstactivitydate-created_date;
end;
format firstactivitydate date9. ;
run;
The BY statement tells SAS to expect the data to be physically grouped by ID, and creates dummy variable (first.id) indicating whether the record in hand is the first for a given ID. That's the only time the DO statements kick in. All the other records will have missing values for the new variables.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.
Find more tutorials on the SAS Users YouTube channel.