BookmarkSubscribeRSS Feed
Hitesh
Fluorite | Level 6

Hi Experts,

I have dataset loaded to CAS (Job runs every morning) which has structure (Sample data given below). I want to calculate days to first Activity for each ID and then aggregate these days (Average days) for each manager (Not in the sample data). To do that I need to create  a new variable(Column) which keeps only First Activity_Date (First date) without creating new or aggregating dataset as I am aggregating later. Please advise or provide steps for Prepare-data level or Explore & Visualize level, either is fine.

 

Sample data:

ID  Created_Date  Activity_Date
A001 29-Jun-20 5-Jul-20
A001 29-Jun-20 8-Aug-20
A001 29-Jun-20 1-Jul-20
A001 29-Jun-20 15-Jan-21
A005 1-Jan-21  
AD11 5-Aug-19 9-Feb-20
AD11 5-Aug-19 10-Feb-20
AD11 5-Aug-19 11-Feb-20
AD11 5-Aug-19 12-Feb-20
AD11 5-Aug-19 9-Jan-21
AD11 5-Aug-19 6-Sep-20
AD11 5-Aug-19 29-Feb-20
AD11 5-Aug-19 15-Sep-19
AD11 5-Aug-19 15-Sep-19
AD11 5-Aug-19 9-Jan-21
BC09 1-Jan-21 10-Apr-21
BC09 1-Jan-21 5-Feb-21
FDC1 10-May-21  
XY01 3-Apr-20 23-May-21

 

 

Output Desired:

ID  Created_Date  Activity_Date FirstActivityDate DaysToFirstActivity
A001 29-Jun-20 1-Jul-20 1-Jul-20 2
A001 29-Jun-20 5-Jul-20    
A001 29-Jun-20 8-Aug-20    
A001 29-Jun-20 15-Jan-21    
A005 1-Jan-21      
AD11 5-Aug-19 15-Sep-19 15-Sep-19 41
AD11 5-Aug-19 15-Sep-19    
AD11 5-Aug-19 9-Feb-20    
AD11 5-Aug-19 10-Feb-20    
AD11 5-Aug-19 11-Feb-20    
AD11 5-Aug-19 12-Feb-20    
AD11 5-Aug-19 29-Feb-20    
AD11 5-Aug-19 6-Sep-20    
AD11 5-Aug-19 9-Jan-21    
AD11 5-Aug-19 9-Jan-21    
BC09 1-Jan-21 5-Feb-21 5-Feb-21 35
BC09 1-Jan-21 10-Apr-21    
FDC1 10-May-21      
XY01 3-Apr-20 23-May-21 23-May-21 415

 

Thank you very much in advance.

-Hitesh

2 REPLIES 2
mkeintz
PROC Star

In a data step, this is simple.  Assuming your data are grouped by id, then:

 

data want;
  set have;
  by id notsorted;
  if first.id then do;
    firstactivitydate=activitydate;
    daystofirstactivity=firstactivitydate-created_date;
  end;
  format firstactivitydate date9. ;
run;

The BY statement tells SAS to expect the data to be physically grouped by ID, and creates dummy variable (first.id) indicating whether the record in hand is the first for a given ID.   That's the only time the DO statements kick in.  All the other records will have missing values for the new variables.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Hitesh
Fluorite | Level 6
Thanks for the quick reply. I used these lots for data-step in SAS. Here I am looking for SAS-ViYA (Without creating/aggregating new dataset and even without sorting) and not SAS..
Thank you,

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Tips for filtering data sources in SAS Visual Analytics

See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 737 views
  • 3 likes
  • 2 in conversation