Re: Creating new by Keeping First record from multiple records without...

Hitesh · Posted 05-13-2021 11:31 AM

Hi Experts,

I have dataset loaded to CAS (Job runs every morning) which has structure (Sample data given below). I want to calculate days to first Activity for each ID and then aggregate these days (Average days) for each manager (Not in the sample data). To do that I need to create a new variable(Column) which keeps only First Activity_Date (First date) without creating new or aggregating dataset as I am aggregating later. Please advise or provide steps for Prepare-data level or Explore & Visualize level, either is fine.

Sample data:

ID	Created_Date	Activity_Date
A001	29-Jun-20	5-Jul-20
A001	29-Jun-20	8-Aug-20
A001	29-Jun-20	1-Jul-20
A001	29-Jun-20	15-Jan-21
A005	1-Jan-21
AD11	5-Aug-19	9-Feb-20
AD11	5-Aug-19	10-Feb-20
AD11	5-Aug-19	11-Feb-20
AD11	5-Aug-19	12-Feb-20
AD11	5-Aug-19	9-Jan-21
AD11	5-Aug-19	6-Sep-20
AD11	5-Aug-19	29-Feb-20
AD11	5-Aug-19	15-Sep-19
AD11	5-Aug-19	15-Sep-19
AD11	5-Aug-19	9-Jan-21
BC09	1-Jan-21	10-Apr-21
BC09	1-Jan-21	5-Feb-21
FDC1	10-May-21
XY01	3-Apr-20	23-May-21

Output Desired:

ID	Created_Date	Activity_Date	FirstActivityDate	DaysToFirstActivity
A001	29-Jun-20	1-Jul-20	1-Jul-20	2
A001	29-Jun-20	5-Jul-20
A001	29-Jun-20	8-Aug-20
A001	29-Jun-20	15-Jan-21
A005	1-Jan-21
AD11	5-Aug-19	15-Sep-19	15-Sep-19	41
AD11	5-Aug-19	15-Sep-19
AD11	5-Aug-19	9-Feb-20
AD11	5-Aug-19	10-Feb-20
AD11	5-Aug-19	11-Feb-20
AD11	5-Aug-19	12-Feb-20
AD11	5-Aug-19	29-Feb-20
AD11	5-Aug-19	6-Sep-20
AD11	5-Aug-19	9-Jan-21
AD11	5-Aug-19	9-Jan-21
BC09	1-Jan-21	5-Feb-21	5-Feb-21	35
BC09	1-Jan-21	10-Apr-21
FDC1	10-May-21
XY01	3-Apr-20	23-May-21	23-May-21	415

Thank you very much in advance.

-Hitesh

mkeintz · Posted 05-13-2021 02:06 PM

In a data step, this is simple. Assuming your data are grouped by id, then:

data want;
  set have;
  by id notsorted;
  if first.id then do;
    firstactivitydate=activitydate;
    daystofirstactivity=firstactivitydate-created_date;
  end;
  format firstactivitydate date9. ;
run;

The BY statement tells SAS to expect the data to be physically grouped by ID, and creates dummy variable (first.id) indicating whether the record in hand is the first for a given ID. That's the only time the DO statements kick in. All the other records will have missing values for the new variables.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Hitesh · Posted 05-13-2021 03:48 PM

Thanks for the quick reply. I used these lots for data-step in SAS. Here I am looking for SAS-ViYA (Without creating/aggregating new dataset and even without sorting) and not SAS..
Thank you,

Creating new by Keeping First record from multiple records without sort/merge or creating dataset

Re: Creating new by Keeping First record from multiple records without sort/merge or creating datase

Re: Creating new by Keeping First record from multiple records without sort/merge or creating datase

Creating new by Keeping First record from multiple records without sort/merge or creating dataset

Re: Creating new by Keeping First record from multiple records without sort/merge or creating datase

Re: Creating new by Keeping First record from multiple records without sort/merge or creating datase

SAS Innovate 2025: Call for Content