hello,
I have data that looks like this
data have;
input dated :date9. accid _f;
format dated date9.;
datalines;
01JAN2023 123 1
01FEB2023 123 1
01MAR2023 123 0
01APR2023 123 1
01MAY2023 123 0
;
run;
there will be 1 record per account per month and the dataset has history going back a couple of years
I need to add an additional column which states the date on which the event started (_f). for the above, the want would look like this
data have;
input dated :date9. accid _f start :date9.;
format dated date9. start date9.;
datalines;
01JAN2023 123 1 01JAN2023
01FEB2023 123 1 01JAN2023
01MAR2023 123 0 .
01APR2023 123 1 01APR2023
01MAY2023 123 0 .
;
run;
the start date would be the first time _f =1 and that should remain the same for each consecutive month where _f=1. The start date would be 'reset' when _f = 0.
Any help on how to achieve this would be very much appreciated
Thanks
Thank you for providing a working data step with data.
One way:
data want; set have; by accid _f notsorted; retain start; format start date9.; If first._f and _f=1 then start=dated; else if first._f and _f=0 then call missing(start); run;
RETAIN tells SAS to keep values of the variable across the data step boundary.
The BY statement creates automatic variables First.<variable> and Last.<variable> for each variable on the by statement. These values are numeric 1 (true) and 0 (false) indicating where the current observation is the first or last of a group of variables. Normally BY would require the data to be sorted by the option NOTsorted allows the data to have the property that _F does of increase/decrease repeatedly across observations. We include the account so that if the _f from a previous account is 1 and the first _f for this account is 1 they are treated as separate groups (within account).
Then simple tests of when to set/reset Start based on the _f.
The First and Last variables are not written to the data set.
Thank you for providing a working data step with data.
One way:
data want; set have; by accid _f notsorted; retain start; format start date9.; If first._f and _f=1 then start=dated; else if first._f and _f=0 then call missing(start); run;
RETAIN tells SAS to keep values of the variable across the data step boundary.
The BY statement creates automatic variables First.<variable> and Last.<variable> for each variable on the by statement. These values are numeric 1 (true) and 0 (false) indicating where the current observation is the first or last of a group of variables. Normally BY would require the data to be sorted by the option NOTsorted allows the data to have the property that _F does of increase/decrease repeatedly across observations. We include the account so that if the _f from a previous account is 1 and the first _f for this account is 1 they are treated as separate groups (within account).
Then simple tests of when to set/reset Start based on the _f.
The First and Last variables are not written to the data set.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.