Hello,
I have data that looks like this:
data have;
input dated :date9. accid _f;
format dated date9.;
datalines;
01JAN2023 123 1
01FEB2023 123 1
01MAR2023 123 0
01APR2023 123 1
01MAY2023 123 0
;
run;
There will be 1 record per account per month, and the dataset has history going back a couple of years.
I need to add an additional column that states the date on which the event (_f = 1) started. For the data above, the wanted output would look like this:
data want;
input dated :date9. accid _f start :date9.;
format dated date9. start date9.;
datalines;
01JAN2023 123 1 01JAN2023
01FEB2023 123 1 01JAN2023
01MAR2023 123 0 .
01APR2023 123 1 01APR2023
01MAY2023 123 0 .
;
run;
The start date would be the first month in which _f = 1, and it should remain the same for each consecutive month where _f = 1. The start date would be 'reset' (set to missing) when _f = 0.
Any help on how to achieve this would be very much appreciated
Thanks
Thank you for providing a working data step with data.
One way:
data want;
   set have;
   by accid _f notsorted;
   retain start;
   format start date9.;
   if first._f and _f=1 then start=dated;
   else if first._f and _f=0 then call missing(start);
run;
RETAIN tells SAS to keep the value of the variable from one iteration of the data step to the next instead of resetting it to missing.
The BY statement creates automatic variables First.<variable> and Last.<variable> for each variable on the BY statement. These values are numeric 1 (true) and 0 (false), indicating whether the current observation is the first or last of a BY group. Normally BY requires the data to be sorted by the BY variables, but the NOTSORTED option allows the data to be grouped without being in sorted order, which is what _f needs because it switches between 1 and 0 repeatedly across observations. We include the account so that if the last _f of the previous account is 1 and the first _f of this account is 1, they are treated as separate groups (within account).
Then simple tests on _f decide when to set or reset Start.
The First and Last variables are not written to the data set.
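If you want to see what First._f and Last._f look like for your data, one option is to copy them into ordinary variables, since those are kept in the output. This is only a sketch for inspection: the data set name check and the variables first_f and last_f are names I made up here, not part of the solution above.

data check;
   set have;
   by accid _f notsorted;
   /* copy the automatic variables so they are written to the output */
   first_f = first._f;
   last_f  = last._f;
run;

proc print data=check noobs;
run;

For the five sample rows, first_f should be 1 on the January, March, April and May records, and last_f should be 1 on the February, March, April and May records, because each change in _f starts a new group.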