Hello
I have a data set with multiple rows for each customerID.
Each row represent data on other month.
I want to find the last month (The closest month to current month) that Customer moved from status 0 to status 1.
For example:
For customer 123456
2024-01 0
2024-02 0
2024-03 1
2024-04 0
2024-05 0
2024-06 0
2024-07 0
2024-08 1
2024-09 0
2024-10 0
2024-11 0
You can see that last month of change from status 0 to 1 is 2024-08
What is the way to find it by code?
Data have;
input CustID status
123 202401 0
123 202402 0
123 202403 1
123 202404 1
123 202405 0
123 202406 0
123 202407 1
123 202408 0
123 202409 0
123 202410 0
123 20411 0
;
Run;
data want;
set have;
by custid;
retain changemonth;
if first.custid then changemonth = .;
if not first.custid and lag(status) = 0 and status = 1 then changemonth = month;
if last.custid;
keep custid changemonth;
run;
Add a FORMAT statement if your month variable is a SAS date value.
Please proofread your code, and then run it yourself to make sure it works. Do NOT provide us sample data with code that doesn't work.
Use LAG to determine previous month status in a data step and create flag to indicate when the change happens, then use PROC SUMMARY to find the max month where this happens. I can provide examples once I have working code that creates the sample data.
What would you like for this abbreviated version of your data?
Data have;
input CustID status;
datalines;
123 202401 0
123 202402 0
123 202403 1
123 202404 1
123 202405 0
;
Please look at this and FIX WHICH DATE HAS STATUS 1 and WHICH STATUS 0.
Bad enough that that the data step 1) doesn't run 2) would never have status as 0 or 1 as you are reading the date as the Status variable.
But then you complicate things by changing the values of status for 2024-07 and 2024_08 so we aren't even sure which is correct for discussion.
Plus the Last "date" is incorrect as 20411 not 202411
Really should show what the output is supposed to look like and include expected output for cases where status is all 0 or all 1 for a custid or never changes from 0 to 1 for other reasons (only 0 is the status for the last sequential month for the customer )
For customer 123456
2024-01 0
2024-02 0
2024-03 1
2024-04 0
2024-05 0
2024-06 0
2024-07 0
2024-08 1
2024-09 0
2024-10 0
2024-11 0
You can see that last month of change from status 0 to 1 is 2024-08
What is the way to find it by code?
Data have; input CustID status 123 202401 0 123 202402 0 123 202403 1 123 202404 1 123 202405 0 123 202406 0 123 202407 1 123 202408 0 123 202409 0 123 202410 0 123 20411 0 ; Run;
Let's move the goal posts a little, and change the objective to keep the entire record (i.e. with possible hundreds of variables) that belongs to the last obs status=1 and immediately follows a status=0 (within the same custid of course).
Then this two-step (but only one pass of stored data) approach is neat and efficient:
data need / view=need;
set have;
by custid;
if first.id=0 and lag(status)=0 and status=1;
run;
data want;
set need;
by custid;
if last.custid;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.