Hello
I have a data set with 3 columns: customer ID , month (In structure YYMM), Y.
As you can see Each customer ID has multiple rows (From 2101 till 2108).
For each ID I want to calculate the first month when he left (name of column will be Left_Month).
Expected results are:
FOR ID=1 Left_Month=. (Null) because since he started he didnt left
FOR ID=2 Left_Month=. (Null) because since he started he didnt left
FOR ID=3 Left_Month=2104 Because it is first month he left
FOR ID=4 Left_Month=2103 Because it is the first month he left (Even though he came back again)
What is the way to calculate it please?
I am looking for a new data set with 4 rows with 2 columns (ID,Left_Month)
Thanks
Data have;
INPUT ID month Y;
cards;
1 2101 .
1 2102 .
1 2103 .
1 2104 10
1 2105 20
1 2106 30
1 2107 40
1 2108 50
2 2101 10
2 2102 15
2 2103 20
2 2104 30
2 2105 20
2 2106 50
2 2107 50
2 2108 40
3 2101 10
3 2102 15
3 2103 20
3 2104 .
3 2105 .
3 2106 .
3 2107 .
3 2108 .
4 2101 10
4 2102 15
4 2103 20
4 2104 .
4 2105 .
4 2106 30
4 2107 20
4 2108 15
;
Run;
Technically , IF I create Help column with value 1/0 (1-IF value Y exist ,0 IF value Y doesnt exist)
then I am looking for the month in the first row of 0 after start of 1
Examples to patterns and the location of the Left_Month in yellow color
ID 4 has a non-missing value in 2103, so I take it that month_left should be 2104?
Anyway, this code follows your description:
data want;
set have;
by id;
retain month_left was_in;
if first.id
then do;
was_in = 0;
month_left = " ";
end;
if y ne . then was_in = 1;
if month_left = "" and was_in and y = . then month_left = month;
if last.id;
keep id month_left;
run;
This is my solution in a long way code...
Data have2;
set have;
IF y=. then Ind_Null='1'; else Ind_Null='0';
Help=CATX('-',input(ID,best.),Ind_Null);
row=_n_;
Run;
proc sort data=have2;
by Help;
Run;
Data have3;
set have2;
by Help;
IF first.Help then Ind=1;
IF Y=. then Ind=.;
Run;
proc sort data=have3;
by ID row;
Run;
Data have4;
set have3;
By ID;
Retain Accum_Ind;
IF first.ID then Accum_Ind=Ind;
Accum_Ind=sum(Accum_Ind,Ind);
Run;
/***Take rows from start of follw up only***/
Data have4 (WHERE=(Accum_Ind ne .));
set have3;
By ID;
if first.ID then Accum_Ind=Ind;
else Accum_Ind + Ind;
run;
PROC SQL;
create table have5 as
select ID,
min(month) as Left_Month
from have4
where Y is null
group by ID
;
QUIT;
PROC SQL;
create table wanted as
select a.ID,coalesce(b.Left_Month,.) as Left_Month
from (select distinct ID from have) as a
left join have5 as b
on a.ID=b.ID
;
QUIT;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.