Hello everyone,
I have a sample dataset as below. I want to add new variable called “Period” on new dataset. If the DATE column includes ”DEC” value then I want to add “T” values on Period column by grouping ID column but we should consider that there can be many ‘DEC’ values on DATE. For this reason, latest date will be taken “T” value, second date will be taken “T_1”, third date will be taken “T_2” and so on. If DATE column includes different value from “DEC” we will write in Period column “T_Break” value. Actually, I created my desired output but It seems to me I extended the method. Can somebody show more easy method or give information me ,please ?
Data Have;
Length ID 8;
INFILE DATALINES MISSOVER;
INPUT ID Date;
INFORMAT Date Date9.;
FORMAT Date Date9.;
DATALINES;
100 31dec2011
102 31jul2014
100 31dec2012
100 31mar2014
101 31dec2015
101 31dec2013
101 31dec2014
100 31dec2013
102 31dec2010
102 31dec2013
102 31dec2011
102 31dec2012
;
RUN;
data Have2;
set Have;
if Month(date) ne 12 then Marker="BREAK";
run;
proc sort data=Have2;
by ID Marker descending date;
run;
data Have3;
set Have2;
by ID Marker;
retain Count;
if first.marker then Count=0;
else Count=Count+1;
if Count ne 0 then Period=catx("_","T",put(Count,1.));
else if Count eq 0 then Period="T";
if Marker="BREAK" then Period="T_BREAK";
run;
Thank you.
... View more