Try this:
data have;
input id :$5. stage date :date9.;
format date yymmddd10.;
idvar = put(date,monname.);
datalines;
22313 2 31MAY2019
22314 1 31MAY2019
22315 3 31MAY2019
22316 3 31MAY2019
22317 1 31MAY2019
22318 2 31MAY2019
22313 2 30JUN2019
22314 1 30JUN2019
22315 3 30JUN2019
22316 3 30JUN2019
22317 1 30JUN2019
22318 2 30JUN2019
22313 2 31JUL2019
22314 2 31JUL2019
22315 3 31JUL2019
22316 2 31JUL2019
22317 1 31JUL2019
22318 2 31JUL2019
22313 1 31AUG2019
22314 2 31AUG2019
22315 2 31AUG2019
22316 2 31AUG2019
22317 1 31AUG2019
22318 2 31AUG2019
22313 1 30SEP2019
22314 1 30SEP2019
22315 2 30SEP2019
22316 2 30SEP2019
22317 1 30SEP2019
22318 2 30SEP2019
22313 1 31OCT2019
22314 1 31OCT2019
22315 2 31OCT2019
22316 2 31OCT2019
22317 2 31OCT2019
22318 2 31OCT2019
;
proc sort data=have;
by id date;
run;
data getmonth;
set have;
by id;
retain change month;
length month $9;
l_stage = lag(stage);
if first.id
then do;
change = 0;
month = '';
end;
else if l_stage ne stage then change = 1;
if stage in (2,3) and not last.id and (l_stage not in (2,3) or first.id) then month = idvar;
if last.id;
if change = 0 then month = '';
keep id month;
run;
proc transpose data=have out=trans (drop=_name_);
by id;
var stage;
id idvar;
run;
data want;
merge
trans
getmonth
;
by id;
run;
proc print data=want noobs;
run;
Result:
id | May | June | July | August | September | October | month |
---|---|---|---|---|---|---|---|
22313 | 2 | 2 | 2 | 1 | 1 | 1 | May |
22314 | 1 | 1 | 2 | 2 | 1 | 1 | July |
22315 | 3 | 3 | 3 | 2 | 2 | 2 | May |
22316 | 3 | 3 | 2 | 2 | 2 | 2 | May |
22317 | 1 | 1 | 1 | 1 | 1 | 2 | |
22318 | 2 | 2 | 2 | 2 | 2 | 2 |
i am sorry got error:
28 length month $9;
ERROR: Character length cannot be used with numeric variable Month.
sorry i modified the variable names now you code is working fine, unfortunately it is giving me the same results what i got earlier
results what i got from you program
e.g., 1 2 2 2 3 2 June (this should be September Month)
1 1 1 1 2 2 September (Month should be blank but we are getting September)
even this should be blank but we got June month
1 2 2 2 2 2 June
@kishoresamineni wrote:
sorry i modified the variable names now you code is working fine, unfortunately it is giving me the same results what i got earlier
results what i got from you program
e.g., 1 2 2 2 3 2 June (this should be September Month)
1 1 1 1 2 2 September (Month should be blank but we are getting September)
Why? In both cases, the code catches the change from 1 to 2.
If your rules are different, then PLEASE CLEARLY AND COMPLETELY STATE THEM, because I have invested a lot of time already extracting information bit by bit from your nose.
i do apologise for the confusion:
1 2 2 2 3 2 June (this should be September Month)
1 1 1 1 2 2 September (Month should be blank but we are getting September)
In both cases, the code catches the change from 1 to 2.
Considering only months may to september. If all states are constant 1 1 1 1 1 or 2 2 2 2 2, then the month should be blank.
If not, then the month should be the month where the state has the highest value. If there are multiple states that has the same value, month should be the first month where this value occurs.
Still does not match up:
(quote)
1 1 1 1 2 2 September (Month should be blank but we are getting September)
The change DOES happen in the last month of consideration (September), so here, according to your rules, it should be set. But you want it blank, for no reason apparent to me.
It seems now that you don't yet have a clue what you want at this moment, so I'm outta here for the time being.
Thank you so much for your valuable time on this topic:
1 2 2 2 3 2 June (this should be September Month)
1 1 1 1 2 2 September (Month should be blank but we are getting September) this is what happens in excel formula, s
o eventually i am trying to implement the same logic in SAS as well.
That's the reason i am looking for a blank 1 1 1 1 2 2 in this case, reason being if there is a month then further
calculation is going to take place, calculation or Month name should only come up if last month stage = 1.
@kishoresamineni wrote:
hi KurtBremser
Finally, i got the solution
You appear to have done a literal translation of the convoluted excel formula of nested IF type function calls into a convoluted series of IF/THEN statements.
I hope the code works for you but I am not sure how you will ever know or whether or not it works because you have been unable to explain what you are trying to do.
data R_MONTH;
SET r_t_format_month;
array mth(6) May June July August September October;
if mth(6) = 3 then
Month = " ";
else if mth(6) = 2 then
do;
do i = 1 to 5;
if mth(i) = 3 then
do; Month = i; leave; end;
end;
end;
else;
if mth(6) = 1 then
do;
do i = 1 to 5;
if mth(i) = 3 then
do;
Month = i; leave; end;
end;
if Month = " " then
do;
do i = 1 to 5;
if mth(i) = 2 then
do;
Month = i; leave; end;
end;
end;
end;
if Month NE " " then
NEW_MONTH = vname(mth(Month));
run;
@kishoresamineni If you want us to help you, then help us help you 🙂
You have to be specific about:
- What your initial data looks like
- The logic that you want to implement
- The desired result.
sample data
id Stage Date
22313 2 31MAY2019
22314 1 31MAY2019
22315 3 31MAY2019
22316 3 31MAY2019
22317 1 31MAY2019
22318 2 31MAY2019
22313 2 30JUN2019
22314 1 30JUN2019
22315 3 30JUN2019
22316 3 30JUN2019
22317 1 30JUN2019
22318 2 30JUN2019
22313 2 31JUL2019
22314 2 31JUL2019
22315 3 31JUL2019
22316 2 31JUL2019
22317 1 31JUL2019
22318 2 31JUL2019
22313 1 31AUG2019
22314 2 31AUG2019
22315 2 31AUG2019
22316 2 31AUG2019
22317 1 31AUG2019
22318 2 31AUG2019
22313 1 30SEP2019
22314 1 30SEP2019
22315 2 30SEP2019
22316 2 30SEP2019
22317 1 30SEP2019
22318 2 30SEP2019
22313 1 31OCT2019
22314 1 31OCT2019
22315 2 31OCT2019
22316 2 31OCT2019
22317 2 31OCT2019
22318 2 31OCT2019
Expeted Results:
ID May June July August September October Month
22313 2 2 2 1 1 1 May
22314 1 1 2 2 1 1 July
22315 3 3 3 2 2 2 May
22316 3 3 2 2 2 2 May
22317 1 1 1 1 1 2 <BLANK>
22318 2 2 2 2 2 2 <BLANK>
Ok. Please explain exactly how the month variable is calculated? Why is month=July for id=22314 and blank for id=22317?
Hi draycut,
this is the business rule they follow:
id=22314 is July because ID moved from 1 to 2
business rule: any ID within last six month moves from stage 1 to 2 or 3 then that Month name should be populated and till sep'19 if any ID in stage 1 and in OCT'19 if it moves to stage 2 or 3 then Month should be blank(ID=22317)
If you see from my program even after applying lag function Month condition is not applying for some scenarios
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.