Hi Hai.Kuo,
I got it working using following code:
data have;
input id $ dt yymmdd10. status $;
format dt yymmdd10.;
cards;
1 20141203 D
1 20141210 D
1 20141210 A
1 20141210 S
1 20150110 A
1 20150118 S
2 20150104 A
2 20150115 D
2 20150118 S
2 20150118 D
2 20150118 A
3 20150210 S
3 20150213 D
3 20150223 D
3 20150223 S
;
data have;
set have;
mnth = substr(put(dt,yymmddn8.),1,6);
if status = 'A' then seq = 1;
else if status = 'S' then seq = 2;
else if status = 'D' then seq = 3;
run;
proc sort data = have;
by id mnth descending seq;
run;
data want;
set have;
if last.mnth;
by id mnth descending seq;
keep id mnth status;
run;
Output was as follows:
ID mnth Status
1 201412 A
1 201501 A
2 201501 A
3 201502 S
Thank you for getting back to me. So from reading your code, what I have gathered is that status are evaluated on monthly basis, you don't really care what status happens on which day (therefore, using Last status is kinda misleading, and discussing multiple status on the same day is meaningless) rather, you only want to know if there is A, S or D in certain month, and output A if there is A, else output S if there is S but no A, output D there is D with no A/D. Let me present you more extreme example that your code does:
data have;
input id $ dt yymmdd10. status $;
format dt yymmdd10.;
cards;
1 20141203 A
1 20141210 D
1 20141210 D
1 20141210 D
1 20150110 A
1 20150118 S
2 20150104 A
2 20150115 D
2 20150118 D
2 20150118 D
2 20150118 D
3 20150210 A
3 20150213 S
3 20150223 S
3 20150223 S
;
data have;
set have;
mnth = substr(put(dt,yymmddn8.),1,6);
if status = 'A' then seq = 1;
else if status = 'S' then seq = 2;
else if status = 'D' then seq = 3;
run;
proc sort data = have;
by id mnth descending seq;
run;
data want;
set have;
if last.mnth;
by id mnth descending seq;
keep id mnth status;
run;
And this is what you are getting:
1 | 1 | A | 201412 |
2 | 1 | A | 201501 |
3 | 2 | A | 201501 |
4 | 3 | A | 201502 |
Meaning as long as there is a 'A' occurred within a month, regardless which day of the month, first day or last day, regardless whether there are 'S' or 'D', your code will ALWAYS output it. Make sure this is what you want.
And if indeed this is what you need, you can also simply resort to a SQL solution:
proc sql;
select id, put(dt,yymon7.) as yymon, case when sum(status='A')>0 then 'A'
when sum(status='S')>0 then 'S' when sum(status='D')>0 then 'D' end as status
from have
group by id, calculated yymon;
quit;
Thanks Haikuo for pointing out some mistakes I was making in my code. I do need sequence in my code if there are multiple statuses on same day(last day of the month).
Below is the code base Xia's code with minor changes which works.
data have;
input id $ dt : yymmdd10. status $;
format dt yymmdd10.;
cards;
1 20141203 D
1 20141210 D
1 20141210 A
1 20141210 S
1 20150110 A
1 20150118 S
2 20150104 A
2 20150115 D
2 20150118 S
2 20150118 D
2 20150118 A
3 20150210 A
3 20150213 D
3 20150223 D
3 20150223 S
;
data have;
set have;
year=year(dt);month=month(dt);
if status = 'A' then seq = 1;
else if status = 'S' then seq = 2;
else if status = 'D' then seq = 3;
run;
proc sort data=have;by id year month dt descending seq;run;
data want;
set have;
by id year month dt descending seq;
if last.month;
format dt monyy7.;
drop year month seq;
run;
proc print data = want noobs;
run;
Output as follows:
ID dt Status
1 DEC2014 A
1 JAN2015 S
2 JAN2015 A
3 FEB2015 S
Hi pp2014,
I guess this will help you,
data have;
input id $ dt yymmdd10. status $;
cards;
1 20141203 D
1 20141210 D
1 20141210 A
1 20141210 S
2 20150104 A
2 20150115 D
2 20150118 S
2 20150118 D
2 20150118 A
3 20150210 S
3 20150213 D
3 20150223 D
3 20150223 S
;
run;
data int1;
set have;
if status eq 'D' then temp1=1;
if status eq 'S' then temp1=2;
if status eq 'A' then temp1=3;
run;
proc sort
data=int1;
by id dt temp1;
run;
data int2;
set int1;
if last.id then temp2=1;
by id;
run;
data want;
format dt monyy7.;
set int2;
where temp2=1;
drop temp1 temp2;
run;
Regards,
Yogesh
Thanks lot everybody for help...
Have you got your correct answer? if yes, please mark it. If no, please explain why. BTW, I have raised a question in order to help you, still waiting for your response.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.