Hi Everyone, I trying to figure out the following problem.
At the moment I have a dataset like this
Application_id Reaon_code Value
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
What I want to get is separate the reason_code variable by taking the subset of this, each reason only has 4 character and combine 2 letters and 2 numbers, always
The dataset I want to get is the following
Application_id Reason_code Value
123 AB31 £500
123 AB45 £500
124 AB43 £640
124 RD49 £640
124 TY87 £640
145 RT87 £900
Hope this make sense.
2nd question, I want to create a flag showing
Application_id Reason_code Value Waterfall_reason Unique_Reason
123 AB31 £500 1 (as it his AB31 first) 0 (as it hits both AB31 and AB45)
123 AB45 £500 0 (as it hits AB31 first) 0 (as it hits both AB31 and AB45)
124 AB43 £640 1 (as it hits AB43 first) 0 (as it hits both AB43,RD49 and TY87)
124 RD49 £640 0 0
124 TY87 £640 0 0
145 RT87 £900 1 (as it hits RT87 first) 1 (as it ONLY Hit RT87)
Hope the questions make sense, I'm a new starter using SAS therefore I'm not familiar with complicated logic
Thanks for everyone in advance
Regards
data have;
input Application_id Reaon_code : $20. Value $;
cards;
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
;
run;
data want;
set have;
do i=1 to length(Reaon_code) by 4;
Waterfall_reason=ifn(i=1,1,0);
Unique_Reason=ifn(length(Reaon_code)=4,1,0);
code=substr(Reaon_code,i,4);
output;
end;
drop i Reaon_code;
run;
Xia Keshan
data have;
input Application_id Reaon_code : $20. Value $;
cards;
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
;
run;
data want;
set have;
do i=1 to length(Reaon_code) by 4;
Waterfall_reason=ifn(i=1,1,0);
Unique_Reason=ifn(length(Reaon_code)=4,1,0);
code=substr(Reaon_code,i,4);
output;
end;
drop i Reaon_code;
run;
Xia Keshan
Hi Keshan,
Just want to common on my last reply,
I just tried that after your code I can use
proc sort data=want nodup;
by applicaiton_id reaon_code Waterfall_reason Unique_Reason;
run;
It works that I can remove the duplicate record, however is there anyway I can convert the variable at beginning rather than later. The reason for this is it's better to have waterfall_hit with 1 at the top, while if I do proc sort, nodup then the result I got not necessary with waterfall_hit =1 on the top.
OU. That would be complicated .
What if reaon_code is RT87RT87 , what you gotta do with Unique_Reason ?
Hi all,
Thanks very much for this, really helpful and it works.
However I found another problem within my data,
regard the variable reason_code, sometimes it has duplicate records like reason_code='RV08RV08BY97', I need to convert to 'RV08BY97' first then do all the logic above,
Any ideas I can remove the duplicate records within Reason_code variable?
Many Thanks
By adding some Hash to Xia Keshan's code should fit your bill:
data have;
input Application_id Reaon_code : $20. Value $;
cards;
123 AB31AB45AB45 £500
124 AB43RD49TY87 £640
125 RT87RT87 £900
126 CD19RV29CD19 £1000
;
run;
data want;
set have;
do i=1 to length(Reaon_code) by 4;
if i=1 then
do;
declare hash h1(ordered:'a');
h1.definekey('i');
h1.definedata('code','waterfall_reason');
h1.definedone();
declare hash h2();
h2.definekey('code');
h2.definedone();
declare hiter hi('h1');
end;
Waterfall_reason=ifn(i=1,1,0);
code=substr(Reaon_code,i,4);
if h2.check() ne 0 then
do;
rc=h2.replace();
rc=h1.replace();
end;
end;
unique_reason=ifn(h1.num_items=1,1,0);
do rc=hi.first() by 0 while (rc=0);
output;
rc=hi.next();
end;
drop i rc;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.