Hi Everyone, I trying to figure out the following problem.
At the moment I have a dataset like this
Application_id Reaon_code Value
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
What I want to get is separate the reason_code variable by taking the subset of this, each reason only has 4 character and combine 2 letters and 2 numbers, always
The dataset I want to get is the following
Application_id Reason_code Value
123 AB31 £500
123 AB45 £500
124 AB43 £640
124 RD49 £640
124 TY87 £640
145 RT87 £900
Hope this make sense.
2nd question, I want to create a flag showing
Application_id Reason_code Value Waterfall_reason Unique_Reason
123 AB31 £500 1 (as it his AB31 first) 0 (as it hits both AB31 and AB45)
123 AB45 £500 0 (as it hits AB31 first) 0 (as it hits both AB31 and AB45)
124 AB43 £640 1 (as it hits AB43 first) 0 (as it hits both AB43,RD49 and TY87)
124 RD49 £640 0 0
124 TY87 £640 0 0
145 RT87 £900 1 (as it hits RT87 first) 1 (as it ONLY Hit RT87)
Hope the questions make sense, I'm a new starter using SAS therefore I'm not familiar with complicated logic
Thanks for everyone in advance
Regards
Please try
data want;
set have;
ExpressionID = prxparse('/\w\w\d\d/');
start = 1;
stop = length(Reaon_code);
call prxnext(ExpressionID, start, stop, Reaon_code, position, length);
do while (position > 0);
found = substr(Reaon_code, position, length);
call prxnext(ExpressionID, start, stop, Reaon_code, position, length);
output;
end;
run;
data want_;
set want;
by Application_id ;
if first.Application_id then Waterfall_reason =1 ;
else Waterfall_reason =0;
if first.Application_id and last.Application_id then Unique_Reason =1 ;
else Unique_Reason =0;
run;
Thanks,
jag
Hi Jag,
Thanks very much for your quick response and your code works, really appreciated
However I have another problem now, just found that the variable Reason_code always have some duplicate results, i.e
Reason_code='RV08RV08RI34RI34RI34RU87', in this case I need to convert this variable to 'RV08RI34RU87' then using your logic.
Do you know how to get rid off the duplicate records please.
I have tried using proc sort data=want nodup;
by appplication_id reaon_code waterfall_reason unique_reason;
run;
It works but this is after your code, I want to change it at beginning. Any thoughts on this?
Thansk very much in advance
I inserted the following highlighted code in the program to remove the duplicates without the proc sort procedure. Hope this helps you.
data want_;
set want;
by Application_id found notsorted;
if first.found;
if first.Application_id then Waterfall_reason =1 ;
else Waterfall_reason =0;
if first.Application_id and last.Application_id then Unique_Reason =1 ;
else Unique_Reason =0;
run;
Thanks,
Jag
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.