Hi Everyone, I trying to figure out the following problem.
At the moment I have a dataset like this
Application_id Reaon_code Value
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
What I want to get is separate the reason_code variable by taking the subset of this, each reason only has 4 character and combine 2 letters and 2 numbers, always
The dataset I want to get is the following
Application_id Reason_code Value
123 AB31 £500
123 AB45 £500
124 AB43 £640
124 RD49 £640
124 TY87 £640
145 RT87 £900
Hope this make sense.
2nd question, I want to create a flag showing
Application_id Reason_code Value Waterfall_reason Unique_Reason
123 AB31 £500 1 (as it his AB31 first) 0 (as it hits both AB31 and AB45)
123 AB45 £500 0 (as it hits AB31 first) 0 (as it hits both AB31 and AB45)
124 AB43 £640 1 (as it hits AB43 first) 0 (as it hits both AB43,RD49 and TY87)
124 RD49 £640 0 0
124 TY87 £640 0 0
145 RT87 £900 1 (as it hits RT87 first) 1 (as it ONLY Hit RT87)
Hope the questions make sense, I'm a new starter using SAS therefore I'm not familiar with complicated logic
Thanks for everyone in advance
Regards
Please try
data want;
set have;
ExpressionID = prxparse('/\w\w\d\d/');
start = 1;
stop = length(Reaon_code);
call prxnext(ExpressionID, start, stop, Reaon_code, position, length);
do while (position > 0);
found = substr(Reaon_code, position, length);
call prxnext(ExpressionID, start, stop, Reaon_code, position, length);
output;
end;
run;
data want_;
set want;
by Application_id ;
if first.Application_id then Waterfall_reason =1 ;
else Waterfall_reason =0;
if first.Application_id and last.Application_id then Unique_Reason =1 ;
else Unique_Reason =0;
run;
Thanks,
jag
Hi Jag,
Thanks very much for your quick response and your code works, really appreciated
However I have another problem now, just found that the variable Reason_code always have some duplicate results, i.e
Reason_code='RV08RV08RI34RI34RI34RU87', in this case I need to convert this variable to 'RV08RI34RU87' then using your logic.
Do you know how to get rid off the duplicate records please.
I have tried using proc sort data=want nodup;
by appplication_id reaon_code waterfall_reason unique_reason;
run;
It works but this is after your code, I want to change it at beginning. Any thoughts on this?
Thansk very much in advance
I inserted the following highlighted code in the program to remove the duplicates without the proc sort procedure. Hope this helps you.
data want_;
set want;
by Application_id found notsorted;
if first.found;
if first.Application_id then Waterfall_reason =1 ;
else Waterfall_reason =0;
if first.Application_id and last.Application_id then Unique_Reason =1 ;
else Unique_Reason =0;
run;
Thanks,
Jag
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.