Solved: Split single row into multiple rows by taking the subset of the variab...

skybook · Posted 06-13-2015 09:27 PM

Hi Everyone, I trying to figure out the following problem.

At the moment I have a dataset like this

Application_id Reaon_code Value

123 AB31AB45 £500

124 AB43RD49TY87 £640

125 RT87 £900

126 CD19RV29 £1000

What I want to get is separate the reason_code variable by taking the subset of this, each reason only has 4 character and combine 2 letters and 2 numbers, always

The dataset I want to get is the following

Application_id Reason_code Value

123 AB31 £500

123 AB45 £500

124 AB43 £640

124 RD49 £640

124 TY87 £640

145 RT87 £900

Hope this make sense.

2nd question, I want to create a flag showing

Application_id Reason_code Value Waterfall_reason Unique_Reason

123 AB31 £500 1 (as it his AB31 first) 0 (as it hits both AB31 and AB45)

123 AB45 £500 0 (as it hits AB31 first) 0 (as it hits both AB31 and AB45)

124 AB43 £640 1 (as it hits AB43 first) 0 (as it hits both AB43,RD49 and TY87)

124 RD49 £640 0 0

124 TY87 £640 0 0

145 RT87 £900 1 (as it hits RT87 first) 1 (as it ONLY Hit RT87)

Hope the questions make sense, I'm a new starter using SAS therefore I'm not familiar with complicated logic

Thanks for everyone in advance

Regards

Ksharp · Posted 06-14-2015 09:36 AM

Code: Program

data have;
input Application_id   Reaon_code   : $20.   Value $;
cards;
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
;
run;
data want;
 set have;
 do i=1 to length(Reaon_code) by 4;
   Waterfall_reason=ifn(i=1,1,0);
   Unique_Reason=ifn(length(Reaon_code)=4,1,0);
   code=substr(Reaon_code,i,4);
   output;
 end; 
 drop i Reaon_code;
run;

Xia Keshan

View solution in original post

Ksharp · Posted 06-14-2015 09:36 AM

Code: Program

data have;
input Application_id   Reaon_code   : $20.   Value $;
cards;
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
;
run;
data want;
 set have;
 do i=1 to length(Reaon_code) by 4;
   Waterfall_reason=ifn(i=1,1,0);
   Unique_Reason=ifn(length(Reaon_code)=4,1,0);
   code=substr(Reaon_code,i,4);
   output;
 end; 
 drop i Reaon_code;
run;

Xia Keshan

skybook · Posted 06-14-2015 04:35 PM

Hi Keshan,

Just want to common on my last reply,

I just tried that after your code I can use

proc sort data=want nodup;

by applicaiton_id reaon_code Waterfall_reason Unique_Reason;

run;

It works that I can remove the duplicate record, however is there anyway I can convert the variable at beginning rather than later. The reason for this is it's better to have waterfall_hit with 1 at the top, while if I do proc sort, nodup then the result I got not necessary with waterfall_hit =1 on the top.

Ksharp · Posted 06-15-2015 09:26 AM

OU. That would be complicated .

What if reaon_code is RT87RT87 , what you gotta do with Unique_Reason ?

skybook · Posted 06-14-2015 04:15 PM

Hi all,

Thanks very much for this, really helpful and it works.

However I found another problem within my data,

regard the variable reason_code, sometimes it has duplicate records like reason_code='RV08RV08BY97', I need to convert to 'RV08BY97' first then do all the logic above,

Any ideas I can remove the duplicate records within Reason_code variable?

Many Thanks

Haikuo · Posted 06-15-2015 11:45 AM

By adding some Hash to Xia Keshan's code should fit your bill:

data have;

input Application_id Reaon_code : $20. Value $;

cards;

123 AB31AB45AB45 £500

124 AB43RD49TY87 £640

125 RT87RT87 £900

126 CD19RV29CD19 £1000

;

run;

data want;

set have;

do i=1 to length(Reaon_code) by 4;

if i=1 then

do;

declare hash h1(ordered:'a');

h1.definekey('i');

h1.definedata('code','waterfall_reason');

h1.definedone();

declare hash h2();

h2.definekey('code');

h2.definedone();

declare hiter hi('h1');

end;

Waterfall_reason=ifn(i=1,1,0);

code=substr(Reaon_code,i,4);

if h2.check() ne 0 then

do;

rc=h2.replace();

rc=h1.replace();

end;

unique_reason=ifn(h1.num_items=1,1,0);

do rc=hi.first() by 0 while (rc=0);

output;

rc=hi.next();

end;

drop i rc;

run;

Split single row into multiple rows by taking the subset of the variables

Re: Split single row into multiple rows by taking the subset of the variables

Code: Program

Re: Split single row into multiple rows by taking the subset of the variables

Code: Program

Re: Split single row into multiple rows by taking the subset of the variables

Re: Split single row into multiple rows by taking the subset of the variables

Re: Split single row into multiple rows by taking the subset of the variables

Re: Split single row into multiple rows by taking the subset of the variables

Registration is open

Split single row into multiple rows by taking the subset of the variables

Re: Split single row into multiple rows by taking the subset of the variables

Code: Program

Re: Split single row into multiple rows by taking the subset of the variables

Code: Program

Re: Split single row into multiple rows by taking the subset of the variables

Re: Split single row into multiple rows by taking the subset of the variables

Re: Split single row into multiple rows by taking the subset of the variables

Re: Split single row into multiple rows by taking the subset of the variables

Registration is open

SAS Training: Just a Click Away