Desktop productivity for business analysts and programmers

Split single row into multiple rows by taking the subset of the variables

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Split single row into multiple rows by taking the subset of the variables

Hi Everyone, I trying to figure out the following problem.

At the moment I have a dataset like this

Application_id                Reaon_code            Value

123                               AB31AB45                £500

124                              AB43RD49TY87        £640

125                              RT87                        £900

126                              CD19RV29               £1000

What I want to get is separate the reason_code variable by taking the subset of this, each reason only has 4 character and combine 2 letters and 2 numbers, always

The dataset I want to get is the following

Application_id             Reason_code               Value

123                             AB31                          £500

123                             AB45                          £500

124                             AB43                         £640

124                             RD49                         £640

124                            TY87                           £640

145                            RT87                          £900

Hope this make sense.

2nd question, I want to create a flag showing

Application_id             Reason_code               Value           Waterfall_reason                                                           Unique_Reason

123                             AB31                          £500                       1 (as it his AB31 first)                              0 (as it hits both AB31 and AB45)

123                             AB45                          £500                       0 (as it hits AB31 first)                             0 (as it hits both AB31 and AB45)

124                             AB43                         £640                        1 (as it hits AB43 first)                             0 (as it hits both AB43,RD49 and TY87)

124                             RD49                         £640                        0                                                            0

124                            TY87                           £640                        0                                                            0

145                            RT87                          £900                        1 (as it hits RT87 first)                              1 (as it ONLY Hit RT87)

Hope the questions make sense, I'm a new starter using SAS therefore I'm not familiar with complicated logic

Thanks for everyone in advance

Regards


Accepted Solutions
Solution
‎06-14-2015 09:36 AM
Grand Advisor
Posts: 9,571

Re: Split single row into multiple rows by taking the subset of the variables

Code: Program

data have;
input Application_id   Reaon_code   : $20.   Value $;
cards;
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
;
run;
data want;
set have;
do i=1 to length(Reaon_code) by 4;
   Waterfall_reason=ifn(i=1,1,0);
   Unique_Reason=ifn(length(Reaon_code)=4,1,0);
   code=substr(Reaon_code,i,4);
   output;
end;
drop i Reaon_code;
run;

Xia Keshan

View solution in original post


All Replies
Solution
‎06-14-2015 09:36 AM
Grand Advisor
Posts: 9,571

Re: Split single row into multiple rows by taking the subset of the variables

Code: Program

data have;
input Application_id   Reaon_code   : $20.   Value $;
cards;
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
;
run;
data want;
set have;
do i=1 to length(Reaon_code) by 4;
   Waterfall_reason=ifn(i=1,1,0);
   Unique_Reason=ifn(length(Reaon_code)=4,1,0);
   code=substr(Reaon_code,i,4);
   output;
end;
drop i Reaon_code;
run;

Xia Keshan

Occasional Contributor
Posts: 5

Re: Split single row into multiple rows by taking the subset of the variables

Hi Keshan,

Just want to common on my last reply,

I just tried that after your code I can use

proc sort data=want nodup;

by applicaiton_id reaon_code Waterfall_reason Unique_Reason;

run;

It works that I can remove the duplicate record, however is there anyway I can convert the variable at beginning rather than later. The reason for this is it's better to have waterfall_hit with 1 at the top, while if I do proc sort, nodup then the result I got not necessary with waterfall_hit =1 on the top.

Duplicate.PNG

Grand Advisor
Posts: 9,571

Re: Split single row into multiple rows by taking the subset of the variables

OU. That would be complicated .

What if reaon_code is RT87RT87 , what you gotta do with Unique_Reason ?

Occasional Contributor
Posts: 5

Re: Split single row into multiple rows by taking the subset of the variables

Hi all,

Thanks very much for this, really helpful and it works. 

However I found another problem within my data,

regard the variable reason_code, sometimes it has duplicate records like reason_code='RV08RV08BY97', I need to convert to 'RV08BY97' first then do all the logic above,

Any ideas I can remove the duplicate records within Reason_code variable?

Many Thanks

Respected Advisor
Posts: 3,124

Re: Split single row into multiple rows by taking the subset of the variables

By adding some Hash to Xia Keshan's code should fit your bill:

data have;

     input Application_id   Reaon_code   : $20.   Value $;

     cards;

123 AB31AB45AB45 £500

124 AB43RD49TY87 £640

125 RT87RT87 £900

126 CD19RV29CD19 £1000

;

run;

data want;

     set have;

     do i=1 to length(Reaon_code) by 4;

           if i=1 then

                do;

                     declare hash h1(ordered:'a');

                     h1.definekey('i');

                     h1.definedata('code','waterfall_reason');

                     h1.definedone();

                     declare hash h2();

                     h2.definekey('code');

                     h2.definedone();

                     declare hiter hi('h1');

                end;

           Waterfall_reason=ifn(i=1,1,0);

           code=substr(Reaon_code,i,4);

           if h2.check() ne 0 then

                do;

                     rc=h2.replace();

                     rc=h1.replace();

                end;

     end;

     unique_reason=ifn(h1.num_items=1,1,0);

     do rc=hi.first() by 0 while (rc=0);

           output;

           rc=hi.next();

     end;

     drop i rc;

run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 454 views
  • 1 like
  • 3 in conversation