Desktop productivity for business analysts and programmers

Split single row into multiple rows by taking the subset of the variables

Reply
Occasional Contributor
Posts: 5

Split single row into multiple rows by taking the subset of the variables

Hi Everyone, I trying to figure out the following problem.

At the moment I have a dataset like this

Application_id                Reaon_code            Value

123                               AB31AB45                £500

124                              AB43RD49TY87        £640

125                              RT87                        £900

126                              CD19RV29               £1000

What I want to get is separate the reason_code variable by taking the subset of this, each reason only has 4 character and combine 2 letters and 2 numbers, always

The dataset I want to get is the following

Application_id             Reason_code               Value

123                             AB31                          £500

123                             AB45                          £500

124                             AB43                         £640

124                             RD49                         £640

124                            TY87                           £640

145                            RT87                          £900

Hope this make sense.

2nd question, I want to create a flag showing

Application_id             Reason_code               Value           Waterfall_reason                                                           Unique_Reason

123                             AB31                          £500                       1 (as it his AB31 first)                              0 (as it hits both AB31 and AB45)

123                             AB45                          £500                       0 (as it hits AB31 first)                             0 (as it hits both AB31 and AB45)

124                             AB43                         £640                        1 (as it hits AB43 first)                             0 (as it hits both AB43,RD49 and TY87)

124                             RD49                         £640                        0                                                            0

124                            TY87                           £640                        0                                                            0

145                            RT87                          £900                        1 (as it hits RT87 first)                              1 (as it ONLY Hit RT87)

Hope the questions make sense, I'm a new starter using SAS therefore I'm not familiar with complicated logic

Thanks for everyone in advance

Regards

Super User
Posts: 1,117

Re: Split single row into multiple rows by taking the subset of the variables

Please try

data want;

set have;

   ExpressionID = prxparse('/\w\w\d\d/');

   start = 1;

   stop = length(Reaon_code);

   call prxnext(ExpressionID, start, stop, Reaon_code, position, length);

      do while (position > 0);

         found = substr(Reaon_code, position, length);

         call prxnext(ExpressionID, start, stop, Reaon_code, position, length);

         output;

      end;

run;

data want_;

set want;

by     Application_id     ;

if first.Application_id then  Waterfall_reason =1 ;

else  Waterfall_reason =0;

if first.Application_id and last.Application_id then  Unique_Reason =1 ;

else  Unique_Reason =0;

run;

Thanks,

jag

Thanks,
Jag
Occasional Contributor
Posts: 5

Re: Split single row into multiple rows by taking the subset of the variables

Hi Jag,

Thanks very much for your quick response and your code works, really appreciated Smiley Happy

However I have another problem now, just found that the variable  Reason_code always have some duplicate results, i.e

Reason_code='RV08RV08RI34RI34RI34RU87', in this case I need to convert this variable to 'RV08RI34RU87' then using your logic.

Do you know how to get rid off the duplicate records please.

I have tried using proc sort data=want nodup;

by appplication_id reaon_code waterfall_reason unique_reason;

run;

It works but this is after your code, I want to change it at beginning. Any thoughts on this?

Thansk very much in advance

Super User
Posts: 1,117

Re: Split single row into multiple rows by taking the subset of the variables

I inserted the following highlighted code in the program to remove the duplicates without the proc sort procedure. Hope this helps you.

data want_;

set want;

by     Application_id     found notsorted;

if first.found;

if first.Application_id then  Waterfall_reason =1 ;

else  Waterfall_reason =0;

if first.Application_id and last.Application_id then  Unique_Reason =1 ;

else  Unique_Reason =0;

run;

Thanks,

Jag

Thanks,
Jag
Ask a Question
Discussion stats
  • 3 replies
  • 357 views
  • 3 likes
  • 2 in conversation