Solved
Contributor
Posts: 35

# splitting variable into multiple variables

I have following Data

data a;

input x \$ 30.;

cards;

28,16c2b8c4l8n

28,16cb8c4l8n

28,cbc4l24n

;

run;

here comma(,)=no activity c means contact, l =lunch, b=break, n=non contact,

I need result like this;

no activity contact break lunch nonactivity

7              6           0.5      1       2.

7              6            0.25    1       2

7              6.5         0.25    1        0

can anyone solve this??

Accepted Solutions
Solution
‎07-15-2015 07:01 PM
Super Contributor
Posts: 275

## Re: splitting variable into multiple variables

data a;

input x \$ 30.;

cards;

28,16c2b8c4l8n

28,16cb8c4l8n

28,cbc4l24n

;

run;

data want;

set a;

_x=translate(prxchange('s/(?<=\D)([a-z])/1\$1/i',-1,x),' ','cbln');

activity=scan(_x,1)/4;

contract=(scan(_x,2)+scan(_x,4))/4;

break=(scan(_x,3))/4;

lunch=(scan(_x,5))/4;

noactivity=(scan(_x,6))/4;

drop x _x;

run;

All Replies
Super User
Posts: 9,599

## Re: splitting variable into multiple variables

Sorry, please clarify.  How are the output variables derived, I don't see 7 commas for instance.

Frequent Contributor
Posts: 78

## Re: splitting variable into multiple variables

Hi,

I find the question a little unclear too, because the desired results don't seem to match the data in the rows you provided.  Nonetheless, something like this might work:

data a;

input x \$ 30.;

cards;

28,16c2b8c4l8n

28,16cb8c4l8n

28,cbc4l24n

;

run;

data b (drop = position letter temp i);

set a;

original_record = _n_;

do i = 1 to 10;  /* guessing max here, can use other types of loops or get out of loop once temp is empty */

if i = 1 then temp = x;

call missing(no_activity, contact, lunch, break, nonactivitiy);

position = findc(temp,',clbn');  /* find location of , c l b n */

*put temp= position=;

if position ne 0 then

do;

letter = substr(temp, position, 1);  /* which letter is it? */

*put letter=;                                                              /* output 0 if no number before the letter */

/* should correct for automatic conversion */

if letter = ','      then do; if position = 1 then no_activity = 0; else no_activity = substr(temp, 1, position-1); output; end;

else if letter = 'c' then do; if position = 1 then contact = 0; else contact = substr(temp, 1, position-1);        output; end;

else if letter = 'l' then do; if position = 1 then lunch = 0; else lunch = substr(temp, 1, position-1);          output; end;

else if letter = 'b' then do; if position = 1 then break = 0; else break = substr(temp, 1, position-1);          output; end;

else if letter = 'n' then do; if position = 1 then nonactivity = 0; else nonactivity = substr(temp, 1, position-1);    output; end;

temp = substr(temp, position+1);  /*  re-assign temp as the rest of the string */

end;

end;

run;

proc sort data = b; by original_record; run;

data c (keep = original_record total ;

set b;

by original_record;

retain  total_no_activity

total_contact

total_lunch

total_break

total_nonactivity ;

if first.original_record then

do;

total_no_activity = 0;     /* set to zero at the beginning of each group */

total_contact = 0;

total_lunch = 0;

total_break = 0;

total_nonactivity = 0;

end;

if not missing(no_activity) then total_no_activity = total_no_activity + no_activity;  /* add them up */

if not missing(contact) then total_contact = total_contact + contact;

if not missing(lunch) then total_lunch = total_lunch + lunch;

if not missing(break) then total_break = total_break + break;

if not missing(nonactivity) then total_nonactivity = total_nonactivity + nonactivity;

if last.original_record then output;  /* only keep totals */

run;

proc print data = c; run;

Output:

original_    total_no_     total_    total_    total_       total_

Obs      record      activity    contact     lunch     break    nonactivity

1         1            28          24         4         2            8

2         2            28          24         4         0            8

3         3            28           0         4         0           24

Contributor
Posts: 35

## Re: splitting variable into multiple variables

Hi

Sorry I dint give full information.

here 28,16c2b8c4l8n means 28 (commas)no activity, 16contacts, 2breaks, 8contacts, 4 lunch, 8 noncontact

this is per hour.  so first we need  like this 28 (commas)no activity,16contacts+8contacts, 2 breaks, 4lunch, 8noncontact

finally i want it for 15 minutes. so it comes like this 7  6  0.25  1 2. like this we need to do for  remaingn observations also.

Thank you

Solution
‎07-15-2015 07:01 PM
Super Contributor
Posts: 275

## Re: splitting variable into multiple variables

data a;

input x \$ 30.;

cards;

28,16c2b8c4l8n

28,16cb8c4l8n

28,cbc4l24n

;

run;

data want;

set a;

_x=translate(prxchange('s/(?<=\D)([a-z])/1\$1/i',-1,x),' ','cbln');

activity=scan(_x,1)/4;

contract=(scan(_x,2)+scan(_x,4))/4;

break=(scan(_x,3))/4;

lunch=(scan(_x,5))/4;

noactivity=(scan(_x,6))/4;

drop x _x;

run;

Contributor
Posts: 35

## Re: splitting variable into multiple variables

Hi

if we keep 24c instead of 24n in third obsevation i mean if data is like this

data a;

input x \$ 30.;

cards;

28,16c2b8c4l8n

28,16cb8c4l8n

28,cbc4l24c

;

run;

how can we solve?

Contributor
Posts: 35

## Re: splitting variable into multiple variables

excellent Slchen. Thank you so much

Contributor
Posts: 35

## Re: splitting variable into multiple variables

data a;

input x \$ 30.;

cards;

28,16c2b8c4l8n

28,16cb8c4l8n

28,cbc4l24c

;

run;

if we keep c instead of n in third "obsevation" then how can we solve?

thank you

🔒 This topic is solved and locked.