BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
iscgonzalez
Obsidian | Level 7

Hello guys,

 

I'm kinda stuck here, hoping someone can give me some light...

 

I have a column with a string that represents status on transactions on a certain period (monthly), something like this:

 

data test;
input status $50.;
datalines;
02010103
01020301
;
run;

Each period is two characters lenght, then I have to split the string onto multiple columns, and repeat those columns 4 times so I have weeks instead of months) something like this:

 

w1 w2 w3 w4 w5 w6.....

--------------------------------

02  02 02  02 01  01 ...

01  01 01  01 02  02 ....

 

This is causing me a headache, because I can think in a way to split, but no way to make columns names be in order. Any idea?

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
data test;
input status $50.;
datalines;
02010103
01020301
;
run;


data want;
    set test;
    *CREATE ARRAY, 25 WEEKS BASED ON LENGTH 25 CHARS*4 = 100 MAX;
    array wk(*) $2. wk1-wk100;

    *determine the number of iterations;
    n_iter = length(status)/2;

    do i=1 to n_iter;
    
        *take the substr desired;
        Value = substr(status, (i-1)*2 + 1, 2);

        *assign to wk, could also write a small do loop here but not super efficient;
        wk((I-1)*4+1)=value;
        wk((I-1)*4+2) = value;
        wk((I-1)*4+3) = value;
        wk((I-1)*4+4) = value;


    end;
run;

View solution in original post

11 REPLIES 11
Reeza
Super User

What's the exact output you'd get from the two entries you have below?

 


@iscgonzalez wrote:

Hello guys,

 

I'm kinda stuck here, hoping someone can give me some light...

 

I have a column with a string that represents status on transactions on a certain period (monthly), something like this:

 

data test;
input status $50.;
datalines;
02010103
01020301
;
run;

Each period is two characters lenght, then I have to split the string onto multiple columns, and repeat those columns 4 times so I have weeks instead of months) something like this:

 

w1 w2 w3 w4 w5 w6.....

--------------------------------

02  02 02  02 01  01 ...

01  01 01  01 02  02 ....

 

This is causing me a headache, because I can think in a way to split, but no way to make columns names be in order. Any idea?

 


 

iscgonzalez
Obsidian | Level 7

I'm expecting something like this

 

w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13 w14 w15 w16
02 02 02 02 01 01 01 01 01 01 01 01 03 03 03 03
01 01 01 01 02 02 02 02 03 03 03 03 01 01 01 01

 

 

 

Reeza
Super User

So each value is always repeated 4 times?

Are you reading this data from a text file or are you starting from a SAS data set of sorts?


@iscgonzalez wrote:

I'm expecting something like this

 

w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11 w12 w13 w14 w15 w16
02 02 02 02 01 01 01 01 01 01 01 01 03 03 03 03
01 01 01 01 02 02 02 02 03 03 03 03 01 01 01 01

 

 

 


 

iscgonzalez
Obsidian | Level 7

Yes, each value has to be repeated 4 times. And I'm reading from a existing dataset.

In this case not every record has the same size, but all of them have even number of characters, so each period is represented by 2 characters. I think that is not he difficult part, because I can calculate the number of characters, divide them by 2 and get the number of periods and in that way try to limit the loop iterations.

 

The main objective would be to repeat each value (2 characters) 4 times.

novinosrin
Tourmaline | Level 20

i have answered below. Please take a look. Thank you!

novinosrin
Tourmaline | Level 20

something like this.....i'm afraid doesn't help. 

1. does all your string in every record have a length of 8 chars or perhaps the length an even number and are equal?

2. if yes, are you expecting 16 columns for each record?

3. are you reading a sas dataset or external file?

 

Reeza
Super User
data test;
input status $50.;
datalines;
02010103
01020301
;
run;


data want;
    set test;
    *CREATE ARRAY, 25 WEEKS BASED ON LENGTH 25 CHARS*4 = 100 MAX;
    array wk(*) $2. wk1-wk100;

    *determine the number of iterations;
    n_iter = length(status)/2;

    do i=1 to n_iter;
    
        *take the substr desired;
        Value = substr(status, (i-1)*2 + 1, 2);

        *assign to wk, could also write a small do loop here but not super efficient;
        wk((I-1)*4+1)=value;
        wk((I-1)*4+2) = value;
        wk((I-1)*4+3) = value;
        wk((I-1)*4+4) = value;


    end;
run;
iscgonzalez
Obsidian | Level 7

Thank you very much!!! That was exactly what I needed

novinosrin
Tourmaline | Level 20

will this help?

data test;
input status $50.;
datalines;
02010103
01020301
;
run;

data _have;
set test;
do _n_=1 to length(status) by 2;
 do j=1 to 4;
temp=substr(status, _n_, 2);
output;
end;
end;
keep status  temp;
run;

proc transpose data=_have out=want(drop=_name_);
by status notsorted;
var temp;
run;
ballardw
Super User

If those values relate to time in any way then you need to consider very much how your "week" is defined and the relationship to "month".

 

For instance Jan 2018 has 5 Mondays. So should that be 5 weeks or 4? If you are using Sundays as a week start then July 2018 has 5 Sundays.

 

Also does the length of status vary? This becomes a code maintenance problem if you process data today with a length of 8 and 4 periods( 02 01 01 03 for the first record) and in the future you have additional data for more periods ( 02 01 01 03 01 02 for example) your later data set would have more variables requiring you to modify all of your code to process though additional variables.

 

The length of your status variable implies that you might have as many as 25 of these two digit periods. You might find it easier in the long run to generate a "long" data set with a week indicator in a separate variable. (which would also solve the "how to generate names for 25*4 variables).

 

Something like this perhaps:

data test;
input status $50.;
/* assumes whatever the 02 and such represent are always 2 characters*/
periods= length(status)/2;
week=0;
length code $ 2.;
do i=1 to periods;
   code = substr(status,2*i-1,2);
   do j=1 to 4;
      week+1;
      output;
   end;
end;
drop periods i j;
datalines;
02010103
01020301
;
run;
Ksharp
Super User
data test;
input status $50.;
datalines;
02010103
01020301
;
run;

data temp;
 set test;
 n+1;
 temp=prxchange('s/(\d\d)/$1 /',-1,status);
 do i=1 to countw(temp);
  value=scan(temp,i);
  do j=1 to 4;
    output;
  end;
 end;
 keep n value;
run;
proc transpose data=temp out=want prefix=w;
 by n;
 var value;
run;
proc print noobs;run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 2609 views
  • 3 likes
  • 5 in conversation