Changing variable format

Frequent Contributor
Posts: 128

drs 
a-b-c 
c-d-e-f

I have the following variable: drs. I want to create a new variable for each component of that variable. Here is the output I am looking for:

drs1 drs2 drs3 drs4
a    b    c
c    d    e    f

My next step, once I have worked on the data above, is to transform the modified variables back into the original format:

drs1 drs2 drs3 drs4
a    b    d
c    d    e    e

Desired output:

drs
a-b-d
c-d-e-e

All Replies
Super User
Posts: 19,855

Re: Changing variable format

Posted in reply to lillymaginta

Is there a typo in your last two examples (the last e should be an f)?

 

Can you make any assumptions about the length of the variable? Is it always two components? Do you always have 3 in the first and 4 in the second, or is it dynamic?

 

I think it would be better if you posted a slightly larger sample that better illustrates the issues with your data. Otherwise, I suspect someone will provide an answer, you'll respond with a variation, and it becomes an inefficient cycle.

 

 

Frequent Contributor
Posts: 128

Re: Changing variable format

Reeza,

Thank you for the reply. I changed the f to an e to show that I had worked on the data, so it no longer matches the original. The variable can have up to 13 components: a-b-c-d-e-a-b-c-d-e-a-b-c

drs 
a-b-c-d
a-b-e-e
a-d-e-a-c-e
a-b-c-d-e-f-g-e-a
a-a
a-b 
a-b-c-d-e-f 

I just want to be able to convert it to a single variable for each component and then be able to create the original variable again.

Solution
‎12-06-2016 09:14 AM
Super User
Posts: 19,855

Re: Changing variable format

Posted in reply to lillymaginta

This uses several functions that you should know how to use:

 

SCAN - allows you to access each item in the list

COUNTW - counts the # of components

CATX - allows you to recombine the components, using a hyphen as a separator.

 

These are common functions that you should spend some time learning. They're very useful.
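To see what each function returns on its own, here is a minimal sketch on a literal string. The dataset name demo and the variable names are only for illustration. It also shows that CATX leaves out blank arguments, which is what lets the recombine step work when some slots are empty.

```sas
data demo;
  length part $8 rebuilt skipped $40;
  drs = "a-b-c";

  n       = countw(drs, "-");            /* 3 components                    */
  part    = scan(drs, 2, "-");           /* second component: b             */
  rebuilt = catx("-", "a", "b", "d");    /* a-b-d                           */
  skipped = catx("-", "a", " ", "d");    /* blank argument is skipped: a-d  */

  put n= part= rebuilt= skipped=;
run;
```

The PUT statement writes the four values to the log so you can check them without opening the dataset.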

 

If you know you have a max of 13 components, this is relatively straightforward:

 

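data have;
    /* $26. leaves room for 13 single-character components and 12 hyphens */
    input drs $26.;
    cards;
a-b-c-d
a-b-e-e
a-d-e-a-c-e
a-b-c-d-e-f-g-e-a
a-a
a-b
a-b-c-d-e-f
;
run;

data want;
    set have;

    /* one character variable per component, up to the known max of 13 */
    array dr(13) $ dr1-dr13;

    /* number of hyphen-delimited components in this row */
    n_drs = countw(drs, "-");

    /* split: copy the i-th component into dr(i) */
    do i = 1 to n_drs;
        dr(i) = scan(drs, i, "-");
    end;

    /* recombine: CATX skips the blank, unused array elements */
    combined = catx("-", of dr(*));

    drop i;
run;
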
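If the maximum is not known in advance, one option is to compute it first and size the array from that. The following is only a sketch under that assumption: the macro variable max_n and the f-to-e edit are hypothetical, the edit standing in for whatever work is done on the split variables before drs is rebuilt.

```sas
/* Small sample in the same shape as the thread's data */
data have;
  input drs :$26.;
  datalines;
a-b-c
c-d-e-f
;

/* Largest number of hyphen-delimited components in any row */
proc sql noprint;
  select max(countw(drs, "-")) into :max_n trimmed
  from have;
quit;

data want;
  set have;
  array dr(&max_n) $ dr1-dr&max_n;

  /* split drs into dr1, dr2, ... */
  do i = 1 to countw(drs, "-");
    dr(i) = scan(drs, i, "-");
  end;

  /* hypothetical edit on the split variables: f becomes e */
  do i = 1 to dim(dr);
    if dr(i) = "f" then dr(i) = "e";
  end;

  /* rebuild the original variable from the edited components */
  drs = catx("-", of dr(*));
  drop i;
run;
```

Note that the array elements default to a length of 8 here, so longer components would need an explicit length on the ARRAY statement, and drs must be long enough to hold the rebuilt string.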
Frequent Contributor
Posts: 128

Re: Changing variable format

Thank you Reeza for the solution and suggestions! 

Super User
Posts: 7,977

Re: Changing variable format

Posted in reply to lillymaginta

Your post is unclear to me.  If you want each hyphen-separated block of characters written out to its own observation, then:

data want;
  set have;
  length res $200;
  do i=1 to countw(drs,"-");
    res=scan(drs,i,"-");
    output;
  end;
run;

I don't see the reason for splitting it all up and then putting it back together.
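If the one-observation-per-component layout is used for the editing step, the original string can still be rebuilt afterwards. A minimal sketch, assuming a row counter is added to identify each source row; the names row, long and rebuilt are hypothetical.

```sas
data have;
  input drs :$26.;
  datalines;
a-b-c
c-d-e-f
;

/* one observation per component, tagged with its source row */
data long;
  set have;
  length res $200;
  row = _n_;
  do i = 1 to countw(drs, "-");
    res = scan(drs, i, "-");
    output;
  end;
  keep row i res;
run;

/* collapse back to one observation per source row */
data rebuilt;
  length drs $200;
  retain drs;
  set long;
  by row;
  if first.row then drs = " ";   /* start a new string for each row */
  drs = catx("-", drs, res);     /* append the next component       */
  if last.row then output;
  keep row drs;
run;
```

The BY statement relies on long already being in row order, which holds here because the rows are written in sequence.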
