Splitting string variable into Multiple Column

Accepted Solution Solved
Reply
Contributor
Posts: 21
Accepted Solution

Splitting string variable into Multiple Column

[ Edited ]

Hi SAS Community,

 

I have a problem with my dataset. My dataset is presented as follows:

VAR

--------

1300456#GFA1#2000-11-13#GTO#1455667

1300457#GFA2#2000-11-14#GTO#1455721

1305515#GFA10#2000-11-20#EXP#5154155

 

And what I want to achieve is:

 

VAR1        VAR2      VAR3                  VAR4       VAR5

1300456    GFA1     2000-11-13         GTO        1455667

1300457    GFA2     2000-11-14         GTO         1455721

1305515    GFA10   2000-11-20          EXP          5154155

 

 

Basically, I just want to separate/ split one variable into multiple variables. I have tried the following code:

Numerous ways to do what you want.  e.g.:

 

data want (drop=_: i);

  set have (rename=(var=_var));

  array var(5);

  i=1;

  do until (scan(_var,i,"#") eq "");

    var(i)=scan(_var,i,"#");

    i+1;

  end;

run;

 

(Source: https://communities.sas.com/t5/SAS-Procedures/Splitting-Text-into-different-variables-columns/td-p/5...

 

However, this syntax failed to take into account the string text in VAR2, the calendar date form in VAR3, and another string text in VAR4 (i.e., it captures only the numerical data). Can anyone help me with this? Greatly appreciated.


Accepted Solutions
Solution
‎06-30-2017 10:10 AM
Respected Advisor
Posts: 3,785

Re: Splitting string variable into Multiple Column

[ Edited ]

This looks more like reading from infile than spltting a column.  (Same thing I reckon).  Infile magic is a handy alternative.

 

data have;
   input var $80.;
   cards;
1300456#GFA1#2000-11-13#GTO#1455667
1300457#GFA2#2000-11-14#GTO#1455721
1305515#GFA10#2000-11-20#EXP#5154155
;;;;
   run;
proc print;
   run;
data want;
   infile cards dsd dlm='#';
   if _n_ eq 1 then input @@;
   set have;
   _infile_ = var;
   input @1 id:$7. gaf:$4. date:yymmdd. var4:$4. var5 @@;
   format date yymmdd10.;
   cards;
Necessary evil
   run;
proc print;
   run;

Capture.PNG

View solution in original post


All Replies
Super User
Posts: 18,997

Re: Splitting string variable into Multiple Column

Your array is declared as numeric. Add a $ sign after the array statement to create a character array instead that will hold string values. 

Array var(5) $12. ;
Respected Advisor
Posts: 3,156

Re: Splitting string variable into Multiple Column

In this case, you need to define all of your new variables as Char, then convert them if needed during downstrem process, try add:

 

  array var(5) $ 20;
Super User
Posts: 5,352

Re: Splitting string variable into Multiple Column

If you want two of your variables to be numeric, you can't put all the variables into the same array.  With only 5 variables, just hard-code:

 

length var1 $ 7   var2 $ 4   var4 $ 3;   /* or, possibly, select more appropriate lengths */

 

var1 = scan(var, 1, '#');

var2 = scan(var, 2, '#');

var3 = input(scan(var, 3, '#'), yymmdd10.);

format var3 yymmdd10.;

var4 = scan(var4, 4, '#');

var5 = input(scan(var, 5, '#'), 8.);

 

If VAR1 should also be numeric, that could follow a similar pattern as VAR5.

 

If it is possible that there are missing values (and thus two consecutive ##) in your data, SCAN should support an option that allows two consecutive # characters to be treated as separate delimiters.

 

Super User
Super User
Posts: 7,675

Re: Splitting string variable into Multiple Column

Why do you have data like that in one variable?  It sounds like you have imported some data?  If so why not fix the import step, i.e. use # as another delimiter, and this removes the need for the additional step?

Super User
Posts: 9,856

Re: Splitting string variable into Multiple Column

[ Edited ]
data have;
input var $80.;
cards;
1300456#GFA1#2000-11-13#GTO#1455667
1300457#GFA2#2000-11-14#GTO#1455721
1305515#GFA10#2000-11-20#EXP#5154155
;
run;
data temp;
set have;
n+1;
do i=1 to countw(var,'#');
v=scan(var,i,'#','m');output;
end;
drop i;
run;
proc transpose data=temp out=want(drop=_: n) prefix=var;
by n;
var v;
run;
Solution
‎06-30-2017 10:10 AM
Respected Advisor
Posts: 3,785

Re: Splitting string variable into Multiple Column

[ Edited ]

This looks more like reading from infile than spltting a column.  (Same thing I reckon).  Infile magic is a handy alternative.

 

data have;
   input var $80.;
   cards;
1300456#GFA1#2000-11-13#GTO#1455667
1300457#GFA2#2000-11-14#GTO#1455721
1305515#GFA10#2000-11-20#EXP#5154155
;;;;
   run;
proc print;
   run;
data want;
   infile cards dsd dlm='#';
   if _n_ eq 1 then input @@;
   set have;
   _infile_ = var;
   input @1 id:$7. gaf:$4. date:yymmdd. var4:$4. var5 @@;
   format date yymmdd10.;
   cards;
Necessary evil
   run;
proc print;
   run;

Capture.PNG

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 7620 views
  • 3 likes
  • 7 in conversation