Parse text String into different Variables

I have a variable (around 250 characters long) containing text. I want to break this text into 4 different variables, each containing around 60 characters. Because I'll be printing this text on my report, I want to break it exactly between words (for example, look for the first space between characters 55 and 65, break the text there, put the first part into Variable1, and do the same for the rest of the text). I tried to do this with the INDEX and SCAN functions and failed.

Any help is greatly appreciated.

Eddie
Super Contributor
Posts: 3,174

Re: Parse text String into different Variables

Posted in reply to deleted_user
Yes, the INDEX and SUBSTR functions can identify the offset of the previous or next blank, so you can assign your new SAS variables from there. Share the code you have already tried so forum subscribers can comment on it.
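
For illustration, here is a minimal, untested sketch of that idea. It assumes the 55-65 window from the original question; the names s, rest, piece, start, cut and n are made up for the example, and the hard cut at 65 when no blank falls inside the window is only one possible fallback.

```sas
data _null_;
  length s rest $250 piece $65;
  s = 'I have a variable (around 250 char long) containing text. I wanted to break this text into 4 different variables say each containing around 60 characters.';
  start = 1;                               /* where the current piece begins in S */
  do n = 1 to 10 while (start <= length(s));
    rest = substr(s, start);               /* text not yet assigned to a piece */
    if length(rest) <= 65 then
      cut = length(rest);                  /* last piece: take everything left */
    else do;
      /* offset within REST of the first blank at or after position 55 */
      cut = 55 + index(substr(rest, 55), ' ') - 1;
      /* no blank found, or blank beyond the window: hard cut at 65 (assumption) */
      if cut < 55 or cut > 65 then cut = 65;
    end;
    piece = substr(rest, 1, cut);          /* piece ends on the blank when one was found */
    put n= piece=;
    start = start + cut;                   /* continue right after the cut */
  end;
run;
```

In a real program the pieces would be stored in an array of output variables instead of being written to the log with PUT.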

Scott Barry
SBBWorks, Inc.
SAS Employee
Posts: 174

Re: Parse text String into different Variables

Posted in reply to deleted_user
This is just a quick hack. It has not been tested for all cases, but perhaps you can use it as a starting point for your own code.
[pre]
data longsmall(drop=txt t i text x);
  length txt t $60;
  array smalltxt(6) $60;   /* receives the pieces: smalltxt1-smalltxt6 */
  text="I have a variable (around 250 char long) containing text. I wanted to break this text into 4 different variables say each containing around 60 characters. As I ll be printing this text on my report I wanted to break the text exactly between words";
  do until (t=" ");        /* SCAN returns a blank once the words run out */
    i+1;
    t = scan(text,i," ");  /* i-th blank-delimited word */
    if length(catx(" ",txt,t)) < 50 then
      txt=catx(" ",txt,t); /* word still fits: append it */
    else
      do;                  /* piece is full: store it and start the next one with t */
        x+1;
        put x= txt;
        smalltxt(x)=txt;
        txt = t;
      end;
  end;
  x+1;                     /* store the final, partly filled piece */
  put x= txt;
  smalltxt(x)=txt;
run;
[/pre]

Message was edited by: Geniz

Super Contributor
Posts: 474

Re: Parse text String into different Variables

Posted in reply to deleted_user
You could try this also:

data _null_;

S='AAAA AAAA AAAA AAA AAAAA AAAAAA, AAAAA AAAA AAAAAAAAAAAAA BBB BBBBB BBBBB BBBBB BBBBBB BBBBB BBBBBBB BBBBBBBBBBBB CCCCCCC CCCCC CCCCCCC CCCCCCC CCCCC CCCC CCCCCCCCCC DDDD DDDDD DDDD DDDDDDD DDDDD DDDDDDDD DDDD DDDDD DDDDDD DDDDD DDDD';

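I1=1*55+index(substr(S,55),' ')-1; /* position of the 1st blank at or after 55 */
I2=2*55+index(substr(S,2*55),' ')-1; /* position of the 1st blank at or after 2x55 */
I3=3*55+index(substr(S,3*55),' ')-1; /* position of the 1st blank at or after 3x55 */
VAR1=substr(S,1,I1-1); /* text before the 1st split blank */
VAR2=substr(S,I1+1,I2-I1-1); /* start after the blank so the piece has no leading blank */
VAR3=substr(S,I2+1,I3-I2-1);
VAR4=substr(S,I3+1); /* everything after the 3rd split blank */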

/* result */
put VAR1=;
put VAR2=;
put VAR3=;
put VAR4=;
run;

Greetings from Portugal.

Daniel Santos at www.cgd.pt