Split up text in one line to several lines

Hey,

I have the following problem: my data looks like:

ident | text
1 | sentence1. sentence2. sentence3. (...)sentenceN.
2 | sentence1. sentence2. sentence3. (...)sentenceM.

I want to transform it into:

ident | text
1 | sentence1.
1 | sentence2.
1 | sentence3.
(...)
1 | sentenceN.
2 | sentence1.
2 | sentence2.
2 | sentence3.
(...)
2 | sentenceM.

In each observation, the variable text holds a number of sentences delimited by ".". I want to create a separate observation for each sentence. Do you have an idea how to do this?

Thanks,

Valentin

Hello Valentin,

This is a solution:
[pre]
data i;
input ident 1-1 text $ 3-34;
datalines;
1 sentence1. sentence2. sentence3.
2 sentence5. sentence6. sentence7.
run;
data r (rename=(t=text));
  set i;
  length t $100;
  i=0; /* restart the sentence counter for every observation */
  do until (t="");
    i+1;
    t=left(scan(text,i,".")); /* i-th sentence, leading blank removed */
    if t ne "" then output;
  end;
  keep ident t;
run;
[/pre]
Sincerely,
SPR

Thanks a lot!

[pre]
data i;
input ident 1-1 text $ 3-34;
datalines;
1 sentence1. sentence2. sentence3.
2 sentence5. sentence6. sentence7.
run;
data temp;
set i;
i=1;
_text=scan(text,1,'.');
do while(not missing(_text));
output;
i+1;
_text=scan(text,i,'.');
end;
drop text i;
run;
[/pre]




Ksharp
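
Both replies above return the sentences without the trailing "." that the requested output shows, because SCAN drops the delimiter. The following is only a sketch of one way to put it back, not either poster's method. The names want, sentence and n are made up for illustration, and it assumes every "." in text ends a sentence, so abbreviations or decimal numbers would be split as well.

[pre]
data i;
  input ident 1-1 text $ 3-34;
  datalines;
1 sentence1. sentence2. sentence3.
2 sentence5. sentence6. sentence7.
;
run;

data want (keep=ident sentence);
  set i;
  length sentence $100;
  /* COUNTW counts the pieces separated by "." */
  do n = 1 to countw(text, '.');
    /* STRIP removes the blank that follows each period */
    sentence = strip(scan(text, n, '.'));
    /* skip the blank piece left after the final period */
    if not missing(sentence) then do;
      sentence = cats(sentence, '.'); /* re-attach the delimiter */
      output;
    end;
  end;
run;
[/pre]

Unlike the DO UNTIL and DO WHILE loops above, the iterative DO loop does not stop at the first blank piece, so a row containing ".." would still yield the sentences that follow it.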