Valentin_HU
Calcite | Level 5
Hey,

I have the following problem. My data looks like this:

ident | text
1 | sentence1. sentence2. sentence3. (...)sentenceN.
2 | sentence1. sentence2. sentence3. (...)sentenceM.

I want to transform it into:

ident | text
1 | sentence1.
1 | sentence2.
1 | sentence3.
(...)
1 | sentenceN.
2 | sentence1.
2 | sentence2.
2 | sentence3.
(...)
2 | sentenceM.

In other words, each observation of the original variable text contains several sentences delimited by ".". I want to create a separate observation for each sentence. Do you have an idea how to do this?

Thanks,

Valentin
3 REPLIES
SPR
Quartz | Level 8
Hello Valentin,

This is a solution:
[pre]
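/* Sample data: sentences in text are delimited by "." */
data i;
input ident 1-1 text $ 3-34;
datalines;
1 sentence1. sentence2. sentence3.
2 sentence5. sentence6. sentence7.
;
run;
/* Write one observation per sentence */
data r (rename=(t=text));
set i;
length t $100;
i=0; /* restart the sentence counter for every input row */
do until (t="");
i+1;
t=left(scan(text,i,".")); /* i-th piece; SCAN drops the "." itself */
if t ne "" then output;
end;
keep ident t;
run;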
[/pre]
Sincerely,
SPR
Valentin_HU
Calcite | Level 5
Thanks a lot!
Ksharp
Super User
[pre]
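/* Sample data: sentences in text are delimited by "." */
data i;
input ident 1-1 text $ 3-34;
datalines;
1 sentence1. sentence2. sentence3.
2 sentence5. sentence6. sentence7.
;
run;
/* Output one row per sentence; SCAN returns blank past the last one */
data temp;
set i;
i=1;
_text=left(scan(text,1,'.'));
do while(not missing(_text));
output;
i+1;
_text=left(scan(text,i,'.'));
end;
drop text i;
run;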
[/pre]




Ksharp
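
Both replies above drop the "." because SCAN does not return the delimiter, while the requested output keeps it. A minimal sketch of a variation that puts the period back is shown here. It assumes the dataset i from the replies above; the names want, sentence and k are made up for illustration. It also assumes "." never appears inside a sentence (abbreviations or decimals such as "3.5" would be split).

```sas
/* Sketch only: one output row per sentence, trailing "." restored */
data want (keep=ident sentence rename=(sentence=text));
  set i;                                /* dataset i from the replies above */
  length sentence $100;
  do k = 1 to countw(text, '.');        /* number of "."-delimited pieces */
    sentence = strip(scan(text, k, '.'));
    if not missing(sentence) then do;   /* skip a blank piece after the last "." */
      sentence = cats(sentence, '.');   /* SCAN removed the delimiter, add it back */
      output;
    end;
  end;
run;
```

Using COUNTW to fix the loop bound avoids calling SCAN once more just to detect the end of the string. The blank check is still needed because trailing blanks after the final "." can be counted as an extra piece.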