SAS93
Quartz | Level 8

I'm working with qualitative data, where a variable COMMENT represents a string of text written in by a study participant. Some comments aren't in full sentences, but some have several sentences. 

 

I've figured out how to split the string at any period it finds into a separate row & retain an ID value for that (so I can manually match up the comment "pieces" to the original full comment if needed):

 

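	/* Fragment from inside a DATA step that reads COMMENT and ID */
	do i=1 by 1 while(scan(COMMENT,i,'.')^=' ');
		new=scan(COMMENT,i,'.');  /* i-th period-delimited piece */
		retain ID;
		output;                   /* one row per piece */
	end;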

While looking through my first run of this code, I realized there is a problem: some comments will include things like "Mr." or a decimal like "12.1". 

 

What's the easiest way to add a layer to this code that will split the string only at the end of an actual sentence (delimited by a period)?

1 REPLY 1
ChrisHemedinger
Community Manager

To reliably tokenize a body of text into sentences, you need a tool that understands how to break your data into its discrete concepts. SAS Text Analytics has this ability -- see this article for details. These tools can also help with concept categorization and sentiment.

 

I'm assuming you don't have access to that, so you would have to code for the most common special cases yourself, such as the "Mr." and "12.1" cases you mentioned: have your code recognize these patterns and not break the sentence at those periods. The list of exceptions could get pretty large if your text is diverse.
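
As a rough illustration of coding for special cases, here is a minimal sketch that masks the periods you want to keep before splitting, then restores them afterward. The data set names (comments, sentences), the sample rows, the short abbreviation list, and the "~" placeholder are all assumptions made for the example; the placeholder must be a character that never appears in your real comments.

```sas
/* Hypothetical sample data: one row per participant comment */
data comments;
   length ID 8 COMMENT $ 200;
   infile datalines dsd dlm='|' truncover;
   input ID COMMENT $;
   datalines;
1|Mr. Smith was helpful. The visit cost 12.1 dollars. Would return
2|Too long a wait
;
run;

data sentences;
   set comments;
   length protected new $ 200;
   /* Assumption: "~" never occurs in COMMENT, so it is safe as a placeholder */
   /* Mask the period after a few titles (illustrative list, extend as needed) */
   protected = prxchange('s/\b(Mr|Mrs|Ms|Dr)\./$1~/i', -1, COMMENT);
   /* Mask decimal points, i.e. a period sitting between two digits */
   protected = prxchange('s/(\d)\.(\d)/$1~$2/', -1, protected);
   /* Split on the periods that remain, as in the original loop */
   do i = 1 by 1 while (scan(protected, i, '.') ne ' ');
      /* Put the masked periods back and trim surrounding blanks */
      new = strip(translate(scan(protected, i, '.'), '.', '~'));
      output;
   end;
   drop protected i;
run;
```

With the sample rows above, the first comment should split into three pieces, with "Mr." and "12.1" left intact. This only handles the patterns it is told about; abbreviations that can also end a sentence (such as "etc.") would need their own rules.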
