SAS Visual Analytics provides a pre-built list of words that you would probably want your text analytics to ignore. Actually, there are two lists--one for English and one for German! A stop list filters noise out of your analysis by ignoring irrelevant or commonly used words, such as "a", "and", and "the".
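To make the idea concrete, here is a minimal Python sketch of stop-word filtering before counting terms, the kind of counts a word cloud is built from. It is not how SAS Visual Analytics implements the feature; the `STOP_WORDS` set, the `term_counts` function, and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical three-word stop list for illustration only;
# the stop lists that ship with the product are far larger.
STOP_WORDS = {"a", "and", "the"}

def term_counts(text, stop_words=STOP_WORDS):
    """Count the words in text, skipping any word found in stop_words."""
    # Lowercase first so that "The" and "the" are treated as the same word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)

counts = term_counts("The cat and the dog chased a ball and the cat won.")
print(counts.most_common(3))
```

Without the filter, "the" would be the most frequent term in that sentence. With it, the counts reflect only the words that carry meaning.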
Before you can use a stop list, you must load it into memory. The data builder provides an option designed especially for that purpose. To load one of the stop lists in the data builder, select Tool-->Load Text Analytics Stop List.
You can choose which list to load (English or German), and you can specify the metadata registration location and the LASR library.
A table named ENGSTOPL or GRMSTOPS is registered in the location and library that you specify. SAS Visual Analytics supports one stop list for each SAS LASR Analytic Server. You load the stop list (which is a table) to memory by performing the previous steps. If more than one library is registered for SAS LASR Analytic Server, you can use any of them. If you load a stop list more than once or use more than one library, the server uses the last stop list that was loaded to memory. Once you load a stop list, you will see the list in the LASR Tables tab in the administrator. Here are the first 20 words of the English stop list displayed in a list table in the designer. The entire table contains 509 unique words.
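The "last one loaded wins" behavior can be pictured with a toy model. This is only a sketch of the rule as described above, not the server's actual implementation; the `StopListServer` class and the library names `LibraryA` and `LibraryB` are hypothetical.

```python
class StopListServer:
    """Toy stand-in for one server that keeps a single active stop list."""

    def __init__(self):
        # (library, table) of the most recently loaded stop list, if any.
        self.active = None

    def load_stop_list(self, library, table):
        # A later load replaces whatever was active, whichever library it used.
        self.active = (library, table)

server = StopListServer()
server.load_stop_list("LibraryA", "ENGSTOPL")
server.load_stop_list("LibraryB", "GRMSTOPS")
active = server.active
print(active)
```

In this model, loading the German list into a second library leaves only that list in effect, even though the English table was loaded first.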
A site may be tempted to modify the stop list by adding its own values, or even to replace the stop list with its own custom list of values. That may work, but it's important to let your site know that custom stop lists are not formally supported as of the 7.3 release of SAS Visual Analytics. Once one of the provided stop lists is loaded, you are ready to use it in your word clouds and text analytics.