LipizinPinto
Calcite | Level 5

Hello, I need help with a question, please. Here is the scenario:

I will be performing text mining on content extracted from Facebook, and the textual data was imported into an Excel file. The thing is: this single Excel file has a column named "Posts" and 5,000 rows filled with posts and comments. Can I import the file into the "Text Import" node just the way it is and then move on to Text Parsing? Or should I have a separate Excel file for each of the 5,000 texts? If so, how can I automate generating 5,000 Excel files, each containing a single text?

Thanks in advance.

3 REPLIES
Shmuel
Garnet | Level 18
Please post a sample (a few lines) of your input
and clarify what you want to do with it and what output you expect.
LipizinPinto
Calcite | Level 5

Hello, and thank you very much for replying.

Here is the scenario: I have a folder containing only one Excel file (below) full of tweets extracted from Twitter, and I want to use it as input for my Text Mining analysis.

Capturar1.PNG

Opening this Excel file, you can see a sample of its data (below). It has many rows of textual data to be analyzed, where each row contains a different tweet, so we can think of it as N documents inside a single Excel file.

Capturar2.PNG

My question is: when using the "Text Import" node, can I import this single Excel file and expect SAS Text Miner to understand that each row is a different text (document) to be analyzed, or am I supposed to have each of these documents (rows) saved separately?

For example: suppose this Excel file has 150 rows containing different tweets. Can I import this single file with its 150 rows, or should I have 150 Excel files (one for each row)?


I hope I have expressed myself better this time!

Thanks in advance!

avp
SAS Employee

Nice elaboration of your requirement. It helped.

I think you can try the "File Import" node from the "Sample" tab:

File Import node -> Text Parsing node -> Text Filter node, etc.

In the File Import node, right-click -> "Edit Variables..." and change the Role of the "Tweets" variable to "Text".

That should run your text mining flow, taking each row of your Excel file as one individual document. (The first row, "Tweets", will be used as the variable name.)
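
To make the "one row = one document" layout concrete, here is a minimal sketch in Python, outside of SAS Enterprise Miner. It only illustrates the data layout and is not how the File Import or Text Parsing nodes work internally. The sample rows and the helper name `rows_to_documents` are hypothetical, and a small in-memory CSV stands in for the Excel sheet.

```python
import csv
import io

# Hypothetical sample standing in for the Excel sheet: a header row ("Tweets")
# followed by one tweet per row. These rows are made up for illustration.
SAMPLE = """Tweets
"Loving the new release, great work!"
"Support took two days to reply, not happy"
"Does anyone know when the next update ships?"
"""


def rows_to_documents(handle, text_column="Tweets"):
    """Return one document (doc_id + text) per non-empty data row."""
    reader = csv.DictReader(handle)  # the first row supplies the column name
    documents = []
    for row_number, row in enumerate(reader, start=1):
        text = (row.get(text_column) or "").strip()
        if text:  # skip rows where the text cell is empty
            documents.append({"doc_id": row_number, "text": text})
    return documents


documents = rows_to_documents(io.StringIO(SAMPLE))
for doc in documents:
    print(doc["doc_id"], doc["text"])
```

The point is that a single file with N data rows already represents N documents, so there is no need to split it into N separate files.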

 

Thanks.
