Bhuvaneswari
Obsidian | Level 7

Hello,

 

I'm working on an Excel file that has multiple text fields of varying lengths (some fields hold a few sentences and some hold a few paragraphs). Each record contains information and observations about a specific industry, and each field captures one predefined characteristic of that industry. I'm looking for ways to explore this dataset and identify the features specific to each type of record (each record falls into one of 3 groups, and the group is also available as one of the text fields in the dataset). I tried different ways to mine this text data and ran into several questions in the process.

 

I brought the dataset into SAS using File Import. After parsing, I noticed that only the field with the longest width is chosen as the attribute under observation and the rest are ignored (I couldn't find them in the Text Filter node). I want to include terms from the other fields as well. Merging the fields is not an option, because each field represents its own characteristic, as mentioned above. Text Topic and Text Cluster give fairly generic results, which I don't think add much value to the knowledge discovery process. How can I mine this text data so that as little information as possible is lost?

 

I can't share the data due to security restrictions, but if you have further questions about the dataset, feel free to post in the thread and I will be glad to add information. I'm a newbie in this field, so any kind of help is very much appreciated. Thank you.

1 REPLY 1
lakshmi_74
Quartz | Level 8

Please review the paper at this link. It should help you get through this.

http://analytics.ncsu.edu/sesug/2007/HW07.pdf

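For example, to read a range of cells from an open Excel workbook through DDE:

OPTIONS NOXWAIT NOXSYNC;   /* let the X command return without waiting for Excel to close */
X 'C:\EXCEL SASCONF.XLS';  /* open the workbook, as given in the paper */

/* The workbook name in brackets must match the workbook that is open in Excel */
FILENAME SASCONF DDE 'Excel|[Book1]Sheet1!R1C1:R9C4'; /* DDE EXAMPLE - Excel */

DATA SasConf;
  /* DDE separates cells with tabs; NOTAB keeps them so values with spaces stay in one field */
  INFILE SASCONF NOTAB DLM='09'x DSD MISSOVER;
  /* Character variables default to 8 bytes; add a LENGTH statement for longer text */
  INPUT ConfName $ ConfYear ConfCity $ ConfST $ ;
RUN;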
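
On the multi-field part of the question: one possible workaround, if the parsing step only picks up a single text variable, is to stack the text fields into one long variable while keeping the source field name alongside it. This is not a merge, since each field's text stays in its own row and can still be filtered or grouped by field. The sketch below is only an illustration under assumptions: the dataset names (work.obs_wide, work.obs_long) and the variable names (record_id, category, field_a, field_b, field_c) are hypothetical and would need to be replaced with the real ones.

```sas
/* Hypothetical input: work.obs_wide with one row per record and      */
/* character variables record_id, category, field_a, field_b, field_c */
data work.obs_long;
  set work.obs_wide;

  /* New variables: the source field name and its text */
  length field_name $32 field_text $32767;

  /* Group the existing text variables so they can be looped over */
  array txt {*} field_a field_b field_c;

  do i = 1 to dim(txt);
    field_name = vname(txt{i});   /* name of the variable the text came from */
    field_text = txt{i};
    if not missing(field_text) then output;   /* one row per non-empty field */
  end;

  keep record_id category field_name field_text;
run;
```

With the data in this shape, field_text could be given the Text role and field_name and category used as ordinary categorical variables, or the rows could be subset by field_name to parse each characteristic separately.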

