Hi everyone,
I have a data set:
word count
lakshmi 10
ads 6
market 5
the 4
to 2
laks 2
is 2
what 2
and 1
help 1
book 1
How can I eliminate articles, prepositions and pronouns from the above data set, using Base SAS?
Please help me.
You want to scan this dataset and remove records where the
variable "word" is an article, preposition or pronoun?
If the list isn't very long, this is a simple way to handle your example data:
data have;
input word $ count;
datalines;
lakshmi 10
ads 6
market 5
the 4
to 2
laks 2
is 2
what 2
and 1
help 1
book 1
;
run;
data want;
set have;
if upcase(word) in ('THE','A','AN','I','HE','SHE','WE','IT','THEM','TO','AND') then delete;
run;
I use upcase because data could have "The" and "the". Add words as desired. If the list gets real long then creating a data set of unwanted words and a Proc SQL approach might be better.
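A minimal sketch of that data set plus Proc SQL idea, assuming the `have` data set from above. The data set name `unwanted` and the handful of words in it are only illustrative, not a complete list:

```sas
/* Illustrative list of unwanted words; extend as needed. */
data unwanted;
    input word $;
    word = upcase(word);   /* store in upper case for matching */
datalines;
the
a
an
to
and
;
run;

/* Keep only rows whose word is not in the unwanted list. */
proc sql;
    create table want as
    select *
    from have
    where upcase(word) not in (select word from unwanted);
quit;
```

With the example data this should drop the rows for "the", "to" and "and". Adding a word then only means adding a line to `unwanted`, rather than editing the IN list in the data step.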
Look up Natural Language Processing (NLP) and see if you can find a list of words that would be considered articles, prepositions or pronouns. Then it's a simple SQL query to remove them.
Here's a list of 'stop words':
http://jmlr.org/papers/volume5/lewis04a/a11-smart-stop-list/english.stop
Thanks for all your great help.
Hi Reeza sir, the above link gives a list of words. How can I move to the next step? Please help.
Well, for starters it's not sir.
Read in the list from the link <- RemoveList.
Create your word list <- WordList
Remove words from WordList via SQL step:
proc sql;
create table WordList2 as
select *
from WordList
where word NOT IN (select word from RemoveList)
order by word;
quit;
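For the first two steps, here is one possible sketch, not the only way to do it. It assumes the linked file holds one word per line, that your SAS session is allowed to reach the URL, and that your counts are in a data set called `have` as in the earlier post. The fileref `stoplist` and the length of 32 are arbitrary choices:

```sas
/* Step 1: read the stop words from the link into RemoveList. */
filename stoplist url
    "http://jmlr.org/papers/volume5/lewis04a/a11-smart-stop-list/english.stop";

data RemoveList;
    infile stoplist truncover;
    length word $ 32;
    input word $32.;
    word = lowcase(strip(word));   /* normalise case and spacing */
    if word ne '';                 /* skip blank lines */
run;

filename stoplist clear;

/* Step 2: build WordList in lower case so "The" and "the" both match. */
data WordList;
    set have;
    word = lowcase(word);
run;
```

If the URL cannot be reached from your session, saving the file locally and pointing the `infile` statement at that path should work the same way. After these two steps the SQL step above can be run as written.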