06-18-2015 08:42 AM
I have a data set.
How can I eliminate articles, prepositions, and pronouns from the above data set, so that it is left without those words?
I am using Base SAS.
Please help me.
06-19-2015 03:26 PM
You want to scan this dataset and remove records where the
variable “word” is an article, preposition, or pronoun?
06-19-2015 03:54 PM
If the list isn't real long this is a simple way for your example data:
input word $ count;
if upcase(word) in ('THE','A','AN','I','HE','SHE','WE','IT','THEM','TO','AND') then delete;
I use upcase because data could have "The" and "the". Add words as desired. If the list gets real long then creating a data set of unwanted words and a Proc SQL approach might be better.
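For reference, here is a minimal self-contained sketch of that data step. The data set names (have, want) and the sample rows are made up for illustration, since the original data isn't shown here; only the IF statement comes from the post above.

```sas
/* Hypothetical sample data: these words and counts are invented for illustration. */
data have;
   input word $ count;
   datalines;
The 12
cat 3
sat 2
on 5
it 4
mat 1
;
run;

/* Drop any row whose word, compared in upper case, is in the exclusion list. */
data want;
   set have;
   if upcase(word) in ('THE','A','AN','I','HE','SHE','WE','IT','THEM','TO','AND') then delete;
run;

proc print data=want noobs;
run;
```

With these sample rows, "The" and "it" are removed. "on" stays because it is not in the list, which shows why the list has to be extended by hand for every preposition you care about.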
06-19-2015 04:47 PM
Look up Natural Language Processing (NLP) and see if you can find a list of words that would be considered articles, prepositions, and pronouns. Then it's a simple SQL query to remove them.
Here's a list of 'stop words'
06-22-2015 03:21 AM
Well, for starters it's not sir.
Read in the list from the link <- RemoveList.
Create your word list <- WordList
Remove words from WordList via SQL step:
proc sql;
create table WordList2 as
select * from WordList
where word NOT IN (select word from RemoveList)
order by word;
quit;
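A fuller sketch of those three steps, in case it helps. The rows in RemoveList and WordList below are placeholders I made up; in practice RemoveList would be read from whatever stop-word list you download. I have also wrapped both sides in upcase(), which is my own addition so that "The" matches "the".

```sas
/* Hypothetical stop-word list; replace with the list read from your source. */
data RemoveList;
   length word $ 20;
   input word $;
   datalines;
the
a
an
it
on
;
run;

/* Hypothetical word counts, invented for illustration. */
data WordList;
   length word $ 20;
   input word $ count;
   datalines;
The 12
cat 3
sat 2
on 5
it 4
mat 1
;
run;

/* Keep only words that do not appear in RemoveList, ignoring case. */
proc sql;
   create table WordList2 as
   select *
   from WordList
   where upcase(word) not in (select upcase(word) from RemoveList)
   order by word;
quit;
```

With these placeholder rows, WordList2 ends up with cat, mat and sat. Keeping the unwanted words in a data set means you can grow the list without touching the query.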