d6k5d3
Pyrite | Level 9

Hello experts!

 

Is there any way to get SAS to recognize what a verb is and extract it for me? I have rows of economic news headlines, and I need to extract the ones about speeches, talks, briefings, conferences, etc., without missing any observations. Being able to identify the verbs in the strings seems like an easy way to solve this. For example, the rows of news would be:

 

News

 

Paulson Hosts Conference on Capital Markets in Washington
Treasury Under Secretary Steel Briefing on Capital Markets
Treasury's Kimmitt Talks About Trade, the Economy in Berlin
Fed's Open Market Committee Meets on Interest Rates, Economy
Fed's Bernanke Speaks at Washington Credit Risk Conference
POSTPONED: Fed's Kohn Testifies at Hearing on ILCs
Fed's Kroszner Speaks on Credit Markets in North Carolina
Fed's Lacker Moderates Panel on Liquidity Risk at Conference
Fed's Plosser Speaks to N.J. Bankers in West Palm Beach
Fed's Geithner Speaks at Credit Markets Symposium
Fed's Lacker Gives Introductory Remarks at Conference
Fed's Mishkin Speaks on Inflation Dynamics in San Francisco
Fed's Pianalto Speaks in Prague on Currencies
Fed's Braunstein Testifies Before House Subcommittee
Fed's Moskow Speaks in Shanghai on U.S. Monetary Policy
Treasury's Paulson Testifies to House Panel on Budget Request
Bernanke Testifies Before Joint Economic Committee
Treasury's Paulson Testifies to Senate Panel on Budget Request
POSTPONED: Fed's Kohn Testifies on Industrial Loan Companies
Fed's Plosser Gives Opening Remarks at Washington Conference
Bernanke Speaks at Fed Community Development Conference
U.S. Fed's Williams Moderates Panel on Financial Stability
Fed's Fisher Speaks in Austin on U.S. Economy
Fed's Mishkin Speaks at Bridgewater College in Virginia
Fed's Fisher Speaks on Topic to Be Determined in McAllen
Fed's Plosser Speaks on Federal Reserve in Delaware
Fed's Lacker Speaks to Economists in North Carolina
Bernanke Speaks on Market Discipline, Regulation in New York
Fed's Moskow Speaks on Economic Outlook in Illinois
Treasury's Adams Holds Press Conference Ahead of G-7 Meeting
Fed's Fisher Speaks in Houston on Globalization
Paulson Holds a Press Conference After G-7 Meets in Washington

… … … … …

 

How can I do this efficiently?

 

Much thanks!

 

Regards.

4 REPLIES
Reeza
Super User

Are you using SAS EM with Text Analysis, or Viya with Text Analysis? If not, do you have a list of 'verbs' that you're planning to identify, or do you mean verbs in general?

 

It's possible. I usually parse the sentence into individual words and link each word to a lookup table. You may need to find the stem of each word first, i.e. running becomes run.

This is an example of how I did it for finding sentiment, but the concept is similar. I have vague recollections that once upon a time SAS had either a Dictionary or Spelling PROC that had this, but not sure of the current status quo in SAS Base. 

 

*Create sample data;
data random_sentences;
    infile cards truncover;
    informat sentence $256.;
    input sentence $256.;
    cards;
This is a random sentence
This is another random sentence
Happy Birthday
My job sucks.
This is a good idea, not.
This is an awesome idea!
How are you today?
Does this make sense?
Have a great day!
;

*Partition into words;
data f1;
    set random_sentences;
    id=_n_;
    nwords=countw(sentence);
    nchar=length(compress(sentence));

    do word_order=1 to nwords;
        word=scan(sentence, word_order);
        output;
    end;
run;

*Add happiness index and part of speech (pos);
*ta.sentiment and ta.corpus are assumed to be existing lookup tables with one row per word;

proc sql;
    create table scored as 
    select a.*, b.happiness_rank, c.pos, c.pos1
    from f1 as a 
    left join ta.sentiment as b 
    on a.word=b.word 
    left join ta.corpus as c
    on a.word=c.word
    order by a.id, a.word_order;
quit;

*Calculate sentence happiness score, one row per sentence;
proc sql;
    create table sentence_sentiment as
    select id, sentence, sum(happiness_rank) as sentiment
    from scored
    group by id, sentence;
quit;
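
The same split-and-lookup idea can be adapted to the headlines in the question. What follows is only a rough sketch: the verb_list table is a hypothetical, hand-built list (SAS is not deciding what a verb is here), and the dataset and variable names are made up for illustration.

```sas
/* Hypothetical lookup table of event verbs; extend it to suit your data */
data verb_list;
    infile datalines truncover;
    input verb :$20.;
    datalines;
hosts
talks
meets
speaks
testifies
moderates
gives
holds
;

/* A few headlines copied from the question */
data news;
    infile datalines truncover;
    input headline $100.;
    datalines;
Paulson Hosts Conference on Capital Markets in Washington
Fed's Bernanke Speaks at Washington Credit Risk Conference
Treasury Under Secretary Steel Briefing on Capital Markets
;

/* One row per word, keeping the headline id and the word position */
data news_words;
    set news;
    id = _n_;
    length word $40;
    do word_order = 1 to countw(headline, ' ');
        /* lower-case so the match does not depend on capitalization */
        word = lowcase(scan(headline, word_order, ' '));
        output;
    end;
run;

/* Keep only the words that appear in the verb list */
proc sql;
    create table news_verbs as
    select a.id, a.headline, a.word_order, a.word as verb
    from news_words as a
    inner join verb_list as b
    on a.word = b.verb
    order by a.id, a.word_order;
quit;
```

The third headline ("... Steel Briefing on Capital Markets") contains no word from the list, so it drops out of the inner join. If you can't afford to miss observations, switching to a left join and reviewing the headlines with a missing verb is one way to find the words the list still needs.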

 




 

d6k5d3
Pyrite | Level 9
This seems to be a spell checker. I have no issues with spelling. I need SAS to recognize which words are verbs. Googling PROC SPELL doesn't give much insight.
SASKiwi
PROC Star

It is worth noting that SAS is not a word processor. As far as I'm aware SAS does not contain a grammatical rule engine which would enable recognising word types.
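
Even without a grammar engine, the filtering task in the question can be approximated with pattern matching on a fixed set of keywords. A minimal sketch, assuming a hand-picked keyword pattern (the pattern, dataset names, and variable names below are illustrative, not a complete list):

```sas
/* A few headlines copied from the question */
data news;
    infile datalines truncover;
    input headline $100.;
    datalines;
Paulson Hosts Conference on Capital Markets in Washington
Treasury Under Secretary Steel Briefing on Capital Markets
Fed's Open Market Committee Meets on Interest Rates, Economy
;

data flagged;
    set news;
    length keyword $20;
    retain re;
    /* Compile the (hypothetical) keyword pattern once; /i ignores case */
    if _n_ = 1 then
        re = prxparse('/\b(speaks?|talks?|testifies|testify|briefings?|conferences?|hosts?|moderates?|meets?)\b/i');
    pos = prxmatch(re, headline);      /* 0 when nothing matches */
    is_event = (pos > 0);
    /* Capture buffer 1 holds the first keyword found in the headline */
    if is_event then keyword = prxposn(re, 1, headline);
    drop re pos;
run;
```

This flags a headline when any listed keyword appears as a whole word, so nouns such as "Briefing" are caught too. It still depends entirely on the keyword list, so rows with is_event = 0 are worth reviewing by hand.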
