Hello experts!
Is there any way to have SAS recognize which word in a string is a verb, and extract that verb for me? I have rows of economic news, and I need to extract the news items about speeches, talks, briefings, conferences, etc., without missing any observation. Being able to identify the verbs in the strings seems like an easy way to solve this. For example, the rows of news would be:
News
Paulson Hosts Conference on Capital Markets in Washington
Treasury Under Secretary Steel Briefing on Capital Markets
Treasury's Kimmitt Talks About Trade, the Economy in Berlin
Fed's Open Market Committee Meets on Interest Rates, Economy
Fed's Bernanke Speaks at Washington Credit Risk Conference
POSTPONED: Fed's Kohn Testifies at Hearing on ILCs
Fed's Kroszner Speaks on Credit Markets in North Carolina
Fed's Lacker Moderates Panel on Liquidity Risk at Conference
Fed's Plosser Speaks to N.J. Bankers in West Palm Beach
Fed's Geithner Speaks at Credit Markets Symposium
Fed's Lacker Gives Introductory Remarks at Conference
Fed's Mishkin Speaks on Inflation Dynamics in San Francisco
Fed's Pianalto Speaks in Prague on Currencies
Fed's Braunstein Testifies Before House Subcommittee
Fed's Moskow Speaks in Shanghai on U.S. Monetary Policy
Treasury's Paulson Testifies to House Panel on Budget Request
Bernanke Testifies Before Joint Economic Committee
Treasury's Paulson Testifies to Senate Panel on Budget Request
POSTPONED: Fed's Kohn Testifies on Industrial Loan Companies
Fed's Plosser Gives Opening Remarks at Washington Conference
Bernanke Speaks at Fed Community Development Conference
U.S. Fed's Williams Moderates Panel on Financial Stability
Fed's Fisher Speaks in Austin on U.S. Economy
Fed's Mishkin Speaks at Bridgewater College in Virginia
Fed's Fisher Speaks on Topic to Be Determined in McAllen
Fed's Plosser Speaks on Federal Reserve in Delaware
Fed's Lacker Speaks to Economists in North Carolina
Bernanke Speaks on Market Discipline, Regulation in New York
Fed's Moskow Speaks on Economic Outlook in Illinois
Treasury's Adams Holds Press Conference Ahead of G-7 Meeting
Fed's Fisher Speaks in Houston on Globalization
Paulson Holds a Press Conference After G-7 Meets in Washington
… … … … …
What would be an efficient way to do this?
Much thanks!
Regards.
Are you using SAS EM with Text Analysis, or Viya with Text Analysis? If not, do you have a list of 'verbs' that you're planning to identify, or do you mean verbs in general?
It's possible. I usually parse the sentence into individual words and link each word to a lookup table. You may need to find the stem of each word first, i.e. running becomes run.
This is an example of how I did it for finding sentiment, but the concept is similar. I have vague recollections that once upon a time SAS had either a Dictionary or Spelling PROC that had this, but I'm not sure whether it still exists in Base SAS.
*Create sample data;
data random_sentences;
    infile cards truncover;
    informat sentence $256.;
    input sentence $256.;
    cards;
This is a random sentence
This is another random sentence
Happy Birthday
My job sucks.
This is a good idea, not.
This is an awesome idea!
How are you today?
Does this make sense?
Have a great day!
;
*Partition into words, one output row per word;
data f1;
    set random_sentences;
    id=_n_;
    nwords=countw(sentence);
    nchar=length(compress(sentence));
    do word_order=1 to nwords;
        word=scan(sentence, word_order);
        output;
    end;
run;
*Add happiness index and pos;
*ta.sentiment and ta.corpus are word-level lookup tables (not shown here);
proc sql;
    create table scored as
    select a.*, b.happiness_rank, c.pos, c.pos1
    from f1 as a
    left join ta.sentiment as b
        on a.word=b.word
    left join ta.corpus as c
        on a.word=c.word
    order by sentence, word_order;
quit;
*Calculate sentence happiness score, one row per sentence;
proc sql;
    create table sentence_sentiment as
    select id, sentence, sum(happiness_rank) as sentiment
    from scored
    group by id, sentence;
quit;
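The same split-and-lookup idea could be adapted to the news headlines roughly as follows. This is only a sketch, not a tested solution: the verb_list table, its stem column, and the data set names are hypothetical and built by hand for illustration, and only a few headlines from the question are included.

```sas
*Hypothetical hand-built lookup table mapping each verb form to a base form;
data verb_list;
    infile cards truncover;
    input word :$32. stem :$32.;
    cards;
hosts host
briefing brief
talks talk
meets meet
speaks speak
testifies testify
moderates moderate
gives give
holds hold
;

*A few of the headlines from the question;
data news;
    infile cards truncover;
    input headline $256.;
    id=_n_;
    cards;
Paulson Hosts Conference on Capital Markets in Washington
Treasury Under Secretary Steel Briefing on Capital Markets
Fed's Bernanke Speaks at Washington Credit Risk Conference
POSTPONED: Fed's Kohn Testifies at Hearing on ILCs
;

*One row per word, lower-cased and stripped of punctuation before the join;
data news_words;
    set news;
    length word $32;
    do word_order=1 to countw(headline, ' ');
        word=lowcase(compress(scan(headline, word_order, ' '), , 'kad'));
        output;
    end;
run;

proc sql;
    *Headlines that contain a word from the verb list;
    create table news_verbs as
    select a.id, a.headline, a.word_order, b.stem as verb
    from news_words as a
    inner join verb_list as b
        on a.word=b.word
    order by a.id, a.word_order;

    *Headlines with no match, kept for manual review so nothing is missed;
    create table unmatched as
    select id, headline
    from news
    where id not in (select id from news_verbs);
quit;
```

A plain lookup cannot tell whether a word such as "Talks" is being used as a verb or a noun, so the unmatched table and a spot check of the matches are still worth reviewing by hand.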
It is worth noting that SAS is not a word processor. As far as I'm aware, SAS does not contain a grammatical rule engine that would enable it to recognise word types.
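Without a grammar engine, one possible workaround is to flag rows by keyword pattern rather than by word type. The sketch below assumes a hand-picked keyword list and a hypothetical headlines data set. The last input row is made up to show a non-matching case, and the pattern would need extending to cover the real data.

```sas
*A few headlines from the question plus one made-up non-event row;
data headlines;
    infile cards truncover;
    input headline $256.;
    cards;
Paulson Hosts Conference on Capital Markets in Washington
Treasury Under Secretary Steel Briefing on Capital Markets
Fed's Kroszner Speaks on Credit Markets in North Carolina
U.S. Retail Sales Rise More Than Forecast
;

data flagged;
    set headlines;
    *is_event is 1 when any keyword matches as a whole word, ignoring case;
    is_event = prxmatch('/\b(speaks|talks|testifies|briefing|conference|remarks|hosts|moderates|meets|holds)\b/i', headline) > 0;
run;
```

This matches on spelling only, so a keyword used in another sense (for example "holds" in a headline about interest rates) would also be flagged and needs a manual check.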