JacquesR
Quartz | Level 8

Hi all

 

I am trying to use PROC HPTMINE to parse some text.

I have some multi-term lists that I want to be treated as units, and this page "seems" to explain how it should be done:

https://documentation.sas.com/doc/en/tmhpprcref/14.2/tmhpprcref_hptmine_sect008.htm#tmhpprcref.hptmi...

There is also this page, which defines a slightly different format:

https://documentation.sas.com/doc/en/tmref/15.1/n0u9wgweoizcpqn172txiyx8m5h9.htm

The second page points to the SASHELP.eng_multi dataset, which does seem to follow the latter format.

However, when I run my code, the procedure won't accept a SAS dataset. When I point it to a file instead, as the first link suggests, I get an error telling me that the procedure does not recognize my multiword terms, that I should check the multiword format, and that the multiword list will therefore be ignored.

I have tried to create a file like this (the headers are commented out because I tried, and failed, with and without the headers):

DATA _NULL_;
    FILE "c:\Users\..\multi_word phrases.txt";
    /*PUT "Term: Token_type: Role";*/
    PUT "Not Recorded: 3: Noun";
    PUT "not recorded: 3: Noun";
    PUT "Potassium Permanganate: 3: Noun";
    PUT "potassium permanganate: 3: Noun";
    PUT "Oxalic Acid: 3: Noun";
    PUT "oxalic acid: 3: Noun";
    PUT "sodium valproate: 3: Noun";
    PUT "small piece: 3: Noun";
    PUT "outer covering: 3: Noun";
    PUT "unknown tablet: 3: Noun";
RUN;

So how exactly should this file be created? I have tried to follow the help instructions and just can't figure it out.

I have tried with column-spaced entries, with and without colons, and any other configuration I thought might work.

Thanks

JosvanderVelden
SAS Super FREQ
Have you seen the example here: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/tmhpprcref/tmhpprcref_hptmine_examples07.htm

Which version of miner are you using?
Best regards, Jos
JacquesR
Quartz | Level 8

Thanks Jos

It was as simple as that: there must be no spaces around the colons, i.e. the entry ends in :3:Noun

Pity that https://documentation.sas.com/doc/en/tmhpprcref/14.2/tmhpprcref_hptmine_sect008.htm#tmhpprcref.hptmi... shows the spaces.
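
For anyone landing here later, a minimal sketch of what the corrected DATA step might look like. It assumes the only change needed is removing the spaces around the colons, as described above; it has not been run against every release. The path is the elided one from the question and needs to be replaced with a real location, and the header line is left out because the question does not settle whether it is required.

```sas
/* Sketch only: same terms as in the question, with no spaces around the colons. */
/* The path below is the elided one from the question; substitute a real path.  */
DATA _NULL_;
    FILE "c:\Users\..\multi_word phrases.txt";
    PUT "Not Recorded:3:Noun";
    PUT "not recorded:3:Noun";
    PUT "Potassium Permanganate:3:Noun";
    PUT "potassium permanganate:3:Noun";
    PUT "Oxalic Acid:3:Noun";
    PUT "oxalic acid:3:Noun";
    PUT "sodium valproate:3:Noun";
    PUT "small piece:3:Noun";
    PUT "outer covering:3:Noun";
    PUT "unknown tablet:3:Noun";
RUN;
```

Each line keeps the term, the token type and the role from the original attempt; only the whitespace around the separators differs.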
