This topic is solved and locked.
JacquesR
Quartz | Level 8

Hi all

 

I am trying to use PROC HPTMINE to parse some text.

I have some multi-term lists that I want to be treated as units, and this page "seems" to explain how it should be done:

https://documentation.sas.com/doc/en/tmhpprcref/14.2/tmhpprcref_hptmine_sect008.htm#tmhpprcref.hptmi...

There is also this page, which defines a slightly different format:

https://documentation.sas.com/doc/en/tmref/15.1/n0u9wgweoizcpqn172txiyx8m5h9.htm

This points to the SASHELP.eng_multi dataset, which does seem to follow the latter format.

However, when I run my code, the procedure won't accept a SAS dataset. When I point SAS to a file instead, as the first link suggests, I get an error telling me that the procedure does not recognize my multiword terms, that I should check the multiword format, and that the multiword list will therefore be ignored.

I have tried to create a file like this (the headers are commented out because I tried, and failed, with and without the headers):

DATA _NULL_;
    FILE "c:\Users\..\multi_word phrases.txt";
    /*PUT "Term: Token_type: Role";*/
    PUT "Not Recorded: 3: Noun";
    PUT "not recorded: 3: Noun";
    PUT "Potassium Permanganate: 3: Noun";
    PUT "potassium permanganate: 3: Noun";
    PUT "Oxalic Acid: 3: Noun";
    PUT "oxalic acid: 3: Noun";
    PUT "sodium valproate: 3: Noun";
    PUT "small piece: 3: Noun";
    PUT "outer covering: 3: Noun";
    PUT "unknown tablet: 3: Noun";
RUN;

So how exactly should this file be created? I have tried to follow the help instructions and just can't figure it out.

I have tried with column-spaced entries, with and without colons, and any other configuration I thought might work.

Thanks

JosvanderVelden
SAS Super FREQ
Have you seen the example here: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/tmhpprcref/tmhpprcref_hptmine_examples07.htm

Which version of miner are you using?
Best regards, Jos
JacquesR
Quartz | Level 8

Thanks Jos

Simple as that. There must be no spaces around the colons, so each line has to look like Term:3:Noun

Pity that https://documentation.sas.com/doc/en/tmhpprcref/14.2/tmhpprcref_hptmine_sect008.htm#tmhpprcref.hptmi... shows the spaces.
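
For anyone landing here later, here is a minimal sketch of the corrected file, using the term list from the question with the spaces around the colons removed. The output location (the WORK library's folder), the file name multi_word_phrases.txt, the tiny work.docs dataset, and its doc_id and text variables are all assumptions made for illustration. The PROC HPTMINE call assumes the MULTITERM= option of the PARSE statement as described in the linked documentation, so check the options against your own release.

```sas
/* Assumption: write the list into the WORK library's folder.         */
/* Replace with your own path.                                        */
%let multifile = %sysfunc(pathname(work))/multi_word_phrases.txt;

data _null_;
    file "&multifile";
    /* One entry per line, Term:Token_type:Role, no header line and   */
    /* no spaces around the colons.                                   */
    put "Not Recorded:3:Noun";
    put "not recorded:3:Noun";
    put "Potassium Permanganate:3:Noun";
    put "potassium permanganate:3:Noun";
    put "Oxalic Acid:3:Noun";
    put "oxalic acid:3:Noun";
    put "sodium valproate:3:Noun";
    put "small piece:3:Noun";
    put "outer covering:3:Noun";
    put "unknown tablet:3:Noun";
run;

/* Hypothetical input data, only to make the sketch self-contained.   */
data work.docs;
    length text $200;
    doc_id = 1; text = "Swallowed potassium permanganate and an unknown tablet."; output;
    doc_id = 2; text = "Ingestion of oxalic acid, dose not recorded."; output;
run;

/* Point the PARSE statement at the text file, not at a SAS dataset.  */
proc hptmine data=work.docs;
    doc_id doc_id;
    variables text;
    parse multiterm="&multifile" outterms=work.terms;
run;
```

If the list is read correctly, the multiword-format message should no longer appear in the log, and the listed phrases should show up as single terms in work.terms.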
