Hi all
I am trying to use PROC HPTMINE to parse some text.
I have some multi-term lists that I want to be treated as units, and this page "seems" to explain how it should be done:
There is also this page, which defines a slightly different format:
https://documentation.sas.com/doc/en/tmref/15.1/n0u9wgweoizcpqn172txiyx8m5h9.htm
This points to the SASHELP.eng_multi dataset, which does seem to follow the latter format.
However, when I run my code, it won't accept a SAS dataset. When I point SAS to a file instead, as the first link suggests, I get an error telling me that the procedure does not recognize my multiword terms, that I should check the multiword format, and that the multiword list will therefore be ignored.
I have tried to create a file like this (the headers are commented out because I tried, and failed, with and without the headers):
DATA _NULL_;
FILE "c:\Users\..\multi_word phrases.txt";
/*PUT "Term: Token_type: Role";*/
PUT "Not Recorded: 3: Noun";
PUT "not recorded: 3: Noun";
PUT "Potassium Permanganate: 3: Noun";
PUT "potassium permanganate: 3: Noun";
PUT "Oxalic Acid: 3: Noun";
PUT "oxalic acid: 3: Noun";
PUT "sodium valproate: 3: Noun";
PUT "small piece: 3: Noun";
PUT "outer covering: 3: Noun";
PUT "unknown tablet: 3: Noun";
RUN;
So how exactly should this file be created? I have tried to follow the help instructions and just can't figure it out.
I have tried with column-spaced entries, with and without colons, and any other configuration I thought might work.
Thanks
Thanks Jos
Simple as that: no spaces around the colons, so each entry ends in ":3:Noun" rather than ": 3: Noun".
It's a pity that the documentation at https://documentation.sas.com/doc/en/tmhpprcref/14.2/tmhpprcref_hptmine_sect008.htm#tmhpprcref.hptmi... shows the spaces.
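For anyone who lands here later, this is a minimal sketch of the DATA step from my original post with the spaces around the colons removed, which is the only change the fix above describes. I have left out the header line and kept just a few of the terms; whether a header is accepted, and whether separate capitalized variants are needed, are things I have not established here, so treat those choices as assumptions. The path is elided exactly as in my original post.

```sas
DATA _NULL_;
  /* Path is elided as in the original post; substitute your own location */
  FILE "c:\Users\..\multi_word phrases.txt";
  /* One entry per line: term:token_type:role, with no spaces around the colons */
  PUT "not recorded:3:Noun";
  PUT "potassium permanganate:3:Noun";
  PUT "oxalic acid:3:Noun";
  PUT "sodium valproate:3:Noun";
RUN;
```

The resulting text file is the one I point PROC HPTMINE at for the multiword term list, as described in the first documentation page.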