BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PGStats
Opal | Level 21

Please post your new idea as a new discussion...

PG

PG
Zachary
Obsidian | Level 7

Sorry - did not see your response there...

Zachary
Obsidian | Level 7

I am very close to a final solution:. I think I can take care of it with a lot of labels:

proc sql;
  create table scoresL as
  select CLAIMNO, COMMENTTEXT, Text4Matching, NumberForLabels
    from
    SRS_Comments500 inner join
    TextFromExcel on indexw(COMMENTTEXT, Text4Matching)>0
  order by CLAIMNO, COMMENTTEXT, Text4Matching;
quit;
proc print data=scoresL noobs; run;

proc transpose data=scoresL out=FlaggedCommentsT(drop=_:) prefix=flag_;
  by CLAIMNO;
  id NumberForLabels;
run;
proc print data=FlaggedCommentsT noobs; run;

I just need to create a routine to label each of the numbered fields.

But, I am getting lots of errors involved with this:

The ID value, "flag_6" occurs twice in the same BY group.

I cannot figure what is happening here. Any suggestions?

gergely_batho
SAS Employee

Have you tried to use the tables that  are created by the Text Mining Nodes? Those create exactly the same tables (usually in long format, not transposed).

Also, if you have ideas, I would suggest posting into:  "Text and Content Analytics" forum. (I wonder what text miners think about measuring pairwise correlations and selecting the best ~100 variables for predictive modeling.)

If you have technical questions (like this one), this forum is adequate.

Zachary
Obsidian | Level 7

Thank you Gergely. Yes, I have tried using the tables created using the Text Mining Nodes.

The unfortunate part about the Tex and Content Analytics forum is that there is not usually a lot of traffic in there.

But I promise to keep my questions within this forum on the technical side. Thank you so much!

Zachary
Obsidian | Level 7

Below is the final code that accomplishes what I needed. I have a few comments interspersed throughout here, but please let me know if you have any further questions or concerns.

And thank you to everyone who chimed in here:

proc import out = TextFromExcel datafile = "\\SAS01\sasusers\Shared\PredictiveModeling\PredictiveModelingCPM\TextToSAS.xlsx" dbms=xlsx replace;
  sheet="Sheet1";
  getnames=yes;
run;

data TextFromExcel;
  set TextFromExcel;
  Text = lowcase(Text);
  rename Text = Text4Matching;
run;

* Duplicates were removed in Excel;

libname Comments "\\SAS01\sasusers\Shared\PredictiveModeling\PredictiveModelingCPM\CommentsAnalysis";

* Long version, instead of wide;
proc sql;
  create table scoresL as
  select CLAIMNO, COMMENTTEXT, Text4Matching, NumberForLabels
    from
    SRS_Comments500 inner join
    TextFromExcel on indexw(COMMENTTEXT, Text4Matching)>0
  order by CLAIMNO, COMMENTTEXT, Text4Matching;
quit;
proc print data=scoresL noobs; run;

* Removing situations where a particular code duplicates across comments - Be sure to also remove the comment so it is not an issue now either;
proc sort data = scoresL (drop = COMMENTTEXT Text4Matching) nodupkey;
  by CLAIMNO NumberForLabels;
run;

* Getting a column of 1s in here so that when the data is transposed it has the pattern of 1s for each CLAIMNO;
data scoresL;
  set scoresL;
  Dichotomous_Counter = 1;
run;

* But I need to reorganize this to be able to spread it out to be wide now;
proc transpose data=scoresL out=FlaggedCommentsT prefix=flag_ name=Transposed_Column;
  by CLAIMNO;
  id NumberForLabels;
run;
proc print data=FlaggedCommentsT noobs; run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 3502 views
  • 4 likes
  • 7 in conversation