carlosGoetz
Calcite | Level 5

Good morning everybody.

 

I need to analyze a huge number of legal documents in order to find out which ones have certain clauses and which ones don't. I'd like to know how to proceed. I'm using Visual Text Analytics on SAS Viya 3.4, but it seems to me that it's impossible to do something like that.

 

Can you help me with this issue, please?

 

Thank you very much!

3 REPLIES
Jason7
SAS Employee

Hello Carlos - 

 

You can import many PDF files into Viya for use in VTA with the data import function:

https://go.documentation.sas.com/?docsetId=datahub&docsetTarget=p1sv89vo4n8f03n0zvq0k90i8g3t.htm&doc...

 

In VTA, you can define rules to categorize documents that contain the clauses you are looking for:

https://go.documentation.sas.com/?activeCdc=ctxtcdc&cdcId=capcdc&cdcVersion=8.4&docsetId=ctxtug&docs...

 

VTA tests your model on the imported PDF files, and you can also apply the model to new data through the scoring process described here:

https://go.documentation.sas.com/?activeCdc=ctxtcdc&cdcId=capcdc&cdcVersion=8.4&docsetId=ctxtug&docs...

 

Hope it helps!
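
As a rough illustration of the categorize-by-clause idea, here is a minimal Python sketch that runs outside VTA. It is not VTA rule syntax and does not use any SAS API. The category names, the phrases in `CLAUSE_RULES`, and the sample documents are all hypothetical, and the sketch assumes the text has already been extracted from the PDF files.

```python
import re

# Hypothetical clause rules: category name -> phrases that must ALL appear.
CLAUSE_RULES = {
    "confidentiality": ["confidential information"],
    "governing_law": ["governed by", "laws of"],
}


def normalize(text):
    """Lowercase and collapse whitespace so line breaks left over from
    PDF extraction do not split a phrase."""
    return re.sub(r"\s+", " ", text).lower()


def categorize(documents, rules=CLAUSE_RULES):
    """Return {doc_id: sorted list of categories whose phrases all occur}."""
    results = {}
    for doc_id, text in documents.items():
        clean = normalize(text)
        results[doc_id] = sorted(
            name
            for name, phrases in rules.items()
            if all(phrase in clean for phrase in phrases)
        )
    return results


# Made-up sample texts standing in for extracted document contents.
docs = {
    "a.txt": "All Confidential\nInformation shall remain secret.",
    "b.txt": "This agreement is governed by the laws of Spain.",
    "c.txt": "No relevant clause here.",
}
print(categorize(docs))
```

Plain substring matching like this is much cruder than VTA's linguistic rules, but it can serve as a quick cross-check of which documents a rule ought to hit.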

carlosGoetz
Calcite | Level 5

Thank you very much.

 

I have another question: if I only need to check whether each document in a set contains the word "Wexner", can I create a pipeline with only two nodes, Data and Categories?

 

I've created a category named Bueno with the rule (NOT,("Wexner")), but when I run the node I get the following error message:

An error occurred while running the pipeline. See the node logs for more details.

 

... and the log says:

Exception occurred while querying categories table: category document table with the specified taxonomyId not found: 4a7f7f286c4c6558016c751433ff0004

 

Can you tell me what I'm doing wrong?
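
For reference, the result I expect from the Bueno category is the list of documents that do not contain the word. A minimal Python sketch of that check, done outside VTA, would look like the following. The function name `split_by_word` and the sample texts are hypothetical, and the matching is assumed to be case-insensitive and on whole words only.

```python
import re


def split_by_word(documents, word):
    """Split document ids into (containing, not containing) the given word.

    Matching is case-insensitive and on whole words only (an assumption).
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    with_word, without_word = [], []
    for doc_id, text in documents.items():
        if pattern.search(text):
            with_word.append(doc_id)
        else:
            without_word.append(doc_id)
    return with_word, without_word


# Made-up sample texts standing in for the real documents.
docs = {
    "contract1": "Signed by Mr. Wexner in 2019.",
    "contract2": "No such party is named here.",
    "contract3": "The WEXNER foundation is a party.",
}
has_it, lacks_it = split_by_word(docs, "Wexner")
print("with:", has_it)
print("without:", lacks_it)
```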

 

Also, I've attached an image showing the matches in a document. Three out of four documents contain the word Anova, as listed in the lower part of the screen, so why do I see 0 matches on the right? What does that mean?

 

Thank you very much!

Best regards,

Carlos

carlosGoetz
Calcite | Level 5
Can anybody help me with this issue, please?
Thank you very much.

