Text analytics and Generative Artificial Intelligence (GenAI) in clinical trial protocol processes

So what did we do?

Researchers often encounter difficulties in comprehending and integrating information from disparate protocols, hindering the progress of scientific inquiry.

This use case aims to create a Unified Study Definitions Model (USDM) by translating diverse human-readable protocols into a standardized format.

The envisaged solution combines the versatility of Excel spreadsheets and the efficiency of Natural Language Processing (NLP) techniques to bridge the gap between heterogeneous study designs.

The primary objective of this use case is to enhance the clarity, accessibility, and interoperability of research protocols.

This should lead to the development of a standardized, machine-readable representation: the USDM.

The following steps are used in the use case:

Contextual information was extracted from a couple of clinical trial protocols (early and late phase), and the relevant information was stored in the right fields of the USDM (Excel) workbook. The approach also had to scale to handle various protocols.
SAS Natural Language Processing (NLP) techniques and Large Language Models (LLMs) were employed to automatically translate human-readable study protocols into a structured format.
- This involves extracting key information such as study objectives, methodologies, inclusion/exclusion criteria, and outcomes.
- A robust approach has been developed, combining natural language processing (NLP), text analytics and a large language model (LLM) to handle clinical trial protocols.
Whenever LLMs are involved, it's essential to discuss cost, security, and privacy.
However, as a life sciences advisor, the primary concern is:
- "How can I trust a generative LLM to deliver reliable, non-hallucinated results?
- How can I establish guardrails to ensure this?
- How do I address the security and privacy of my data?"
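As a rough illustration of the translation step described above (human-readable protocol text into structured fields), here is a minimal Python sketch. It is not the SAS NLP or LLM pipeline used in this work: the protocol excerpt, the `extract_criteria` function, and the field names are all hypothetical, and plain string and regular-expression matching stands in for the real extraction.

```python
import re

# Hypothetical protocol excerpt; real protocols are far less regular.
PROTOCOL_TEXT = """\
5.1 Inclusion Criteria
1. Adults aged 18 to 65 years.
2. Confirmed diagnosis at screening.
5.2 Exclusion Criteria
1. Pregnant or breastfeeding.
2. Participation in another trial within 30 days.
"""


def extract_criteria(text: str) -> dict[str, list[str]]:
    """Collect numbered items that follow an inclusion/exclusion heading."""
    record: dict[str, list[str]] = {
        "inclusion_criteria": [],
        "exclusion_criteria": [],
    }
    current = None  # field that numbered items are currently assigned to
    for line in text.splitlines():
        stripped = line.strip()
        lowered = stripped.lower()
        if "inclusion criteria" in lowered:
            current = "inclusion_criteria"
        elif "exclusion criteria" in lowered:
            current = "exclusion_criteria"
        else:
            # Numbered list item such as "1. Adults aged ..."
            item = re.match(r"^\d+\.\s+(.*)$", stripped)
            if item and current:
                record[current].append(item.group(1))
    return record


record = extract_criteria(PROTOCOL_TEXT)
for field, items in record.items():
    print(field, items)
```

Each key of `record` plays the role of one target field in the workbook. Writing the values into the actual USDM Excel sheets is left out of the sketch.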

Evaluating GenAI and Text Analytics for creating a USDM

Some initial results after pre-processing highlight the extraction of inclusion and exclusion criteria by LITI rules and LLMs.

There are several challenges to overcome, such as overfitting in LITI rules and the need for extensive pre-processing.

The use of various Python packages for splitting protocols into chunks and extracting tables and ima...
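The Python packages used here are not named, so the following is only a dependency-free sketch of what splitting a protocol into chunks can look like. The `chunk_words` function, the word-based window, and the default sizes are assumptions made for illustration. Table and image extraction is not covered.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, size=4, overlap=1):
    print(chunk)
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the price of some duplicated text.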

Preliminary Outcomes:

The use of LLMs like LLAMA-V2 and RAG-for-LLMs proved beneficial, though resource-intensive.
Leveraging LITI rules as pre-filters and for confidence scoring also seems beneficial: pre-filtering the data reduces costs and leads to quicker, more accurate results.
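The pre-filtering idea can be sketched in a few lines of Python. This is not LITI syntax and not the SAS implementation: a regular expression stands in for a LITI concept rule, and the pattern, the `prefilter` function, and the sample chunks are hypothetical.

```python
import re

# Hypothetical stand-in for a LITI concept rule: a plain regular expression.
CRITERIA_PATTERN = re.compile(
    r"\b(inclusion|exclusion)\s+criteri(a|on)\b", re.IGNORECASE
)


def prefilter(chunks: list[str], pattern: re.Pattern = CRITERIA_PATTERN) -> list[str]:
    """Keep only the chunks that the rule matches; only these go to the LLM."""
    return [chunk for chunk in chunks if pattern.search(chunk)]


chunks = [
    "1. Synopsis and study rationale.",
    "5.1 Inclusion Criteria: adults aged 18 to 65 years.",
    "9. Statistical considerations and sample size.",
    "5.2 Exclusion criteria: pregnant or breastfeeding.",
]
selected = prefilter(chunks)
reduction = 1 - len(selected) / len(chunks)
print(f"{len(selected)} of {len(chunks)} chunks kept ({reduction:.0%} filtered out)")
```

Only `selected` would be placed in the LLM prompt, which is where the cost and speed gains come from.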

A bit more on LITI rules for confidence scoring:

Confidence Scoring in SAS Viya VTA – LITI

The same corpus used for information extraction with the LLM is scored against the LITI rule to verify the presence of the information.

A confidence score, ranging from 0 to 1, is calculated based on the number of relevant terms matched by the LITI rule.

Higher scores indicate better extraction quality, while lower scores suggest hallucinations or inaccuracies.

This method provides a robust metric to evaluate different LLM models or prompts. (5 steps to improve information extraction using trustworthy generative AI - The SAS Data Science Blo...)
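The exact formula is not given here, so the following Python sketch makes an assumption: the score is the share of relevant terms in an LLM extraction that are also found in the source corpus. Simple token matching stands in for the LITI rule, and the stopword list, function names, and example texts are hypothetical.

```python
import re

# Small illustrative stopword list; a real rule would encode domain knowledge.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "or", "in", "at", "with"}


def relevant_terms(text: str) -> set[str]:
    """Lower-cased alphanumeric tokens, minus stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def confidence_score(extraction: str, corpus: str) -> float:
    """Share of the extraction's relevant terms that occur in the corpus (0 to 1)."""
    terms = relevant_terms(extraction)
    if not terms:
        return 0.0
    return len(terms & relevant_terms(corpus)) / len(terms)


corpus = "Inclusion criteria: adults aged 18 to 65 years with a confirmed diagnosis."
grounded = confidence_score("Adults aged 18 to 65 years", corpus)
partial = confidence_score("Adults aged 18 to 75 years", corpus)
unsupported = confidence_score("Children under 12 with asthma", corpus)
print(grounded, partial, unsupported)
```

An extraction whose terms all appear in the source scores 1, while one with no support in the source scores 0, which is the signal used to flag possible hallucinations.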

Benefits of integrating SAS NLP and GenAI for creating a USDM

Avoiding hallucinations: The NLP and text analytics pre-filtering process assimilates the most relevant source data from various documents, ensuring the outputs are more accurate and reliable.
Enhancing time to value: By pre-filtering the data, a smaller LLM can handle GenAI tasks more efficiently, leading to quicker results. Providing more focused context to an LLM significantly enhances output quality, especially for weaker models.
Ensuring privacy and security: A local vector database can be used when fine-tuning generative models. Only the relevant embeddings are then passed to the LLMs, via APIs or localized instances of the LLM, which protects the privacy and security of sensitive data.
Reducing costs: Text analytics and NLP significantly reduce the amount of information sent to the LLMs. In some cases, only 1–5% of the overall data is used for answers. This eliminates the need for excessive external API calls and reduces the computational resources required for localized LLMs. Processing documents with GPT-4 involves significant costs: with GPT-4 input priced at $30 per million tokens, processing 1,000 documents of 3,500 to 8,000 tokens each costs between $105 and $240 for input alone. The output cost is often 2-3 times higher per token than the input, so these costs can add up quickly.
Traceability: SAS® Viya® enables end-to-end verification and traceability of results, helping users to verify information and trace it back to the statements from which the summaries were derived, potentially thousands of statements. This traceability feature enhances transparency and trust in the generated outputs.
Establishing guardrails: Guardrails control the information that is sent to the LLMs and apply quality checks to the output. Adopting a “trust but verify” approach ensures that LLMs’ extractions, which can impact downstream tasks, are checked and validated to prevent unchecked errors.
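The input-cost figures in the cost item above can be checked with a short calculation. The sketch below uses only the numbers quoted in the text ($30 per million input tokens, 1,000 documents of 3,500 to 8,000 tokens each). The function name and the 5% pre-filter share applied at the end are illustrative assumptions.

```python
def input_cost_usd(tokens_per_doc: int, n_docs: int,
                   price_per_million: float = 30.0) -> float:
    """Input-token cost in USD for n_docs documents of tokens_per_doc tokens."""
    return tokens_per_doc * n_docs * price_per_million / 1_000_000


low = input_cost_usd(3_500, 1_000)    # 3.5 million tokens
high = input_cost_usd(8_000, 1_000)   # 8 million tokens
print(f"Input cost for 1,000 documents: ${low:.2f} to ${high:.2f}")

# Assumption for illustration: pre-filtering passes 5% of the tokens to the LLM.
filtered_low = low * 0.05
filtered_high = high * 0.05
print(f"With 5% of tokens sent: ${filtered_low:.2f} to ${filtered_high:.2f}")
```

Output tokens are billed separately and are not included in this estimate.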

So what? - A few open-ended conclusions

The work highlights the potential of advanced data management techniques in transforming clinical trial protocols.
By embracing innovative technologies and collaborative efforts, the industry can achieve greater efficiency and accuracy in clinical research.
These examples merely hint at the vast potential unlocked when combining the precision of linguistic methods in SAS NLP with LLMs.
These techniques not only tackle quality issues in text data but also integrate subject matter expertise, granting organizations significant control over their corpora.
In some instances, the corpus size for fine-tuning can be reduced by up to 90%.
By curating higher-quality data for fine-tuning, you can achieve more accurate responses from LLMs, reduce the occurrence of hallucinations, and establish a method for validating responses.

References:

*At the 2024 Europe CDISC+TMF Interchange and SAS Innovate 2024, Jasmine Kestemont from Innovion and Stijn Rogiers from argenx presented their groundbreaking work on translating human-readable protocols into machine-readable formats using the Unified Study Definitions Model (USDM).

*SAS Generative AI explained (GenAI)
