Hi,
AI, analytics and data mining are used in risk-based quality management during clinical trials. A good introduction to the topic is given here:
https://www.amazon.com/Risk-Based-Monitoring-Detection-Clinical-Trials/dp/1612909914
Pharmaceutical companies create a series of risk indicators to support timely monitoring of data quality. The underlying issue can be as simple as a lab instrument that is not calibrated, or as complex as interactions between clinical sites and the recruited patients. The goal is to ensure that data integrity is maintained for submission, safety and overall quality purposes.
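To make the idea of a risk indicator concrete, here is a minimal sketch in Python. It is not taken from the book or from any SAS product. The site IDs, the adverse-event rates and the z-score threshold of 1.5 are all made-up assumptions for illustration. It flags sites whose reported rate is far from the average across sites, which is one simple way such an indicator could be computed.

```python
from statistics import mean, stdev

def flag_outlying_sites(rate_by_site, z_threshold=1.5):
    """Return the site IDs whose rate lies more than z_threshold
    sample standard deviations from the mean across all sites.
    Needs at least two sites, because stdev() does."""
    rates = list(rate_by_site.values())
    centre, spread = mean(rates), stdev(rates)
    if spread == 0:
        return []  # every site reports the same rate, nothing to flag
    return sorted(
        site
        for site, rate in rate_by_site.items()
        if abs(rate - centre) / spread > z_threshold
    )

# Hypothetical adverse events per enrolled patient, by site (illustrative only).
ae_rate_by_site = {"S01": 0.50, "S02": 0.55, "S03": 0.48, "S04": 0.52, "S05": 0.05}

flagged = flag_outlying_sites(ae_rate_by_site)
print(flagged)
```

With these made-up numbers only site S05 is flagged, because it reports far fewer adverse events than the others. A flag like this would only be a prompt for a closer look at the site, not a conclusion about it.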
This procedure is in line with the guideline from the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use: Integrated Addendum to ICH E6(R1): Guideline for Good Clinical Practice E6(R2). Available at: https://database.ich.org/sites/default/files/E6_R2_Addendum.pdf
Best regards,
Mark
Hello,
Your question is a bit too concise for a meaningful answer.
For example, Google returns:
<< No results found for "data mining for data review" >>
when querying for "data mining for data review" (with quotes!).
Of course, before doing any data mining, good data profiling is necessary.
Many DM / ML algorithms also require suitable variable transformations, as they cannot deal very well with things like extreme skewness, an excess of zeros (or one very high peak in the distribution), rare 'codes' for a categorical variable, very high dimensionality (most algorithms can cope with the latter, but dimension reduction is always a good idea) and so on.
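As a rough illustration of two such transformations, here is a minimal Python sketch (standard library only). The function names, the `min_count` cut-off and the sample values are assumptions made up for this example, not a recommended recipe. It applies log(1 + x) to a right-skewed, non-negative variable and collapses rare categorical codes into one bucket.

```python
import math
from collections import Counter

def log1p_transform(values):
    """Compress a right-skewed, non-negative variable with log(1 + x).
    Zeros stay at zero, very large values are pulled in."""
    return [math.log1p(v) for v in values]

def collapse_rare_levels(codes, min_count=2, other_label="OTHER"):
    """Replace categorical codes seen fewer than min_count times
    with a single catch-all label."""
    counts = Counter(codes)
    return [c if counts[c] >= min_count else other_label for c in codes]

# Made-up example data.
skewed = [0, 0, 1, 2, 3, 1000]
transformed = log1p_transform(skewed)

codes = ["A", "A", "B", "B", "C", "A"]
collapsed = collapse_rare_levels(codes)
print(collapsed)
```

Here code "C" occurs only once, so it ends up in the "OTHER" bucket. Which transformation is suitable depends on the variable and on the algorithm used afterwards.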
Also, many DM / ML algorithms are misled by cell-wise or case-wise outliers, so univariate and multivariate outlier detection and removal might be needed up front. Alternatively, outliers can be 'smoothed' (for example, winsorized).
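Winsorizing can be sketched in a few lines of Python. This is only one possible variant: the 5th and 95th percentile cut-offs and the linear-interpolation percentile are assumptions chosen for illustration. Values beyond the cut-offs are pulled back to the cut-off instead of being removed.

```python
import math

def percentile(sorted_vals, p):
    """Percentile (p between 0 and 100) of an already sorted list,
    using linear interpolation between neighbouring values."""
    if not sorted_vals:
        raise ValueError("percentile() needs at least one value")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    if lo == hi:
        return sorted_vals[lo]
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def winsorize(values, lower_pct=5.0, upper_pct=95.0):
    """Clip every value to the [lower_pct, upper_pct] percentile range.
    The original order and the number of observations are preserved."""
    ordered = sorted(values)
    low = percentile(ordered, lower_pct)
    high = percentile(ordered, upper_pct)
    return [min(max(v, low), high) for v in values]

# Made-up data: the integers 1 to 100.
raw = list(range(1, 101))
clipped = winsorize(raw)
print(min(clipped), max(clipped))
```

Unlike outlier removal, this keeps every case in the data set, which matters when each record is a patient or a visit that cannot simply be dropped.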
Population stability over time is also important in data mining. You want the training data to be representative of all future scoring data, so stability monitoring is appropriate. Re-training might be necessary if the target population shifts.
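One common way to monitor this is the population stability index (PSI), which compares the binned distribution of a variable in the training data with the same bins in new scoring data. The sketch below is a minimal Python version under stated assumptions: equal-width bins taken from the training range, ten bins, and a small `eps` floor to avoid taking the log of zero. These are illustrative choices, and the data are made up.

```python
import math

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population stability index between a reference sample (expected)
    and a new sample (actual): sum over bins of (a - e) * ln(a / e),
    where e and a are the shares of observations in each bin."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # out-of-range values go to the end bins
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Made-up data: training values 0..99, scoring values shifted up by 50.
train = list(range(100))
shifted = [v + 50 for v in train]

print(psi(train, train))    # identical samples give 0.0
print(psi(train, shifted))  # a clearly shifted sample gives a large value
```

A frequently quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a substantial shift, but those thresholds are conventions rather than hard limits.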
I could go on endlessly, but to avoid any further unnecessary effort (I may be completely off the mark), it would help if you refined your initial question a little.
Also, consider posting this question on the 'Analytics > SAS Data Mining and Machine Learning' board.
You might get more and better answers there (after refining the question from your side).
Have a nice weekend,
Koen
Thanks for the reply, Koen. I don't know anything about data mining. I just want to know what data mining means, what its purpose is, and where it is used with clinical trial data (just to get the basic idea). I have seen some common data mining techniques when googling, but I didn't understand how to relate them to clinical trial data.
Hello,
I have used data mining extensively in just about every sector, but not yet for clinical trials data. To me, clinical trials data are to be analyzed with SAS/STAT procedures in the first place (proc mixed, proc glimmix, proc genmod, ...).
I would therefore have to look for use cases of data mining for clinical trials data myself.
I leave it up to other people to chime in on this topic.
Next time you have a question, I would give it a more telling title: something like "How can data mining serve clinical trial data analysis?" instead of just "data mining".
Good luck. I will also follow this thread with great interest, to learn about use cases for data mining in a clinical trial setting.
Koen
Thank you, Mark and Koen. I also found the article below, which gave me a better idea of data mining on clinical data.