sas1018
Fluorite | Level 6
Hi, can anybody please explain data mining for data review to me? I am preparing for an interview for a Sr Clinical SAS Programmer role, where they expect knowledge of data mining. TIA.
Accepted Solution
MarkLambrecht
SAS Employee

Hi,

AI, analytics and data mining are used in risk-based quality management during clinical trials. A good introduction to the topic is given here:

https://www.amazon.com/Risk-Based-Monitoring-Detection-Clinical-Trials/dp/1612909914

 

Pharmaceutical companies define a series of risk indicators to support timely monitoring of data quality. The issue an indicator detects could be as simple as a lab instrument that is not calibrated, or as complex as interactions between clinical sites and the recruited patients. The goal is to ensure that data integrity is maintained for submission, safety and overall quality purposes.
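To make the idea of a risk indicator concrete, here is a minimal sketch in Python. It is only an illustration, not the method from the book or from any SAS product: the site IDs, the rates, the function name `flag_sites` and the 3.5 cut-off are all assumptions. It flags sites whose adverse-event reporting rate lies far from the median of all sites, using a robust z-score based on the median absolute deviation (MAD).

```python
from statistics import median

def flag_sites(rates, threshold=3.5):
    """Return the site IDs whose rate has a robust z-score beyond the threshold.

    `rates` maps a site ID to a numeric indicator (hypothetical data).
    """
    values = list(rates.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All sites look (nearly) identical, so nothing can be flagged this way
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return sorted(
        site for site, v in rates.items()
        if abs(0.6745 * (v - med) / mad) > threshold
    )

# Hypothetical adverse events reported per enrolled patient, by site
ae_rate_by_site = {
    "S01": 1.9, "S02": 2.1, "S03": 2.0,
    "S04": 0.2, "S05": 2.2, "S06": 1.8,
}
flagged = flag_sites(ae_rate_by_site)
print(flagged)
```

In this made-up example site S04 reports far fewer adverse events per patient than the others, which could point to under-reporting and would be a reason for a closer look at that site.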

 

This approach is in line with the guideline for good clinical practice E6(R2), the integrated addendum to ICH E6(R1), from the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. Available at: https://database.ich.org/sites/default/files/E6_R2_Addendum.pdf.

 

Best regards,

Mark


5 REPLIES
sbxkoenk
SAS Super FREQ

Hello,

 

Your question is a bit too concise for a meaningful answer.

For example Google returns:

<< No results found for "data mining for data review" >>

when querying for "data mining for data review" (with quotes!).

 

Of course, before doing any data mining, good data profiling is necessary.

Many DM / ML algorithms also require suitable variable transformations, as they cannot deal very well with things like extreme skewness, an excess of zeros (or one very high peak in the distribution), rare 'codes' for a categorical variable, very high dimensionality (most algorithms can deal with the latter, but dimension reduction is always a good idea) and so on.
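As a small illustration of two such transformations, the Python sketch below collapses rare category codes into a single level and applies a log transform to a right-skewed variable. The function names, the `OTHER` label and the `min_count` cut-off are assumptions chosen for the example, not a recommendation for any particular study.

```python
import math
from collections import Counter

def collapse_rare(codes, min_count=2, other="OTHER"):
    """Replace codes seen fewer than `min_count` times with one catch-all level."""
    counts = Counter(codes)
    return [c if counts[c] >= min_count else other for c in codes]

def log1p_transform(values):
    """Compress a right-skewed, non-negative variable; log1p keeps zeros at zero."""
    return [math.log1p(v) for v in values]

# Hypothetical category codes and a skewed numeric variable
codes = ["A", "A", "B", "C", "A", "B"]
collapsed = collapse_rare(codes)

amounts = [0.0, 1.0, 3.0, 250.0]
transformed = log1p_transform(amounts)
```

Here the code "C" occurs only once, so it is folded into the catch-all level, and the large value 250.0 is pulled much closer to the rest of the distribution.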

Also, many DM / ML algorithms are misled by cell-wise or case-wise outliers. So upfront univariate and multivariate outlier detection and removal might be needed. Alternatively, outliers can be 'smoothed' (for example 'winsorized').
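A minimal Python sketch of winsorizing, under simple assumptions: the cut-offs are taken as the observed values closest to the 10th and 90th percentile positions, and everything outside them is clipped to those values. The function name, the percentile choices and the lab values are made up for the illustration.

```python
def winsorize(values, lower_pct=0.10, upper_pct=0.90):
    """Clip values to the observations nearest the chosen percentile positions."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[round(lower_pct * (n - 1))]
    hi = ordered[round(upper_pct * (n - 1))]
    # Values below lo are raised to lo, values above hi are lowered to hi
    return [min(max(v, lo), hi) for v in values]

# Hypothetical lab measurements with one extreme value at each end
lab_values = [4.1, 4.3, 4.0, 4.2, 40.0, 4.4, 3.9, 4.1, 4.2, 0.4]
clipped = winsorize(lab_values)
print(clipped)
```

Unlike removal, winsorizing keeps every record, which can matter when each observation belongs to a patient who must stay in the data set.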

Population stability over time is also important in data mining. You want the training data to be representative of all future scoring data. Stability monitoring is therefore appropriate. Re-training might be necessary if the target population shifts.
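One common way to monitor stability is the population stability index (PSI), which compares how a variable is distributed over the same bins in the training data and in newer scoring data. The Python sketch below assumes the bin counts are already available; the function name, the `eps` guard and the counts are hypothetical, and PSI is just one possible choice of stability measure.

```python
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index over matching bins.

    `expected_counts` come from the training data, `actual_counts` from the
    scoring data. `eps` guards against empty bins (log of zero).
    """
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_share = max(e / e_total, eps)
        a_share = max(a / a_total, eps)
        total += (a_share - e_share) * math.log(a_share / e_share)
    return total

# Hypothetical bin counts for one variable
train_counts = [100, 200, 400, 200, 100]
score_counts = [300, 300, 200, 100, 100]

stable = psi(train_counts, train_counts)
shifted = psi(train_counts, score_counts)
print(stable, shifted)
```

Identical distributions give a PSI of 0, and the value grows as the two distributions drift apart. A commonly quoted rule of thumb treats values above roughly 0.25 as a substantial shift, but any threshold should be agreed for the application at hand.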

 

I could go on endlessly, but to avoid further unnecessary effort (I may be completely off the mark), it would be appropriate for you to refine your initial question a little.

 

Also, consider posting this question on the 'Analytics > SAS Data Mining and Machine Learning' board.

You might get more and better answers there (after refining the question from your side).

 

Have a nice weekend,

Koen

 

sas1018
Fluorite | Level 6

Thanks for the reply, Koen. I don't know anything about data mining. I just want to know what data mining means, what its purpose is, and where it is used with clinical trial data (just to get the basic idea). I saw some common data mining techniques when I googled it, but didn't understand how to relate them to clinical trial data.

sbxkoenk
SAS Super FREQ

Hello,

 

I have used data mining extensively in just about every sector, but not yet for clinical trials data. To me, clinical trials data are to be analyzed with SAS/STAT procedures in the first place (proc mixed, proc glimmix, proc genmod, ...).

I would thus have to look for use cases of "data mining for clinical trials data" myself.

I leave it up to other people to chime in on this topic.

 

Next time you have a question, I would give it a more telling title. Something like "How can data mining serve clinical trial data analysis?" instead of just "data mining".

 

Good luck! I will also follow this thread with great interest, to learn about use cases for data mining in a clinical trial setting.

 

Koen 

sas1018
Fluorite | Level 6

 

Thank you, Mark and Koen. I also found the article below, which gave me a better idea of data mining on clinical data.

 

Data Mining in Clinical Data Sets: A Review (ijais.org)
