The Clinical Trials Adventure: From Molecules to Medicines with a Little Help from SAS

 

Imagine a molecule sitting quietly in a lab dish. It's got potential—it could be the next big cure.

 

But before it earns its place on a pharmacy shelf, it has to survive a scientific odyssey. Welcome to the world of clinical trials, where data, science, regulations, and yes, SAS software, come together to determine if a treatment is safe and effective.

 

Let’s take a journey through this process, told not just through the eyes of researchers and programmers but through the blinking cursor of a SAS program shaping life-saving data, and through our hero, Maya, a statistical programmer.

 

This episode is called 'Maya does clinical trials magic.'

 

 

Chapter 1: What Are Clinical Trials?

 

At its core, a clinical trial is a systematic investigation conducted with human volunteers to evaluate the effects, safety, and efficacy of medical interventions—whether drugs, devices, or procedures.

 

Think of it as a highly choreographed scientific play. The actors? Patients, doctors, coordinators. The script? The study protocol. And backstage, hidden from the spotlight, is the data crew—armed with SAS and CDISC standards—making sure every scene is captured perfectly.

 

Relevant link: Basics About Clinical Trials – FDA

 

 

Chapter 2: The Drug Approval Odyssey

 

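Before a treatment enters clinical trials, it goes through preclinical testing. Researchers then file an Investigational New Drug (IND) application with the FDA. If the application clears review, clinical development proceeds through up to four phases: Phases 1 through 3 before approval, and Phase 4 after the treatment reaches the market.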

 

Each phase generates truckloads of data—from adverse events to lab results. And how is this data managed, cleaned, and analyzed?

 

Enter SAS. Like a data wizard, it helps statistical programmers write scripts to:

 

  • Clean messy raw datasets
  • Transform them into regulatory-compliant formats
  • Generate summary reports
  • Perform advanced statistical analysis

 

Without SAS, keeping up with the data deluge would be like trying to bail out a sinking ship with a spoon.
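
To make the list above a little more concrete, here is a minimal sketch of the "clean, then summarize" pattern. The dataset raw_vitals, its variables, and its values are all made up for illustration; a real study would follow its own specifications.

```sas
/* Hypothetical raw vitals data; names and values are illustrative only */
data raw_vitals;
   infile datalines dsd truncover;
   input subjid :$8. arm :$10. sysbp_raw :$8.;
   datalines;
001,Active,120
002,Active,ND
003,Placebo,135
004,placebo,128
;
run;

/* Clean: standardize case and convert text results to numeric */
data vitals_clean;
   set raw_vitals;
   arm = propcase(arm);
   /* ?? suppresses log messages for non-numeric values such as "ND" */
   sysbp = input(sysbp_raw, ?? 8.);
   drop sysbp_raw;
run;

/* Summarize: n, mean, and standard deviation by treatment arm */
proc means data=vitals_clean n mean std maxdec=1;
   class arm;
   var sysbp;
run;
```

The non-numeric result "ND" becomes a missing value, so PROC MEANS leaves it out of the statistics for that arm.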

 

 

Chapter 3: Enter CDISC – The Language of Data Standardization

 

When trials generate mountains of data, the Clinical Data Interchange Standards Consortium (CDISC) steps in to bring order to the chaos. It defines standards that ensure data from different trials can be understood and reused.

 

  • SDTM (Study Data Tabulation Model): For organizing collected data.
  • ADaM (Analysis Data Model): For analysis datasets.
  • CDASH (Clinical Data Acquisition Standards Harmonization): For data collection.

 

Relevant link: CDISC Standards

 

And again, SAS is the tool of choice to implement these standards. With SAS macros, libraries, and tools like PROC SQL, data programmers mold raw data into SDTM-compliant structures.

 

 

Chapter 4: The Statistical Programmer – The Behind-the-Scenes Hero

 

Meet Maya, a statistical programmer. Her workday starts not with coffee, but with a blinking SAS log window.

 

Her tasks include:

 

  • Mapping raw datasets (like "RAW_DEMO") to SDTM domains (like "DM")
  • Creating SDTM datasets using macros and PROC steps
  • Validating datasets against SDTMIG guidelines
  • Generating clinical summary tables and listings using PROC REPORT and ODS

 

She lives and breathes in DATA steps, MERGE statements, and %MACRO calls. And her weapon of choice? SAS.
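
As a rough idea of the listing task in the list above, the sketch below sends a small PROC REPORT listing to an RTF file through ODS. The dataset dm_demo, its contents, the column labels, and the output file name are assumptions for illustration, not part of any real study.

```sas
/* Hypothetical DM-like data for illustration only */
data dm_demo;
   infile datalines dsd truncover;
   input USUBJID :$20. SEX :$1. AGE ARM :$20.;
   datalines;
TRIAL001-001,F,34,Active
TRIAL001-002,M,51,Placebo
TRIAL001-003,F,47,Active
;
run;

/* Route the listing to an RTF file; the file name is an assumption */
ods rtf file="dm_listing.rtf";

title "Listing of Demographics (illustrative)";
proc report data=dm_demo;
   column ARM USUBJID SEX AGE;
   define ARM     / order   "Treatment Arm";
   define USUBJID / display "Subject ID";
   define SEX     / display "Sex";
   define AGE     / display "Age (years)";
run;
title;

ods rtf close;
```

Defining ARM with the ORDER usage sorts the listing by arm and prints each arm value only once per group, which is a common layout for subject listings.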

 

 

Chapter 5: Documents That Drive the Trial

 

Every trial relies on several foundational documents:

 

  • Study Protocol
  • Case Report Form (CRF) or its digital sibling, the eCRF
  • Statistical Analysis Plan (SAP)
  • Annotated CRFs (aCRFs)

 

Each of these shapes how data is collected and analyzed. The annotated CRF in particular maps each collected field to an SDTM variable, and that mapping ends up in SAS code that reads something like:

 

```sas
data dm;
   set raw_demo;
   /* Assumes raw_demo carries a SUBJID variable */
   STUDYID = "TRIAL001";
   USUBJID = catx("-", STUDYID, SUBJID);
run;
```

 

 

Chapter 6: Demystifying SDTM and SDTMIG

 

The SDTM Implementation Guide (SDTMIG) is the programmer’s GPS. It explains how each domain should be structured.

 

  • DM (Demographics): Contains participant details
  • AE (Adverse Events): Captures adverse events experienced by participants, whether or not they are related to the treatment
  • LB (Laboratory Test Results): Lists lab test results

 

Variables are classified by role, including:

  • Identifier (e.g., USUBJID)
  • Topic (e.g., LBTESTCD)
  • Qualifier (e.g., LBORRES)
  • Timing (e.g., AESTDTC)

 

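To apply the specs, Maya might write a statement like the following inside a DATA step (the rule itself is purely illustrative):

```sas
/* AETOXGR is a character variable in SDTM, so the grade is assigned as text */
if upcase(AEDECOD) = "HEADACHE" and AESER = "Y" then AETOXGR = "2";
```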

 

 

Relevant SDTM link: SDTM and SDTMIG – CDISC

 

 

Chapter 7: Building Domains from Scratch

 

Maya uses a 5-Step Approach to tackle any domain:

 

  1. Create an empty dataset
  2. Map the SDTM domain variables to the raw data
  3. Create formats
  4. Calculate/derive variables
  5. Create the final domain
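
Steps 1 and 3 of the list above can be sketched as follows. This is only one way to do it; the lengths, the format name sexf, and the raw codes 1 and 2 are illustrative assumptions rather than values from a real specification.

```sas
/* Step 1: an empty DM shell with the attributes a spec might call for.
   Lengths here are illustrative assumptions. */
data dm_shell;
   attrib STUDYID length=$12 label="Study Identifier"
          DOMAIN  length=$2  label="Domain Abbreviation"
          USUBJID length=$30 label="Unique Subject Identifier"
          SEX     length=$1  label="Sex"
          AGE     length=8   label="Age";
   call missing(of _all_);   /* avoids uninitialized-variable notes */
   stop;                     /* ends the step before any row is written */
run;

/* Step 3: a format that maps hypothetical raw codes to SDTM values */
proc format;
   value sexf 1 = "M"
              2 = "F"
          other = "U";
run;
```

Because the shell has zero observations, it can be listed first in a later SET statement so that its lengths and labels take precedence over those in the mapped data.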

 

Let’s look at how she builds the DM domain:

 

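```sas
* Map raw demographics to DM, keeping only the SDTM variables;
* Assumes raw_demo already holds STUDYID, SUBJID, SEX, AGE, and RACE;
data dm (keep=STUDYID USUBJID SEX AGE RACE);
   set raw_demo;
   USUBJID = catx("-", STUDYID, SUBJID);
run;
```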

 

She repeats the process for every dataset: EX, AE, LB, the supplemental qualifier dataset SUPPDM, and even custom domains like XP (Pain Scores).

 

 

Chapter 8: Custom Domains and Raw Transpositions

 

When trials collect unique data not covered in standard domains, custom ones like XP come to life.

 

Maya gets creative:

 

  • Transposes pain scores using arrays and PROC TRANSPOSE
  • Creates empty XP structure using PROC SQL
  • Maps and validates data before finalizing the domain

 

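```sas
/* Assumes raw_pain already holds one row per subject and test,
   with variables named to the SDTM --TESTCD / --TEST convention */
proc sql;
   create table xp as
   select distinct USUBJID, XPTESTCD, XPTEST, XPORRES
   from raw_pain;
quit;
```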
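
The transposition bullet above might look something like the sketch below. It assumes a hypothetical wide dataset, raw_pain_wide, with one column per visit (PAIN_V1 to PAIN_V3); the variable names, the PAIN test code, and the way VISITNUM is derived are illustrative assumptions.

```sas
/* Hypothetical wide raw data: one row per subject, one column per visit */
data raw_pain_wide;
   infile datalines dsd truncover;
   input USUBJID :$20. PAIN_V1 PAIN_V2 PAIN_V3;
   datalines;
TRIAL001-001,7,5,3
TRIAL001-002,6,6,4
;
run;

proc sort data=raw_pain_wide;
   by USUBJID;
run;

/* Produce one output row per subject and visit column */
proc transpose data=raw_pain_wide
               out=pain_long (rename=(col1=score))
               name=source_var;
   by USUBJID;
   var PAIN_V1-PAIN_V3;
run;

/* Shape the long data into XP-style variables */
data xp_long;
   set pain_long;
   length XPTESTCD $8 XPTEST $40 XPORRES $10;
   XPTESTCD = "PAIN";
   XPTEST   = "Pain Score";
   XPORRES  = strip(put(score, best.));
   /* Keep only the digits of the source column name, e.g. PAIN_V2 -> 2 */
   VISITNUM = input(compress(source_var, , "kd"), 8.);
   keep USUBJID XPTESTCD XPTEST XPORRES VISITNUM;
run;
```

Each subject's three visit columns become three rows, which is the one-record-per-test-per-visit shape that findings-style domains generally expect.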

 

 

Chapter 9: Conformance is Key

 

Before submission, every SDTM domain must pass validation checks using tools like Pinnacle 21. But before even reaching that step, Maya ensures conformance by:

 

  • Using SDTM variable names correctly
  • Following correct formats (ISO 8601 for dates)
  • Ensuring proper linking between domains via RELREC
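
For the ISO 8601 point in the list above, one common approach is to convert a numeric SAS date to text with the E8601DA. format. The sketch below assumes a hypothetical raw dataset raw_ae with a numeric start date aestdt; those names are not from the original article.

```sas
/* Hypothetical raw AE data with a numeric SAS start date */
data raw_ae;
   input SUBJID :$8. aestdt :date9.;
   format aestdt date9.;
   datalines;
001 15JAN2024
002 03MAR2024
;
run;

data ae_dates;
   set raw_ae;
   length AESTDTC $10;
   /* E8601DA. writes a SAS date in the ISO 8601 form yyyy-mm-dd */
   if not missing(aestdt) then AESTDTC = put(aestdt, e8601da.);
run;
```

Guarding against missing dates keeps AESTDTC blank instead of storing a stray period, which is easier to reconcile during validation.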

 

And every one of these checks is handled in—you guessed it—SAS.

 

 

Epilogue: From Code to Cure

 

As the clinical trial wraps up, the data is cleaned, locked, transformed, and readied for submission to regulatory agencies like the FDA. Thanks to CDISC standards and the analytical power of SAS, this data now tells a coherent, compliant, and accurate story of the trial.

 

Maya hits "Submit", and smiles.

 

From a molecule to a medicine, it took scientists, clinicians, patients—and a whole lot of SAS code.

 

 

Useful Links

 

 

Maya's process as described is supported by the SAS Programming for Clinical Trials 1: Study Data Tabulation Model (SDTM) course, available soon at http://learn.sas.com

 

 

Find more articles from SAS Global Enablement and Learning here.
