PharmaSUG 2012 Paper: Creating Analysis Data Marts from SDTM Warehouses


Getting clinical trial data into SDTM (Study Data Tabulation Model) domains is only the first hurdle in the race. The next obstacle is combining separate trials’ SDTM domains so that they can be analyzed together. On the surface, combining SDTM data sets is straightforward: simply stack each trial’s domain with the corresponding domain from all the other trials. However, as with any other type of data warehouse that is loaded over time, careful attention is needed to ensure that strict version control is in place for field attributes, controlled terminologies, and thesaurus-based data such as MedDRA.
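
As a rough illustration of why stacking needs version control, the following Python sketch stacks one domain across trials and rejects any trial whose variables or MedDRA version differ from what the warehouse expects. It is a minimal sketch under stated assumptions, not the process described in this paper: the function `stack_domain`, the in-memory record layout, and the `MEDDRA_VERSION` field are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Record = Dict[str, str]


def stack_domain(
    trial_domains: Dict[str, List[Record]],
    expected_vars: List[str],
    expected_meddra: str,
) -> List[Record]:
    """Stack one domain across trials, refusing trials that do not conform.

    trial_domains maps a study identifier to that trial's records.
    MEDDRA_VERSION is a hypothetical per-record field used here only to
    show a dictionary-version check; it is not an SDTM variable.
    """
    stacked: List[Record] = []
    for studyid, records in trial_domains.items():
        for rec in records:
            # Field check: every record must carry exactly the expected variables.
            if sorted(rec) != sorted(expected_vars):
                raise ValueError(f"{studyid}: unexpected variables {sorted(rec)}")
            # Thesaurus check: coded terms must come from one dictionary version.
            if rec["MEDDRA_VERSION"] != expected_meddra:
                raise ValueError(
                    f"{studyid}: MedDRA {rec['MEDDRA_VERSION']} != {expected_meddra}"
                )
            stacked.append(dict(rec))
    return stacked


# Illustrative data only: two trials' AE records with identical structure.
AE_VARS = ["STUDYID", "USUBJID", "AEDECOD", "MEDDRA_VERSION"]
ae_by_trial = {
    "S001": [{"STUDYID": "S001", "USUBJID": "S001-01",
              "AEDECOD": "Headache", "MEDDRA_VERSION": "14.1"}],
    "S002": [{"STUDYID": "S002", "USUBJID": "S002-07",
              "AEDECOD": "Nausea", "MEDDRA_VERSION": "14.1"}],
}
stacked_ae = stack_domain(ae_by_trial, AE_VARS, "14.1")
```

Failing loudly on a mismatch, instead of silently stacking, is one possible design choice; a real load process might instead recode the nonconforming trial to the warehouse's current dictionary version.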


This paper describes a two-step approach that has been used to build SDTM warehouses and then to extract analysis data marts from them. The first step is to create two comprehensive warehouses: one for ongoing trials and another for completed trials. The second step is to create analysis data marts by extracting from the warehouses the domains for a set of homogeneous trials. Analysts develop the criteria for homogeneity using information in the trials’ Trial Design domains (TA, TE, TI, TS, and TV). These criteria drive an extract process that creates an analysis data mart of SDTM domains containing only the records for the trials that meet the analyst’s criteria.
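
The second step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the extract process used by the authors: the warehouse is assumed to be an in-memory mapping of domain name to records, the criteria are assumed to be exact matches on TS parameter values (`TSPARMCD`/`TSVAL`), and the function names `select_trials` and `extract_mart` are hypothetical.

```python
from typing import Dict, List, Set

Record = Dict[str, str]


def select_trials(ts: List[Record], criteria: Dict[str, str]) -> Set[str]:
    """Return the STUDYIDs whose TS records satisfy every criterion.

    A parameter may have several TS records per study, so values are
    collected into a set and a criterion is met if any value matches.
    """
    params: Dict[str, Dict[str, Set[str]]] = {}
    for rec in ts:
        by_parm = params.setdefault(rec["STUDYID"], {})
        by_parm.setdefault(rec["TSPARMCD"], set()).add(rec["TSVAL"])
    return {
        studyid
        for studyid, by_parm in params.items()
        if all(value in by_parm.get(parm, set()) for parm, value in criteria.items())
    }


def extract_mart(
    warehouse: Dict[str, List[Record]], studies: Set[str]
) -> Dict[str, List[Record]]:
    """Keep, for every domain, only the records of the selected trials."""
    return {
        domain: [rec for rec in records if rec["STUDYID"] in studies]
        for domain, records in warehouse.items()
    }


# Illustrative warehouse: three trials, two of them Phase III.
warehouse = {
    "TS": [
        {"STUDYID": "S001", "TSPARMCD": "TPHASE", "TSVAL": "PHASE III TRIAL"},
        {"STUDYID": "S002", "TSPARMCD": "TPHASE", "TSVAL": "PHASE II TRIAL"},
        {"STUDYID": "S003", "TSPARMCD": "TPHASE", "TSVAL": "PHASE III TRIAL"},
    ],
    "DM": [
        {"STUDYID": "S001", "USUBJID": "S001-01"},
        {"STUDYID": "S002", "USUBJID": "S002-07"},
        {"STUDYID": "S003", "USUBJID": "S003-04"},
    ],
}
selected = select_trials(warehouse["TS"], {"TPHASE": "PHASE III TRIAL"})
mart = extract_mart(warehouse, selected)
```

Because the criteria are plain data, the same extract can be rerun with a different definition of homogeneity without changing the warehouse itself.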

