I have a situation where I have a parent study and a roll-over study. I need to merge any AE records from the parent study that are missing an end date with all AEs in the roll-over study to see if the parent AEs got entered into the roll-over study. I'm having issues due to how data is entered.
I want to merge by SUBJECT, LLT_NAME, and AESTDAT. However, sometimes, LLT_NAME is missing. In those cases, I want to merge on PARAEID (Note: PARAEID is in the roll-over study and matches AESPID in the parent study). I need to find a way to look ahead and see if I should merge by LLT_NAME or PARAEID on an individual record level.
I was doing something like:
data ae_parent;
set ae_parent;
if llt_name^='' then mergevar=strip(llt_name);
else if aespid^=. then mergevar=compress(put(aespid,best.));
run;
data ae_ro;
set ae_ro;
if llt_name^='' then mergevar=strip(llt_name);
else if paraeid^=. then mergevar=compress(put(paraeid,best.));
run;
The downfall is that MERGEVAR will not always properly match. If the parent study has LLT_NAME and the roll-over study does not, then MERGEVAR will be LLT_NAME in the parent study and PARAEID (or blank if PARAEID is also missing) in the roll-over study.
If you look at my full sample code below, you'll see in the final output (AE_MERGED) that there are 6 records with a status of "NOT Copied Over" when only the last 2 of those 6 should have that status because the other 4 records are creating different values for MERGEVAR. So, is there any way that someone can help me find an algorithm for properly matching up based on differing variables?
data ae_parent;
input SUBJECT $11. LLT_NAME $34. AESPID 49-50 AESTDAT date9.;
format AESTDAT date9.;
aestdat=mdy(month(aestdat),day(aestdat),2018); /* AESTDAT reading in as 2020. Forcing to 2018. */
cards;
103-049-001 CRP increased 2 09MAY2018
103-049-001 Bacteria urine identified 7 21JUN2018
103-049-001 Urine WBC increased 8 21JUN2018
103-049-001 Urine epithelial cells increased 6 21JUN2018
103-049-001 Urine leukocyte esterase positive 9 21JUN2018
103-049-003 APTT increased 3 02MAY2018
103-049-003 INR increased 2 02MAY2018
103-049-003 PT increased 1 02MAY2018
103-049-003 Nasal mucus increased 5 19JUN2018
103-049-004 3 02JUL2018
103-049-004 APTT increased 9 17JUL2018
103-049-004 Blood glucose increased 6 17JUL2018
103-049-004 Decreased hemoglobin 10 17JUL2018
103-049-004 Glucose urine increased 11 17JUL2018
103-049-004 INR increased 8 17JUL2018
103-049-004 PT increased 7 17JUL2018
;
run;
data ae_ro;
input SUBJECT $11. LLT_NAME $34. AESPID 49-50 PARAEID 53-54 AESTDAT date9.;
format AESTDAT date9.;
aestdat=mdy(month(aestdat),day(aestdat),2018); /* AESTDAT reading in as 2020. Forcing to 2018. */
cards;
103-049-001 CRP increased 3 2 09MAY2018
103-049-001 5 7 21JUN2018
103-049-001 6 8 21JUN2018
103-049-001 Urine epithelial cells increased 4 6 21JUN2018
103-049-001 Urine leukocyte esterase positive 7 9 21JUN2018
103-049-001 Skin irritation 1 24JUL2018
103-049-001 Skin irritation 2 22AUG2018
103-049-003 4 02MAY2018
103-049-003 INR increased 3 2 02MAY2018
103-049-003 PT increased 2 1 02MAY2018
103-049-003 Nasal mucus increased 5 5 19JUN2018
103-049-003 Upper respiratory infection 1 30SEP2018
103-049-004 Creatine kinase increased 6 3 02JUL2018
103-049-004 APTT increased 10 9 17JUL2018
103-049-004 Blood glucose increased 7 6 17JUL2018
103-049-004 INR increased 9 8 17JUL2018
103-049-004 PT increased 8 7 17JUL2018
103-049-004 Ear pain 1 29JUL2018
103-049-004 Creatinine increased 2 31JUL2018
103-049-004 AST increased 3 11SEP2018
103-049-004 Creatine kinase increased 4 11SEP2018
103-049-004 Upper respiratory infection 5 06OCT2018
;
run;
data ae_parent;
set ae_parent;
if llt_name^='' then mergevar=strip(llt_name);
else if aespid^=. then mergevar=compress(put(aespid,best.));
run;
data ae_ro;
set ae_ro;
if llt_name^='' then mergevar=strip(llt_name);
else if paraeid^=. then mergevar=compress(put(paraeid,best.));
run;
proc sort data=ae_parent;
by subject mergevar aestdat;
proc sort data=ae_ro;
by subject mergevar aestdat;
run;
data ae_merged;
merge ae_parent (in=a) ae_ro (in=b);
by subject mergevar aestdat;
if a;
if b=0 then STATUS='NOT Copied Over';
if b=1 then STATUS='Copied Over';
run;
... View more