Hello all,
I'm working on a project using NHANES and trying to merge datasets. One dataset (MEDS) lists medications patients are taking as individual observations, all with a common variable SEQN that is a unique identifier for individual patients.
In order to merge the datasets I must convert this drug-level data to patient-level data using the SEQN variable in a format that retains the variables RXDDRGID, RXDDRUG, and RXDUSE from the MEDS dataset without having redundant observations in my combined dataset merged by SEQN.
I encountered this document which appears to have the right information but frankly it's way over my head.
So far, I have read that the proc summary procedure may be the best way and this is what I have:
proc summary nway data=NHANES.MEDS;
class SEQN RXDDRGID RXDDRUG RXDUSE;
id SEQN RXDDRGID RXDDRUG RXDUSE;
output out=NHANES.FIXEDMEDS;
run;
but this is returning a dataset with the right variables but too many observations.
I am very new to SAS so any help would be greatly appreciated!
Thanks for your help! This is definitely what I was looking for.
However, I'm still having trouble getting the right number of observations in my output datasets. I've made new code below that sorts the original dataset by the variables of interest (since each observation doesn't have a unique identifier), transposes the three variables of interest individually, then merges them by SEQN but this doesn't seem to be working as anticipated.
Any ideas?
proc sort data=NHANES.MEDS out=NHANES.SORTEDMEDS;
by SEQN RXDDRGID;
run;
proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDDRGIDMEDS prefix=RXDDRGID;
by SEQN RXDDRGID;
var RXDDRGID;
run;
proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDDRUGMEDS prefix=RXDDRUG;
by SEQN RXDDRGID;
var RXDDRUG;
run;
proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDUSEMEDS prefix=RXDUSE;
by SEQN RXDDRGID;
var RXDUSE;
run;
data NHANES.FIXEDMEDS;
merge NHANES.RXDDRGIDMEDS (drop=_name_) NHANES.RXDDRUGMEDS (drop =_name_) NHANES.RXDUSEMEDS (drop=_name_);
by SEQN ;
run;
Actually, I seem to have solved by the problem by limiting the BY statement to only SEQN within the PROC TRANSPOSE procedures.
Thanks!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.