BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
amoney518
Fluorite | Level 6

Hello all, 

 

I'm working on a project using NHANES and trying to merge datasets. One dataset (MEDS) lists medications patients are taking as individual observations, all with a common variable SEQN that is a unique identifier for individual patients.

In order to merge the datasets I must convert this drug-level data to patient-level data using the SEQN variable in a format that retains  the variables RXDDRGID, RXDDRUG, and  RXDUSE from the MEDS dataset without having redundant observations in my combined dataset merged by SEQN.

I encountered this document which appears to have the right information but frankly it's way over my head.

 

So far, I have read that the proc summary procedure may be the best way and this is what I have:

 

proc summary nway data=NHANES.MEDS;
class SEQN RXDDRGID RXDDRUG RXDUSE;
id SEQN RXDDRGID RXDDRUG RXDUSE;
output out=NHANES.FIXEDMEDS;
run;

 

but this is returning a dataset with the right variables but too many observations.

 

I am very new to SAS so any help would be greatly appreciated!

1 ACCEPTED SOLUTION
4 REPLIES 4
Reeza
Super User
Do you need to summarize data before you change the structure? If so, you likely need PROC SUMMARY. If not, you likely want PROC TRANSPOSE. https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
amoney518
Fluorite | Level 6

Thanks for your help! This is definitely what I was looking for. 

However, I'm still having trouble getting the right number of observations in my output datasets. I've made new code below that sorts the original dataset by the variables of interest (since each observation doesn't have a unique identifier), transposes the three variables of interest individually, then merges them by SEQN but this doesn't seem to be working as anticipated. 

Any ideas?

 

proc sort data=NHANES.MEDS out=NHANES.SORTEDMEDS;
by SEQN RXDDRGID;
run;

 

proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDDRGIDMEDS prefix=RXDDRGID;
by SEQN RXDDRGID;
var RXDDRGID;
run;


proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDDRUGMEDS prefix=RXDDRUG;
by SEQN RXDDRGID;
var RXDDRUG;
run;


proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDUSEMEDS prefix=RXDUSE;
by SEQN RXDDRGID;
var RXDUSE;
run;


data NHANES.FIXEDMEDS;
merge NHANES.RXDDRGIDMEDS (drop=_name_) NHANES.RXDDRUGMEDS (drop =_name_) NHANES.RXDUSEMEDS (drop=_name_);
by SEQN ;
run;

amoney518
Fluorite | Level 6

Actually, I seem to have solved by the problem by limiting the BY statement to only SEQN within the PROC TRANSPOSE procedures. 

 

Thanks!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1695 views
  • 2 likes
  • 2 in conversation