BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
amoney518
Fluorite | Level 6

Hello all, 

 

I'm working on a project using NHANES and trying to merge datasets. One dataset (MEDS) lists medications patients are taking as individual observations, all with a common variable SEQN that is a unique identifier for individual patients.

In order to merge the datasets I must convert this drug-level data to patient-level data using the SEQN variable in a format that retains  the variables RXDDRGID, RXDDRUG, and  RXDUSE from the MEDS dataset without having redundant observations in my combined dataset merged by SEQN.

I encountered this document which appears to have the right information but frankly it's way over my head.

 

So far, I have read that the proc summary procedure may be the best way and this is what I have:

 

proc summary nway data=NHANES.MEDS;
class SEQN RXDDRGID RXDDRUG RXDUSE;
id SEQN RXDDRGID RXDDRUG RXDUSE;
output out=NHANES.FIXEDMEDS;
run;

 

but this is returning a dataset with the right variables but too many observations.

 

I am very new to SAS so any help would be greatly appreciated!

1 ACCEPTED SOLUTION
4 REPLIES 4
Reeza
Super User
Do you need to summarize data before you change the structure? If so, you likely need PROC SUMMARY. If not, you likely want PROC TRANSPOSE. https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
amoney518
Fluorite | Level 6

Thanks for your help! This is definitely what I was looking for. 

However, I'm still having trouble getting the right number of observations in my output datasets. I've made new code below that sorts the original dataset by the variables of interest (since each observation doesn't have a unique identifier), transposes the three variables of interest individually, then merges them by SEQN but this doesn't seem to be working as anticipated. 

Any ideas?

 

proc sort data=NHANES.MEDS out=NHANES.SORTEDMEDS;
by SEQN RXDDRGID;
run;

 

proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDDRGIDMEDS prefix=RXDDRGID;
by SEQN RXDDRGID;
var RXDDRGID;
run;


proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDDRUGMEDS prefix=RXDDRUG;
by SEQN RXDDRGID;
var RXDDRUG;
run;


proc transpose data=NHANES.SORTEDMEDS out=NHANES.RXDUSEMEDS prefix=RXDUSE;
by SEQN RXDDRGID;
var RXDUSE;
run;


data NHANES.FIXEDMEDS;
merge NHANES.RXDDRGIDMEDS (drop=_name_) NHANES.RXDDRUGMEDS (drop =_name_) NHANES.RXDUSEMEDS (drop=_name_);
by SEQN ;
run;

amoney518
Fluorite | Level 6

Actually, I seem to have solved by the problem by limiting the BY statement to only SEQN within the PROC TRANSPOSE procedures. 

 

Thanks!

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1158 views
  • 2 likes
  • 2 in conversation