BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Help please! I am a novice trying to transpose long/tall to short/wide....I am merging a relational database that came from Microsoft Access. Many of the tables, but not all, have duplicate/multiple observations. How can I easily merge and then transpose all the tables? I just want one StudyID, for example, per subject.

I transposed every single variable of all the tables individually and I wound up with such a ridiculously wide data set (with the correct number of observations, however)....
BUT many of the new variables created in the process of 'transpose' are blank or just repeated values.

Any ideas?
3 REPLIES 3
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Share some of your PROC TRANSPOSE code. And if you can share a data sample of exactly what you have for INPUT layout and how you expect it to be rendered by PROC TRANSPOSE (considering the possible companion statements and/or parameters), that would go a long way with helping you in your quest for more effective "transposed" data.

Scott Barry
SBBWorks, Inc.
Doc_Duke
Rhodochrosite | Level 12
In addition to Steve's request for more detail to your question, I'll make some general observations. I infer you are working with clinical research data.

The request for easy is a nonstarter, it ain't gonna be easy. It's not particularly hard, but I would not call it easy.

Typically clinical research data will have a table with visit information, one row per visit per subject. Each visit will have some attributes (sometimes a hundred). The visits are identified by visit number and visit date. Sometimes there are visits that occur between scheduled visits (a real pain, I usually segregate them out and process them separately). You want to take these rows and line them up by sets of columns on visit number. TRANSPOSE will take the maximum number of visit and line up based on that. So, if you are interested in the first 5 visits, but somebody had 15 visits, a naive transpose will generate 15 times x variables. To make that come together, you need to first subset the data from access to get just the rows and columns of interest.

Similarly, if you have a diagnosis table. Every diagnosis is recorded, but you might only be interested in categories of diagnosis. Transform the dx to a category and transpose based on that variable.

There is some broad processes in common here. In the data warehousing domain, it's called "Extract, Transform, and Load" (ETL). You might get some good ideas on how to handle additional problems in the ETL literature.

Doc Muhlbaier
Duke
Doc_Duke
Rhodochrosite | Level 12
Opps. Sorry, I was referencing Scott's message.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 701 views
  • 0 likes
  • 3 in conversation