BookmarkSubscribeRSS Feed
Hatsie
Calcite | Level 5

Hi all, I'm hoping you can help me with my data issue. I have data with tens of thousands of rows per participant (repeated measurements

), and I need to convert it into a single row. I would appreciate any suggestions! I have read the information on transposing, but that's not quite what I want to do.

A data file for one participant is in a format like this (but obviously has many more rows):

Time VariableA VariableB VariableC

1        3               2              9

2        8               5             11

3        2               7              5

And I need to convert it to a format like this, retaining the timestamp for each set of measurements as its own variable:

Time1 VariableA1 VariableB1 VariableC1  Time2 VariableA2 VariableB2 VariableC2  Time3 VariableA3 VariableB3 VariableC3

1             3                 2              9             2           8               5               11            3           2                 7            5

I have many files like this, and want to eventually end up with a single file that has one row per participant.

23 REPLIES 23
kewong
Obsidian | Level 7

This SAS macro has helped me achieved what you want to do.  One caveat is to have a unique identifier for each participant and transpose by that variable.

SAS macro %MultiTranspose: Transposing multiple variables in a SAS data set within a single macro ca...

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

My first question would be why?  You will end up with a lot of variables and few rows which probably isn't the best for processing or further work.  What is it your trying to achieve as there may be better ways of getting the outcome.

Hatsie
Calcite | Level 5

Re: kewong. Thank you, I will try this out. This may be a silly question, but where do I put that line of %multtransponse? Does it go within a data statement? And what about the huge amount of code that is at the bottom of the website you linked to... do I need all of that, or just the one line?


Re: RW9, I need to sum the observations across time variables. If it helps, what I'm working with is physical activity data collected over an entire week. Each 15-second interval is its own row. So I need to create a summary variable for each day that adds up all of the physical activity. But also will need to mark and pull out the hours of the day when the person was sleeping (they were still wearing the monitor 24 hrs/day, even when asleep, so there are data from when they are rolling around in bed, etc., which I need to remove).

ballardw
Super User

I think you might be better off creating a group variable from time. Since your "time" is obviously not time but occurrence if you describe the relation to the desired totals needed I'll bet we can do this with your existing data and proc means/summary.

Hatsie
Calcite | Level 5

Ballard, I edited the original post to add one of my actual datasets in case that is helpful to anyone. I think I understand what you are saying about the time variable... if I understand correctly, you are suggesting that we could just enumerate the occurences. Since everyone has 1 week of data from their physical activity sensor ( ~40,320 15-second intervals of data), we could just number those 40,320 occurences, right, rather than labeling them by the actual date/time of the occurence?  The rub is that I need to remove blocks of data... so if someone slept from 11:21 pm - 6:07 a.m. between the first and second day, I need to be able to identify and delete all the rows that pertain to those dates and times.

Hatsie
Calcite | Level 5

So, I ran a single-variable PROC TRANSPOSE, which seems to have worked fine. The log said I have 1 observation and 40,320 variables, so that sounds good to me. But I can't really take a look at the data, because when I run PROC CONTENTS, SAS gets all gummed up and stops responding. Is this normal with huge datasets? How do you work with it if you can't look at it?

RickAster
Obsidian | Level 7

This is the kind of problem you can run into, not with large data sets, but with such a large number of variables. It explains why commenters are so reluctant to create so many variables.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

So your really going to try working with a dataset which has 40320 variables?  Best of luck with that.  Me personally I would get your data looking like (normalised):

Time VariableName  VariableResult

1          A                    3

1          B                    2

1          C                    9

2          A                    8

2          B                    5

2          C                    11

...

Then use grouped functions, e.g for SQL:

proc sql;

     create table SUMS as

     select     distinct TIME,

                    sum(VARIABLE_RESULT) as SUM

     from        DATASET

     group by TIME;

quit;

If you could provide actual test data and required output I could have a look at it.

bootstrap_armada
Calcite | Level 5


I think you probably can achieve the data structure you want by transposing - that's where I'd be focusing my efforts.  You'll need to do multiple transpositions into new datasets - one for each variable or variable, timestamp combination, AND you'll need something as a unique key to merge datasets back together - perhaps a concatenated field.  Once you've transposed all the variables you'll need to join or merge the wide datasets back together to get one dataset - that's where you'll be glad you when through the effort to create a key.  See the link below for step-by-step details.

http://www.ats.ucla.edu/stat/sas/modules/ltow_transpose.htm

RickAster
Obsidian | Level 7

The sample data shows a perfectly normal transpose. Is there something else that makes it different?

Hatsie
Calcite | Level 5

I think that may be my error in understanding the term. Sometimes in Excel I use the transpose feature, which switches columns and rows. So I assumed that proc transpose would reverse columns and rows, not turn multiple rows into a single row...

Julian1
Calcite | Level 5

I actually agree with mr. boostrap.

you can basically use proc transpose, but you need a unique key say "Account" to eventually merge all columns. Change in new variable names can be achieved by a macro fairly easily.

Hatsie
Calcite | Level 5

Thank you everyone who has responded so far. This is my first time trying the forum so I didn't know whether anyone would chime in. I really appreciate this help.

So, in re: Bootstrap_armada, I read the example you sent. That example seems a bit more straightforward because each family has data for the same three years: 96, 97, and 98. So they specify "id=year" and the new columns are labeled accordingly But will that still work for my example, when the values for the "time" variable are different for everyone? (Everyone started wearing their monitor on a different date)?  I need mine not to append the column name with a value, but basically just say "time1 time2 time3 time4" etc. because "time1" is different for every person.

** Edited to add that other than that bit, the page you sent is very helpful and I was able to understand what they're doing... so it seems promising... it's just the "id" piece I'm unsure about.

Julian1
Calcite | Level 5

if i understand correctly, you can sort by time, then use a first. function to create a count, then you have 1,2,3,4. by string function here you go, time1 time2 time3 time4...etc

http://www.ats.ucla.edu/stat/sas/faq/enumerate.htm

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 23 replies
  • 4982 views
  • 1 like
  • 13 in conversation