rvel
Calcite | Level 5

Hi everyone, 

 

I have a question about combining and analyzing my data. I will try to explain it as well as I can:

 

I have 4000 subjects, and for each subject 200 food intake variables. For each food intake variable I have three different values, and I have to multiply the intake by each of them separately to create new variables. So in the end I would have 600 additional variables per subject?

 

For example:

           potato
subject 1: 50

 

Then I have to multiply the 50 g/d of potato by 0.55, 0.46 and 0.83 to get three new variables, and repeat this for the other 3999 subjects. The next variable, bread, also needs to be multiplied by three values, but they differ from the ones above. Is there a procedure I can use to do this faster?

 

At the moment I have a wide data set, but I guess I should make it a long data set. So I would have 3 observations for each subject, but the observations would be the same? 

 

I really hope someone can help me.

Quentin
Super User

I'm having a hard time envisioning your data.  But my first thought would be to make it into a long narrow dataset with 200 records per subject.  So you would have 4,000 subjects * 200 foods = 800,000 records.

 

It would look like:

Subject  Food    Intake  Factor1  Factor2  Factor3
1        Potato  50      .55      .46      .83
1        Milk    75      .33      .68      .99
1        Honey   25      .02      .34      .33
...

 

With a structure like that, it should be straightforward to make new variables. In general, working with long narrow datasets is easier than working with wide datasets.
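To make the reshaping concrete, here is a minimal sketch of the logic in plain Python (not SAS). It assumes the three factors depend only on the food, so they can live in a small lookup table that is joined to every subject's intake. The names `FACTORS`, `wide_to_long` and `new1`-`new3` are hypothetical, and the numbers are the illustrative ones from the table above.

```python
# Hypothetical lookup table: one row of three factors per food.
FACTORS = {
    "Potato": (0.55, 0.46, 0.83),
    "Milk": (0.33, 0.68, 0.99),
    "Honey": (0.02, 0.34, 0.33),
}

# Hypothetical wide data: one row per subject, one column per food (g/d).
wide = [{"subject": 1, "Potato": 50, "Milk": 75, "Honey": 25}]


def wide_to_long(wide_rows, factors):
    """Return one record per subject and food, with the three products added."""
    long_rows = []
    for row in wide_rows:
        for food, (f1, f2, f3) in factors.items():
            intake = row[food]
            long_rows.append({
                "subject": row["subject"],
                "food": food,
                "intake": intake,
                "factor1": f1, "factor2": f2, "factor3": f3,
                # The three new variables: intake times each factor.
                "new1": intake * f1,
                "new2": intake * f2,
                "new3": intake * f3,
            })
    return long_rows


long_rows = wide_to_long(wide, FACTORS)
for rec in long_rows:
    print(rec["subject"], rec["food"], rec["new1"], rec["new2"], rec["new3"])
```

The point of the sketch is that the factors are stored once per food rather than once per subject, so the 600 products fall out of a single loop (or, in a database or SAS setting, presumably a single join on the food name).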

 

You could even take it further, and transpose each of the above records into three records (one for each factor). So your variables would be Subject, Food, Intake, FactorID (1-3) and Factor. But depending on what you will be doing with these data, that might be overkill.
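As a rough illustration of that further step, the following plain-Python sketch (again not SAS) turns one long record into three records, one per factor. The function name `split_by_factor` and the lowercase field names are hypothetical.

```python
def split_by_factor(rows):
    """Turn each subject-food record into three records, one per factor."""
    out = []
    for row in rows:
        for factor_id in (1, 2, 3):
            out.append({
                "subject": row["subject"],
                "food": row["food"],
                "intake": row["intake"],
                "factor_id": factor_id,
                "factor": row[f"factor{factor_id}"],
            })
    return out


# One illustrative record in the long layout described above.
rows = [{"subject": 1, "food": "Potato", "intake": 50,
         "factor1": 0.55, "factor2": 0.46, "factor3": 0.83}]

tall = split_by_factor(rows)
for rec in tall:
    print(rec)
```

With 4,000 subjects and 200 foods this layout would hold 2,400,000 records, which is why it may only be worth it if later steps work per factor.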

rvel
Calcite | Level 5
Thank you for your answer! I'll show my data tomorrow; that should make it easier.

Eventually, I should end up with 3 scores for each subject based on their intake (e.g. of potato). For each subject I have the intake in grams per day for 200 variables, and for each variable I have 3 different values per 100 grams.

These data will eventually be part of my exposure for a Cox regression analysis.
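One possible reading of the three scores, sketched in plain Python (not SAS): each score is the sum over all foods of (intake in g/d ÷ 100) × (value per 100 g). The thread does not say how the scores are actually defined, so the summation is an assumption, and `VALUES_PER_100G`, `intakes` and `subject_scores` are hypothetical names with illustrative numbers.

```python
from collections import defaultdict

# Hypothetical lookup: three values per 100 g for each food.
VALUES_PER_100G = {
    "Potato": (0.55, 0.46, 0.83),
    "Milk": (0.33, 0.68, 0.99),
}

# Hypothetical long-format intake data: (subject, food, grams per day).
intakes = [
    (1, "Potato", 50),
    (1, "Milk", 75),
    (2, "Potato", 100),
]


def subject_scores(intake_rows, values_per_100g):
    """Sum grams/100 * value over foods, separately for each of the 3 values.

    Assumption: a score is a plain sum across foods; adjust if the real
    definition differs.
    """
    scores = defaultdict(lambda: [0.0, 0.0, 0.0])
    for subject, food, grams in intake_rows:
        for k, value in enumerate(values_per_100g[food]):
            scores[subject][k] += grams / 100 * value
    return dict(scores)


scores = subject_scores(intakes, VALUES_PER_100G)
for subject, (s1, s2, s3) in sorted(scores.items()):
    print(subject, round(s1, 4), round(s2, 4), round(s3, 4))
```

The result has one row of three scores per subject, which is the shape that would then be merged back onto the subject-level data used for the Cox regression.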
