BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
kippy_kips
Calcite | Level 5

Hi all,

I'm analyzing NHANES data and have combined day 1 food intake and day 2 food intake into their own observation rows. Now I want to combine those rows into 1 row that shows days 1 and 2 intakes for each participant. For example:

 

SEQDR1T_G_WHOLEDR1T_G_REFINEDDR2T_G_WHOLEDR2T_G_REFINED
15.24.01  
1                            .8                          1.2
2                        1.0                          2.2  
2                          2.5                             0

How would I combine the 1s so all the cells are filled? I also have thousands of SEQ IDs.

I've tried proc means data= test noprint nway;

class SEQ;

var ...;

output out= sums;

run;

 

This led to a min max and std showing up as well, is there a way to remove those?

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
data want;
update test(obs=0) test;
by seq;
run;

Or:

proc means data= test noprint nway;
class SEQ;
var _numeric_;
output out=summarized sum= ;
run;

 

I think either of those approaches should work for you. But for NHANES data are there weights you need to factor in?

 

View solution in original post

3 REPLIES 3
Reeza
Super User
data want;
update test(obs=0) test;
by seq;
run;

Or:

proc means data= test noprint nway;
class SEQ;
var _numeric_;
output out=summarized sum= ;
run;

 

I think either of those approaches should work for you. But for NHANES data are there weights you need to factor in?

 

kippy_kips
Calcite | Level 5

Thank you! The first code you sent kept deleting all observations for some reason but the second one worked! I am just interested in the food groups eaten by participants for right now.

Tom
Super User Tom
Super User

That structure looks it would work well as a transaction file for the UPDATE statement.

data want;
  update have(obs=0) have;
  by seq;
run;

For UPDATE to work the ORIGINAL dataset has to have only one observation per BY group (actually a MAX of one per BY group), but the TRANSACTION dataset can have multiple observations and the transactions will be applied in order.  So the result will have the last non-missing value for each variable.