BookmarkSubscribeRSS Feed
hanner1985
Calcite | Level 5

I am trying to write code to score the Global Physical Activity Questionnaire from the NHANES 2017-2018 cycle. My end goal is to determine if vitamin D level is a risk factor for depression and if physical activity has a modifying effect on depression with vitamin D is low. I am using the PAQ_J dataset to determine the score. I have been referring to this community post to inform my code: How to code Physical Activity in NHANES - SAS Support Communities. When I run my own code, I am not getting any people with 0=no physical activity and I know that cannot be correct. What am I doing wrong?

 

Here is my code:

Data project2;
set project2;
totPA=sum((PAQ610*PAD615*8),(PAQ625*PAD630*4),(PAQ640*PAD645*4),(PAQ655*PAD660*8),(PAQ670*PAD675*4))/60;
label totPA='Physical activity total (MET-h/wk)';
if totPA=0 then PAintensity=0;
else if 1<=totPA<=48 then PAintensity=1;
else if totPA>48 then PAintensity=2;
else PAintensity=.;
run;

I will attach my dataset to this post as well.  I am also not sure if I am doing enough to get true values of PA, instead of missing values. 

 

This is my log:

2008  Data project2;
2009  set project2;
2010  totPA=sum((PAQ610*PAD615*8),(PAQ625*PAD630*4),(PAQ640*PAD645*4),(PAQ655*PAD660*8),(PAQ670*PAD675*4))/60;
2011  label totPA='Physical activity total (MET-h/wk)';
2012  if totPA=0 then PAintensity=0;
2013  else if 1<=totPA<=48 then PAintensity=1;
2014  else if totPA>48 then PAintensity=2;
2015  else PAintensity=.;
2016  run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1497 at 2010:7     4256 at 2010:18    4256 at 2010:25    3266 at 2010:36    3266 at 2010:43    4253 at 2010:54
      4253 at 2010:61    4279 at 2010:72    4279 at 2010:79    3383 at 2010:90    3383 at 2010:97    1497 at 2010:101
NOTE: There were 5569 observations read from the data set WORK.PROJECT2.
NOTE: The data set WORK.PROJECT2 has 5569 observations and 28 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds

Please help!

1 REPLY 1
mkeintz
PROC Star

Thank you for showing your log, which may be answering your question.  In particular, it includes the note

 

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1497 at 2010:7     4256 at 2010:18    4256 at 2010:25    3266 at 2010:36    3266 at 2010:43    4253 at 2010:54
      4253 at 2010:61    4279 at 2010:72    4279 at 2010:79    3383 at 2010:90    3383 at 2010:97    1497 at 2010:101

which says that at least 4,279 of your 5,569 observations have a missing value in columns 72 and 79.  Using the statement copied below, I suspect that would be variables PAQ655 and/or PAD660.

 

2010  totPA=sum((PAQ610*PAD615*8),(PAQ625*PAD630*4),(PAQ640*PAD645*4),(PAQ655*PAD660*8),(PAQ670*PAD675*4))/60;

which means in turn that totPA will get assigned a missing value.

 

The raises the question of whether you should treat missing values for any of the PAQ variables in line 2010 as zeroes.   After all, totPA can't be zero unless every one of the PAQ variables in the calculation are at zero.  Do those variables ever have a zero value?

 

If not, perhaps you want to treat missing values as zero (look at the NHANES docs to see whether this is justified).  If so, you could do something like

totPA=sum(  (sum(0,PAQ610)*sum(0,PAD615)*8)
         ,  (sum(0,PAQ625)*sum(0,PAD630)*4)
         ,  (sum(0,PAQ640)*sum(0,PAD645)*4)
         ,  (sum(0,PAQ655)*sum(0,PAD660)*8)
         ,  (sum(0,PAQ670)*sum(0,PAD675)*4)
         )
        /60 ;

But I repeat, make sure, for your purposes, that it is valid to treating missing values as zeroes.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 440 views
  • 0 likes
  • 2 in conversation