BookmarkSubscribeRSS Feed
hanner1985
Calcite | Level 5

I am trying to write code to score the Global Physical Activity Questionnaire from the NHANES 2017-2018 cycle. My end goal is to determine if vitamin D level is a risk factor for depression and if physical activity has a modifying effect on depression with vitamin D is low. I am using the PAQ_J dataset to determine the score. I have been referring to this community post to inform my code: How to code Physical Activity in NHANES - SAS Support Communities. When I run my own code, I am not getting any people with 0=no physical activity and I know that cannot be correct. What am I doing wrong?

 

Here is my code:

Data project2;
set project2;
totPA=sum((PAQ610*PAD615*8),(PAQ625*PAD630*4),(PAQ640*PAD645*4),(PAQ655*PAD660*8),(PAQ670*PAD675*4))/60;
label totPA='Physical activity total (MET-h/wk)';
if totPA=0 then PAintensity=0;
else if 1<=totPA<=48 then PAintensity=1;
else if totPA>48 then PAintensity=2;
else PAintensity=.;
run;

I will attach my dataset to this post as well.  I am also not sure if I am doing enough to get true values of PA, instead of missing values. 

 

This is my log:

2008  Data project2;
2009  set project2;
2010  totPA=sum((PAQ610*PAD615*8),(PAQ625*PAD630*4),(PAQ640*PAD645*4),(PAQ655*PAD660*8),(PAQ670*PAD675*4))/60;
2011  label totPA='Physical activity total (MET-h/wk)';
2012  if totPA=0 then PAintensity=0;
2013  else if 1<=totPA<=48 then PAintensity=1;
2014  else if totPA>48 then PAintensity=2;
2015  else PAintensity=.;
2016  run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1497 at 2010:7     4256 at 2010:18    4256 at 2010:25    3266 at 2010:36    3266 at 2010:43    4253 at 2010:54
      4253 at 2010:61    4279 at 2010:72    4279 at 2010:79    3383 at 2010:90    3383 at 2010:97    1497 at 2010:101
NOTE: There were 5569 observations read from the data set WORK.PROJECT2.
NOTE: The data set WORK.PROJECT2 has 5569 observations and 28 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds

Please help!

1 REPLY 1
mkeintz
PROC Star

Thank you for showing your log, which may be answering your question.  In particular, it includes the note

 

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1497 at 2010:7     4256 at 2010:18    4256 at 2010:25    3266 at 2010:36    3266 at 2010:43    4253 at 2010:54
      4253 at 2010:61    4279 at 2010:72    4279 at 2010:79    3383 at 2010:90    3383 at 2010:97    1497 at 2010:101

which says that at least 4,279 of your 5,569 observations have a missing value in columns 72 and 79.  Using the statement copied below, I suspect that would be variables PAQ655 and/or PAD660.

 

2010  totPA=sum((PAQ610*PAD615*8),(PAQ625*PAD630*4),(PAQ640*PAD645*4),(PAQ655*PAD660*8),(PAQ670*PAD675*4))/60;

which means in turn that totPA will get assigned a missing value.

 

The raises the question of whether you should treat missing values for any of the PAQ variables in line 2010 as zeroes.   After all, totPA can't be zero unless every one of the PAQ variables in the calculation are at zero.  Do those variables ever have a zero value?

 

If not, perhaps you want to treat missing values as zero (look at the NHANES docs to see whether this is justified).  If so, you could do something like

totPA=sum(  (sum(0,PAQ610)*sum(0,PAD615)*8)
         ,  (sum(0,PAQ625)*sum(0,PAD630)*4)
         ,  (sum(0,PAQ640)*sum(0,PAD645)*4)
         ,  (sum(0,PAQ655)*sum(0,PAD660)*8)
         ,  (sum(0,PAQ670)*sum(0,PAD675)*4)
         )
        /60 ;

But I repeat, make sure, for your purposes, that it is valid to treating missing values as zeroes.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 360 views
  • 0 likes
  • 2 in conversation