DATA Step, Macro, Functions and more

Need help with creating a new dataset and avoid selecting missing and those coded as 99, 999 from other variables in the dataset

Accepted Solution Solved
Reply
Occasional Contributor lis
Occasional Contributor
Posts: 7
Accepted Solution

Need help with creating a new dataset and avoid selecting missing and those coded as 99, 999 from other variables in the dataset

Hello,

I just start learning SAS and I need to create a temporary dataset (walkbike6) selecting individuals from the variable paq635 =1 (people that report walking/biking) but do not bring the missing of this variable neither the missing or "do not know (99, 7777, 9999, etc) from the other variables in the dataset.  This is my code and it seems to work but I was ask to use WHERE and IF statement and I cannot figure it out how to do it.

I would really appreciate some help.

Lisbeth

DATA WALKBIKE6;

    SET NHANES.PAQ_FF;

where paq635 =1 AND paq635 not= .;

            IF PAQ605 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

               IF PAQ620 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

              IF PAQ650 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

              IF PAQ665 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

            IF PAQ635 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

              IF PAQ610 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

              IF PAQ625 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

              IF PAQ655 IN('2' '7' '9' '77' '99' '7777' '9999' '.')  THEN DELETE;

             IF  PAQ670 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

             IF  PAQ640 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

             IF  PAD615 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

             IF  PAD630 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

             IF  PAD660 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

             IF  PAD675 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

              IF  PAD645 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') THEN DELETE;

RUN;


Accepted Solutions
Solution
‎02-13-2013 10:51 AM
Super User
Posts: 10,526

Re: Need help with creating a new dataset and avoid selecting missing and those coded as 99, 999 from other variables in the dataset

I suggest looking at the valid range of values for some of your variables. Is it possible that the variables where '77' or '99' are not wanted have '2' or '7' as valid answers? And similar for the varaibles using '7777' or '9999' as don't know, could they have 2, 7, 77 or 99 as valid responses. You may want different values in the IN clause for some of the varaibles.

View solution in original post


All Replies
Super User
Posts: 5,260

Re: Need help with creating a new dataset and avoid selecting missing and those coded as 99, 999 from other variables in the dataset

You need to link those clauses together with OR's, and wrap them all with a preceeding NOT:

...

WHERE  NOT(PAQ605 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') OR

               PAQ620 IN ('2' '7' '9' '77' '99' '7777' '9999' '.') OR

          etc...);

Data never sleeps
Solution
‎02-13-2013 10:51 AM
Super User
Posts: 10,526

Re: Need help with creating a new dataset and avoid selecting missing and those coded as 99, 999 from other variables in the dataset

I suggest looking at the valid range of values for some of your variables. Is it possible that the variables where '77' or '99' are not wanted have '2' or '7' as valid answers? And similar for the varaibles using '7777' or '9999' as don't know, could they have 2, 7, 77 or 99 as valid responses. You may want different values in the IN clause for some of the varaibles.

Super User
Posts: 5,086

Re: Need help with creating a new dataset and avoid selecting missing and those coded as 99, 999 from other variables in the dataset

Here are some questions as a starting point.

Are PAQ605, PAQ620, etc. actually character variables or are they numeric?  If they are character, the usual way to refer to a missing value would

be ' ' (not '.')  but if they are numeric it would be advisable to change the data values permanently before doing any other processing.

If PAQ605 is 9999, do you want to eliminate the entire observation even if there are valid values for other variables?

When you select PAQ635=1, you don't have to check for missing values as well.  If the value is 1, it can't be missing.

Start with that much, and your answers will dictate the next steps.

Good luck.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 215 views
  • 3 likes
  • 4 in conversation