BookmarkSubscribeRSS Feed
PharmlyDoc
Quartz | Level 8

I have a table with 7 columns, 5 of which have missing values.

 

Taking an example from the code below in table HAVE; row (PAT_ID) 4 has missing values for AGE, INCOME_BLOCK, UNEMPLOY_BLOCK, and HOMEOWNER_BLOCK  – since at least one of these four columns (INCOME_BLOCK, EDU_BLOCK, UNEMPLOY_BLOCK, and HOMEOWNER_BLOCK) have a missing value, I want  all values for these four columns to become missing.

 

data HAVE; 
input PAT_ID : best2. SEX :$6. AGE : best2. INCOME_BLOCK : best6. EDU_BLOCK : best4. UNEMPLOY_BLOCK : best4. HOMEOWNER_BLOCK : best4. ;
datalines;
1 Female  22 45810 93.82 13.07 9.32
2 Male   37  35264 54.94 27.81 4.76
3 Male 27    35264 54.94 27.81 . 
4 Male .      .    18.62  .    .
5 Female 27 42346  18.62 1.87 1.29
6 Male   28 42346  41.11 4.76 10.29
7 Female  23 250001 41.11 3.02 1.29
8 Female  21 115290 93.82 27.81 1.87
9 Female  43 215493 21.65 10.29 3.60
10 Male	 35 215493    .  27.81 3.02
11 Male	 .  215493 15.61 27.81 3.02
12 Male	 21 77521  18.62 7.17  9.32
13 Male	 21 215493 54.94 5.95  27.81
14 Male	 28 115290 34.35    .    .
15 Male	 34 115290 34.35 1.29  9.32
16 Male	 .  30872  34.35 10.29 7.17
17 Female  36 105039 15.61 4.76 13.07
18 Female 23    .     .     .    .
19 Female 34 38964 9.55	 4.76  2.45
20 Female 26 38964 54.94 10.29 10.29
;



data WANT; 
input PAT_ID : best2. SEX :$6. AGE : best2. INCOME_BLOCK : best6. EDU_BLOCK : best4. UNEMPLOY_BLOCK : best4. HOMEOWNER_BLOCK : best4. ;
datalines;
1 Female  22 45810 93.82 13.07 9.32
2 Male   37  35264 54.94 27.81 4.76
3 Male 27     .     .     .    . 
4 Male .      .     .     .    .
5 Female 27 42346  18.62 1.87 1.29
6 Male   28 42346  41.11 4.76 10.29
7 Female  23 250001 41.11 3.02 1.29
8 Female  21 115290 93.82 27.81 1.87
9 Female  43 215493 21.65 10.29 3.60
10 Male	 35     .     .     .    .
11 Male	 .  215493 15.61 27.81 3.02
12 Male	 21 77521  18.62 7.17  9.32
13 Male	 21 215493 54.94 5.95  27.81
14 Male	 28     .     .     .    .
15 Male	 34 115290 34.35 1.29  9.32
16 Male	 .  30872  34.35 10.29 7.17
17 Female  36 105039 15.61 4.76 13.07
18 Female 23    .     .     .    .
19 Female 34 38964 9.55	 4.76  2.45
20 Female 26 38964 54.94 10.29 10.29
;

 

2 REPLIES 2
ballardw
Super User

If the variables are numeric the easiest is to use the NMISS function to count how many variables have missing values. The use the Call missing function.

if nmiss(AGE, INCOME_BLOCK, UNEMPLOY_BLOCK, HOMEOWNER_BLOCK) > 0 then 
call missing(AGE, INCOME_BLOCK, UNEMPLOY_BLOCK, HOMEOWNER_BLOCK);

Call missing is one of the few routines that will work with mix of numeric and character variables and sets ALL the variables to missing.

 

If you have a mix of numeric and character variables you can either do a bunch of IF OR testing if any variable is missing Or use some code to create a value to examine for a known behavior such as this:

data HAVE; 
input PAT_ID : best2. SEX :$6. AGE : best2. INCOME_BLOCK : best6. EDU_BLOCK : best4. UNEMPLOY_BLOCK : best4. HOMEOWNER_BLOCK : best4. ;
datalines;
1 Female  22 45810 93.82 13.07 9.32
2 Male   37  35264 54.94 27.81 4.76
3 Male 27    35264 54.94 27.81 . 
4 Male .      .    18.62  .    .
5 Female 27 42346  18.62 1.87 1.29
6 .   28 42346  41.11 4.76 10.29
7 Female  23 250001 41.11 3.02 1.29
8 Female  21 115290 93.82 27.81 1.87
9 Female  43 215493 21.65 10.29 3.60
10 .	 35 215493    .  27.81 3.02
11 Male	 .  215493 15.61 27.81 3.02
12 Male	 21 77521  18.62 7.17  9.32
13 Male	 21 215493 54.94 5.95  27.81
14 Male	 28 115290 34.35    .    .
15 Male	 34 115290 34.35 1.29  9.32
16 Male	 .  30872  34.35 10.29 7.17
17 .  36 105039 15.61 4.76 13.07
18 Female 23    .     .     .    .
19 Female 34 38964 9.55	 4.76  2.45
20 Female 26 38964 54.94 10.29 10.29
;

options missing=' ';
data want;
   set have;
   If countw ( catx('|',sex, age,INCOME_BLOCK , EDU_BLOCK,UNEMPLOY_BLOCK, HOMEOWNER_BLOCK),
              '|' ) < 6 then call missing(sex, age,INCOME_BLOCK , EDU_BLOCK,UNEMPLOY_BLOCK, HOMEOWNER_BLOCK); 
run;
/* reset the option*/
options missing='.';

I made a couple of the Sex values missing.

The Catx function will do a conversion to character for the numeric values to make a character value placing a specified character, in this case the | between values. By setting the Option missing to a space the missing values are skipped (CATX removes trailing spaces, if the only value is a space, then it gets eliminated). 

The Countw function the counts how many values are delimited by the | character. If the number is less than the number of variables (exercise for the interested reader how to eliminate the magic number 6 ) then at least one of the variables was missing and again Call Missing sets them to missing.

 

 

 

 

 

andreas_lds
Jade | Level 19

And for those who don't like writing the same variable-names multiple times, an array comes to the rescue:

 

 

data want;
   set have;
   
   array block income_block edu_block unemploy_block homeowner_block;
   
   if nmiss(of block[*]) then call missing(of block[*]);
run;

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 369 views
  • 0 likes
  • 3 in conversation