Contributor
Posts: 22

# Counting the number of observation

Hello SAS community!

I want to do a simple analysis of one dataset.

Let's say there are 2 variable A and B.

I want to know how many observations not missing variable A or B are in my dataset. Also, I want to know the number of observation missing value for A but not for B and vice versa.

What would be good a way to see this? Thank you very much

Super User
Posts: 6,935

## Re: Counting the number of observation

The coding would be slightly different for numeric vs. character variables.  What are we dealing with here?

Contributor
Posts: 22

## Re: Counting the number of observation

Right now, I am dealing with character variables but it would be nice to know how to deal with numaric variables as well.

Super User
Posts: 24,025

Super User
Posts: 13,942

## Re: Counting the number of observation

If the data is not really complicated or large I tend to start with something like:

Proc freq data=have;

tables a * b /list missing;

run;

Note that the result could be sent to a data set using the OUT= option.

Contributor
Posts: 22

## Re: Counting the number of observation

Thanks for the advice. However, the results were bit different from what I expected.

My dataset has more than 1 million observation and this program lists all list of observation under variable A and B.

What I want is something like this.

Count if A = missing

Count if B = missing

Count if A = missing & B =! missing

Count if B = missing & A =! missing

I would like to see one number for each of these 4 conditions something like 3,482 for the number of A having no observation

and 1,923 for observation missing A but having info about B.

I would appreciate if you can help me with this!

Super User
Posts: 24,025

## Re: Counting the number of observation

Did you try the example code.

Once you've assigned the formats and cross any variables you should get only Missing vs Not Missing.

Contributor
Posts: 22

## Re: Counting the number of observation

Thanks Reeza for sharing the file!

I read the example 2, but I am not quite familiar with SAS, I was not sure how to apply this to my case.

To be bit more specific, two variables of interest are COMPANY_FKEY and BEST_EDGAR_TICKER. In the file you gave me,

the following example is given. The library I am using is cleaning.cleanrestat.

``````proc format;
value \$ missfmt ' '="Missing"
other="Not Missing"
;
value nmissfmt . ="Missing"
other="Not Missing"
;
run;
proc freq data=home.ex2;
tables _numeric_ _character_/missing;
format _numeric_ nmissfmt.
_character_ \$missfmt.;
run;``````

Once again, I am not quite familiar with SAS yet, I was trying to figure out which one is statement and which one is name of the variable.

If you can help me with this one, that will be very kind of you.

Another thing I noticed in the example is that the output only shows the number of observation missing and not missing only for each variable. I want to know the number of observation missing A but not B. It would be also nice if you can help me add that condition.

Thank you!

Super User
Posts: 24,025

## Re: Counting the number of observation

``````proc format;
value \$ missfmt ' '="Missing"
other="Not Missing"
;
value nmissfmt . ="Missing"
other="Not Missing"
;
run;
proc freq data=home.ex2;
tables VARIABLE1*VARIABLE2/missing;
format _numeric_ nmissfmt.
_character_ \$missfmt.;
run;``````

If you want the A*B cross tab, you can use an asterisk between the variables in the TABLE statement.

You can see the example above, where I assume VARIABLE1= <your variable name> and VARIABLE2 = <your variable name>

Super User
Posts: 6,935

## Re: Counting the number of observation

Assuming you create the formats along the lines mentioned in Reeza's reference, the complete program could be:

proc format;

value \$miss ' '='Missing' other='Present';

run;

proc freq data=have;

tables a * b / list missing;

format a b \$miss.;

run;

Contributor
Posts: 22

## Re: Counting the number of observation

Thanks for the response! I have a few additional questions.

The following is the result I got from running the program.

SAS Output

COMPANY_FKEY SIC_CODE_FKEY Frequency Percent Cumulative
Frequency Cumulative
Percent Present Missing Present Present
 52 0.32 52 0.32 16412 99.68 16464 100

I ran the program on a sample dataset and I wasn't sure how to interpret this. It appears that there are 52 firm-year observations that does not have SIC_CODE_FKEY while they have COMPANY_FKEY. However, even when I changed the order of the variable, it did not show the other case which is missing COMPANY_FKEY and having SIC_CODE_FKEY.

Also, is it possible to know all 4 possible scenarios? 1) having both A & B 2) having only A 3) having only B 4) missing both A & B

and list the observation for each case?

Thank you for the help!

Posts: 4,797

## Re: Counting the number of observation

If it's only about two variables then some cross tabulation as already posted should do. If you want to list the missing/non missing combinations of n variables then you could also create a single new variable which contains a 0 for missing an a 1 for a value per source variable and then create a count of this single variable. See sample code below.

``````data have;
set sashelp.cars (keep= Make Model EngineSize Cylinders);
/* create some missing values */
if mod(ceil(ranuni(1)*20),5)=0 then call missing(Make);
if mod(ceil(ranuni(1)*20),5)=0 then call missing(Model);
if mod(ceil(ranuni(1)*20),5)=0 then call missing(EngineSize);
if mod(ceil(ranuni(1)*20),5)=0 then call missing(Cylinders);
run;

data prep(keep=Miss0_NoMiss1);
set have;
length Miss0_NoMiss1 \$ 4;
Miss0_NoMiss1=cats(not missing(Make),not missing(Model),not missing(EngineSize),not missing(Cylinders));
run;

proc freq data=prep;
table Miss0_NoMiss1;
run;
``````
Discussion stats
• 11 replies
• 286 views
• 0 likes
• 5 in conversation