
What exactly is nlevels?

Frequent Contributor
Posts: 130

What exactly is nlevels?

Could someone provide a little more color? I know that it is an option in PROC FREQ, but I am not really clear on what exactly it is or why it is useful. Thanks.

Super User
Posts: 23,771

Re: What exactly is nlevels?

Posted in reply to ManitobaMoose

Documentation is pretty thorough here IMO:

 

NLEVELS

displays the "Number of Variable Levels" table, which provides the number of levels for each variable named in the TABLES statements. For more information, see the section Number of Variable Levels Table. PROC FREQ determines the variable levels from the formatted variable values, as described in the section Grouping with Formats.

 

Source

 

Number of Variable Levels Table

If you specify the NLEVELS option in the PROC FREQ statement, PROC FREQ displays the "Number of Variable Levels" table. This table provides the number of levels for all variables named in the TABLES statements. PROC FREQ determines the variable levels from the formatted variable values. For more information, see the section Grouping with Formats. The "Number of Variable Levels" table contains the following information:

  • Variable name

  • Levels, which is the total number of levels of the variable

  • Number of Nonmissing Levels, if there are missing levels for any of the variables

  • Number of Missing Levels, if there are missing levels for any of the variables

Source

 

 

Translation:

It gives you the number of unique levels of a variable. For example, in SASHELP.CLASS the variable Sex has 2 unique values, while Age has 6.

 

 

Super User
Posts: 13,583

Re: What exactly is nlevels?

Posted in reply to ManitobaMoose

One use is to compare the number of distinct values you expect for a variable against the number actually present. See the following example:

/* Show only the "Number of Variable Levels" table and save it as a data set */
ods select nlevels;
proc freq data=sashelp.class nlevels;
   ods output nlevels=work.classlevels;
run;

 

which produces the following in the results window:

 

Number of Variable Levels

Variable    Levels
Name        19
Sex         2
Age         6
Height      17
Weight      15

 

So if I knew I had 19 records and expected 19 unique names, I don't have to examine the actual values (useful for spotting duplicates when you have thousands of records). If I had expected two levels of Sex and saw something like 8, I would suspect something was wrong.

This rough information could also be used to select or reduce variables for examination with the full Proc Freq output (run without the ODS Select just preceding).

I often run PROC FREQ on data sets as an initial data check. But if I have thousands of records and some variables that seldom or never repeat, such as name, phone number, or account number, I can spot them in the NLEVELS table and then reduce the PROC FREQ output by running something like

Proc freq data=have(drop=name phone account);

run;

to get tables of the values other than those variables.
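The drop list can also be built from the NLEVELS output instead of being typed by hand. This is only a rough sketch: it uses SASHELP.CLASS so it can run as-is, the data set name work.classlevels2 and the macro variables nobs and droplist are made up for illustration, and the rule "drop anything with as many levels as there are rows" is an arbitrary choice. It also assumes at least one variable meets that rule; otherwise droplist is never created.

```sas
/* Capture the NLEVELS table without printing the one-way tables */
proc freq data=sashelp.class nlevels;
   tables _all_ / noprint;
   ods output nlevels=work.classlevels2;
run;

proc sql noprint;
   /* Number of rows in the input data set */
   select count(*) into :nobs trimmed from sashelp.class;
   /* Variables whose number of levels equals the number of rows */
   select TableVar into :droplist separated by ' '
      from work.classlevels2
      where NLevels >= &nobs;
quit;

%put NOTE: dropping &droplist;

/* Frequency tables for everything except the (nearly) unique variables */
proc freq data=sashelp.class(drop=&droplist);
run;
```

For real data a looser cutoff, such as a fraction of the row count, is probably more practical than requiring every value to be unique.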

 

The output data set created with the ODS OUTPUT statement also has additional information about missing levels. If you don't expect any missing values for a variable and NMissLevels is greater than 0, you have at least one missing value. If you use the special missing values .A to .Z in addition to ., NMissLevels indicates how many distinct missing values have been used. If you didn't think you were using special missing values and see a value greater than 1, that tells you the assumption was incorrect.
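A small made-up example of the missing-level check, as a sketch only. The data set work.demo, its variables, and the output name work.demolevels are all hypothetical; the MISSING statement is there so that the letter A in the input is read as the special missing value .A.

```sas
/* Toy data: score contains one ordinary missing (.) and one special missing (.A) */
data work.demo;
   missing A;
   input id score;
   datalines;
1 10
2 .
3 A
4 10
;
run;

/* Save the NLEVELS table; NOPRINT suppresses the frequency tables themselves */
proc freq data=work.demo nlevels;
   tables _all_ / noprint;
   ods output nlevels=work.demolevels;
run;

/* Inspect NLevels, NMissLevels and NNonMissLevels for each variable */
proc print data=work.demolevels noobs;
run;
```

With this input, score should show more than one missing level, which is the signal described above that special missing values are in play.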

Super User
Posts: 10,784

Re: What exactly is nlevels?

Posted in reply to ManitobaMoose

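If you are familiar with SQL, it is essentially the same thing as the query below (COUNT(DISTINCT ...) ignores missing values, so it matches the number of nonmissing levels):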

 

proc sql;
select count(distinct sex) as nlevels
 from sashelp.class;
quit;