chemicalab
Fluorite | Level 6

Hi all,

I was wondering whether there is a procedure in Base SAS that identifies the type of each variable in a dataset. I don't mean whether it is numeric or character; I am looking for something deeper, such as whether it is nominal, ordinal or continuous. Any ideas?

Thanks in advance

Aristos

7 REPLIES
UrvishShah
Fluorite | Level 6

Hi,

I am not aware of any PROC in SAS for this. But what you are asking about is the measurement scale (nominal, ordinal, etc.), which is not the variable type. By default, SAS treats every variable as either character or numeric.

Thanks,

Urvish

chemicalab
Fluorite | Level 6

OK, yes, you are right. So there is nothing for that? I think there is something in EM (Enterprise Miner), so I thought there could be something in Base as well.

Doc_Duke
Rhodochrosite | Level 12

Enterprise Miner takes a guess by looking at the data:

"By default, it takes a random sample of 2,000 observations from the data set of interest, and uses this information to assign a model role and a measurement level to each variable."

That approach relies on having lots of data and is not generalizable to parts of SAS that also must work with small samples.  It can also be wrong in EM, particularly in determining ordinal scaling, so you still need to know your data.

Doc Muhlbaier

Duke

Haikuo
Onyx | Level 15

If you have only Base SAS, then PROC UNIVARIATE is your best bet, and I suppose it can cover most of your needs. If you have deeper pockets and also license SAS/QC, SAS/ETS or SAS/INSIGHT, you can also look into PROC CAPABILITY, PROC SEVERITY and PROC RELIABILITY.

Unlike the character or numeric type, this is not information you can obtain from the metadata; you will have to analyze the variables you are interested in.

Haikuo

art297
Opal | Level 21

I, too, have never seen any mechanism for identifying measurement scale outside of EM's metadata screen.

All SAS procs and functions that I'm aware of simply distinguish between character and numeric. And even with EM, the automatic assignments are based only on the number of values, not on what the data actually represent.

chemicalab
Fluorite | Level 6

Thank you all for the clarifications

Astounding
PROC Star

There is no automatic way.  It takes defining your criteria and checking them.  Even then, the best you can do (usually) is categorical vs. continuous (integer) vs. continuous (noninteger).  Sometimes you might want to distinguish binary from the other possibilities as well, but you may have to check a larger sample to be confident of a variable being binary.

If a variable takes on 20 different values, must it be continuous (not categorical)?  What is the limit?

If a variable takes on noninteger values, must it be continuous?

Any rules you come up with will always have exceptions.  For example, you may get a set of integers that represent percentiles, and have 100 possible values.  It would be good practice to keep lists of variables that you know about:  variables that are always categorical (no matter how many values they take on), and variables that are always continuous (no matter how few values they take on).

The checking is usually done on a sample of observations, but I have typically used thousands rather than hundreds of observations.
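
As a rough illustration of the "define your criteria, check them, and keep lists of known variables" approach described above, here is a minimal sketch. It is written in Python rather than SAS purely so that it is self-contained, and it is not what Enterprise Miner does. The function name, the 20-level cutoff, the 2,000-observation sample size and the two override lists are all assumptions chosen for illustration; substitute your own criteria.

```python
import random


def guess_measurement_level(name, values, max_levels=20, sample_size=2000,
                            always_categorical=(), always_continuous=(),
                            seed=0):
    """Guess a rough measurement level for one variable (hypothetical helper).

    All thresholds are illustrative assumptions, not SAS defaults.
    """
    # Variables you already know about override any rule.
    if name in always_categorical:
        return "categorical"
    if name in always_continuous:
        return "continuous"

    # Ignore missing values (represented here as None).
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"

    # Check a random sample when the variable has many observations.
    if len(present) > sample_size:
        present = random.Random(seed).sample(present, sample_size)

    levels = set(present)
    if len(levels) == 2:
        return "binary"
    # Character values, or only a few distinct values: treat as categorical.
    if any(isinstance(v, str) for v in levels) or len(levels) <= max_levels:
        return "categorical"
    if all(float(v).is_integer() for v in levels):
        return "continuous (integer)"
    return "continuous (noninteger)"


# Made-up example data, one list of values per variable.
data = {
    "sex": ["F", "M", "M", "F"],
    "percentile": list(range(1, 101)),
    "weight": [x / 10 for x in range(500, 600)],
}
for var, vals in data.items():
    print(var, "->", guess_measurement_level(var, vals))
```

Note that a rule like this can only separate categorical from continuous. It cannot tell nominal from ordinal, and a variable such as the percentile example is classified purely by its count of distinct values unless you add it to one of the override lists.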

Good luck.
