Type of variable identification

02-22-2013 06:46 AM

Hi all,

I was wondering whether Base SAS has a procedure that identifies the type of each variable in a dataset. I don't mean whether it is numeric or character; I am looking for a deeper approach, such as whether a variable is nominal, ordinal, or continuous. Any ideas?

Thanks in advance

Aristos

Posted in reply to chemicalab

02-22-2013 07:53 AM

Hi,

As far as I know, there is no PROC in SAS for this. What you are asking about is the measurement scale (nominal, ordinal, etc.), which is not the same thing as the variable type. By default, SAS treats a variable as either character or numeric.

Thanks,

Urvish

Posted in reply to UrvishShah

02-22-2013 07:56 AM

OK, yes, you are right. So there is nothing for that? I ask because I think Enterprise Miner (EM) has something, so there could be something in Base as well.

Posted in reply to chemicalab

02-22-2013 08:59 AM

Enterprise Miner takes a guess by looking at the data:

"By default, it takes a random sample of 2,000 observations from the data set of interest, and uses this information to assign a model role and a measurement level to each variable."

That approach relies on having lots of data and is not generalizable to parts of SAS that also must work with small samples. It can also be wrong in EM, particularly in determining ordinal scaling, so you still need to know your data.
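To make the sampling idea concrete, here is a rough sketch in Python (used only for illustration, not SAS). It is not Enterprise Miner's actual algorithm: the function name `sample_level_counts`, the fixed seed, and the example data are all assumptions. It only draws a random sample of up to 2,000 observations and counts the distinct non-missing values of each variable.

```python
import random

def sample_level_counts(rows, sample_size=2000, seed=0):
    """Count distinct non-missing values per variable in a random sample of rows."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    if len(rows) > sample_size:
        sample = rng.sample(rows, sample_size)
    else:
        sample = list(rows)  # small data set: use every observation
    levels = {}
    for row in sample:
        for name, value in row.items():
            if value is not None:  # None stands in for a missing value
                levels.setdefault(name, set()).add(value)
    return {name: len(values) for name, values in levels.items()}

# Hypothetical data: 5,000 observations with an ID, a two-level code, and a measurement.
rows = [{"id": i, "sex": "MF"[i % 2], "weight": 50 + (i % 400) / 10} for i in range(5000)]
counts = sample_level_counts(rows)
print(counts)
```

Counts like these say nothing about whether the levels have a meaningful order, which is one reason a guess about ordinal scaling can be wrong.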

Doc Muhlbaier

Duke

Posted in reply to chemicalab

02-22-2013 09:02 AM

If you have Base SAS, then PROC UNIVARIATE is your best bet, and I suppose it can fit most of your needs. If you have deeper pockets and also have SAS/QC, SAS/ETS, or SAS/INSIGHT, then you can also look into PROC CAPABILITY, PROC SEVERITY, and PROC RELIABILITY.

This is not like the character or numeric type, which you can obtain from the metadata; you will have to analyze the variables you are interested in.

Haikuo

Posted in reply to chemicalab

02-22-2013 09:06 AM

I, too, have never seen any mechanism for identifying measurement scale outside of EM's metadata screen.

All SAS procs and functions that I'm aware of simply distinguish between character and numeric. And even in EM, the automatic assignments are based only on the number of values, not on what the data actually represent.

Posted in reply to chemicalab

02-22-2013 09:11 AM

Thank you all for the clarifications

Posted in reply to chemicalab

02-22-2013 10:28 AM

There is no automatic way. It takes defining your criteria and checking them. Even then, the best you can do (usually) is categorical vs. continuous (integer) vs. continuous (noninteger). Sometimes you might want to distinguish binary from the other possibilities as well, but you may have to check a larger sample to be confident of a variable being binary.

If a variable takes on 20 different values, must it be continuous (not categorical)? What is the limit?

If a variable takes on noninteger values, must it be continuous?
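One possible way to write such criteria down, sketched in Python for illustration rather than SAS. The function name `guess_scale` and the cutoff of 20 levels are assumptions, not a recommendation; the cutoff is a parameter precisely because there is no obviously right limit. In this sketch the number of distinct values is checked first, so a noninteger variable with only a few levels still comes out categorical.

```python
def guess_scale(values, max_levels=20):
    """Guess a coarse measurement scale from the observed values of one variable."""
    observed = [v for v in values if v is not None]  # None stands in for missing
    distinct = set(observed)
    if len(distinct) == 2:
        return "binary"
    # Character values, or few distinct values, are treated as categorical.
    if any(isinstance(v, str) for v in distinct) or len(distinct) <= max_levels:
        return "categorical"
    # Many distinct numeric values: split on whether all of them are whole numbers.
    if all(float(v).is_integer() for v in distinct):
        return "continuous (integer)"
    return "continuous (noninteger)"

print(guess_scale([0, 1, 1, 0, 1]))              # binary
print(guess_scale([1.5, 2.5, 3.5, 1.5]))         # categorical: only 3 levels
print(guess_scale(list(range(100))))             # continuous (integer)
print(guess_scale([i / 7 for i in range(100)]))  # continuous (noninteger)
```

A variable that looks binary in a small sample may show a third value in a larger one, so the result is only as good as the observations passed in.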

Any rules you come up with will always have exceptions. For example, you may get a set of integers that represent percentiles, and have 100 possible values. It would be good practice to keep lists of variables that you know about: variables that are always categorical (no matter how many values they take on), and variables that are always continuous (no matter how few values they take on).
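A minimal sketch of the "keep lists" idea, again in Python for illustration only. The variable names `region_code` and `dose_mg`, the function name, and the 20-level fallback rule are all hypothetical. The lists of known variables are consulted first, and the count of distinct values is used only for variables that appear on neither list.

```python
# Hypothetical variable names; replace with the variables you actually know about.
ALWAYS_CATEGORICAL = {"region_code"}  # categorical no matter how many values it takes
ALWAYS_CONTINUOUS = {"dose_mg"}       # continuous no matter how few values it takes

def scale_with_overrides(name, values, max_levels=20):
    """Use the known-variable lists first, then fall back to a distinct-value count."""
    if name in ALWAYS_CATEGORICAL:
        return "categorical"
    if name in ALWAYS_CONTINUOUS:
        return "continuous"
    distinct = {v for v in values if v is not None}  # None stands in for missing
    return "categorical" if len(distinct) <= max_levels else "continuous"

print(scale_with_overrides("region_code", range(500)))    # categorical (from the list)
print(scale_with_overrides("dose_mg", [5, 10, 20]))       # continuous (from the list)
print(scale_with_overrides("percentile", range(1, 101)))  # continuous (from the count)
```

The last call shows the fallback at work: the count alone decides for a variable like the percentile example, and whether that answer is right depends on what the data represent.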

The checking is usually done on a sample of observations, but I have typically used thousands rather than hundreds of observations.

Good luck.