Contributor
Posts: 47

# Use a categorical variable to split a numeric variable into intervals

I would like to perform an ANOVA analysis with a blocking variable.

My dependent variable would be Intensity (interval)

My independent variable is Name (A1 through A16)

And my blocking variable would be mass (interval). In order to use mass as a blocking factor, I would have to transform it into a categorical variable. Basically, I would like to do something similar to an Interactive Grouping (EM) for Mass in Base SAS depending on the values for Name.

I need to know which mass intervals are associated with which names.

Any ideas?

Thanks

Posts: 1,848

## Re: Use a categorical variable to split a numeric variable into intervals

You can create the categorical variable as in the following demo:

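    proc format lib=work;
      /* The format name must match the one used in the DATA step (catgx.) */
      /* "<-" excludes the lower endpoint, so each boundary value falls in exactly one range */
      value catgx
        low  -  -10  = "LE -10"
        -10 <-    0  = "-10 - 0"
          0 <-  100  = "0 - 100"
        100 <-  high = "GT 100"
      ;
    run;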

You can define as many ranges as you need, with whatever labels (the right-hand side) you prefer.

The LOW - / - HIGH rows are optional.

After creating the format, create your categories in a data step, as in:

    data want;
      set have;
      /* PUT applies the numeric format and returns the label as a character value */
      category = put(num_var, catgx.);
      /* ... other code as needed ... */
    run;
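A quick way to check how boundary values are classified is to run the format over a few hand-picked numbers. This is only a sketch: the dataset `have`, its values, and the variable `num_var` are made up for illustration, and the format is assumed to be named `catgx`.

```sas
/* Numeric format with non-overlapping ranges */
proc format lib=work;
  value catgx
    low  -  -10  = "LE -10"
    -10 <-    0  = "-10 - 0"
      0 <-  100  = "0 - 100"
    100 <-  high = "GT 100"
  ;
run;

/* Hypothetical test data, including the boundary values -10, 0 and 100 */
data have;
  input num_var @@;
  datalines;
-25 -10 -3 0 42 100 250
;
run;

/* Derive the categorical variable from the numeric one */
data want;
  set have;
  category = put(num_var, catgx.);
run;

/* Cross-tabulate raw values against categories to inspect the mapping */
proc freq data=want;
  tables num_var * category / list;
run;
```

The PROC FREQ listing shows one row per value with the category it received, which makes it easy to confirm that each cut point lands where you intended.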

Contributor
Posts: 47

## Re: Use a categorical variable to split a numeric variable into intervals

Hello, I appreciate the answer, but this isn't what I'm looking for. I do want to split Mass into ranges, but how do I determine those ranges? I want to split Mass so that each mass range is associated with one Name level. For example, say Mass ranges from 1 to 1600. I want to be able to say that mass1 (1 - 100) is associated with Name=A1, mass2 (201-300) is associated with Name=A2, and so on.

That makes me think I could maybe pull this off with a decision tree, where Name is the target and Mass is the feature...

Posts: 1,848

## Re: Use a categorical variable to split a numeric variable into intervals

I hope one of the following two options fits what you want:

Either assigning a name to a range:

    proc format lib=work;
      /* Numeric format: maps each range of the numeric variable to a name */
      value catgx
        low  -  -10  = "A1"
        -10 <-    0  = "A2"
          0 <-  100  = "A3"
        100 <-  high = "A4"
      ;
    run;

Or assigning a range to a name you already have:

    proc format lib=work;
      /* Character format: the name must start with $ because the values being formatted are strings */
      value $catgx
        "A1" = "LT -10"
        "A2" = "-10 - 0"
        "A3" = "0 - 100"
        "A4" = "GT 100"
      ;
    run;

In this last case the format is a character format, so in the data step apply it to the character variable (here Name) rather than to the numeric one:

    category = put(name, $catgx.);
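Both formats above need the cut points to be known in advance. To see which mass values actually occur for each Name before choosing any cut points, one option is to summarize Mass by Name. A minimal sketch, assuming a dataset `have` with the variables `name` and `mass` from the question; the output dataset `mass_by_name` and the variables `mass_min` and `mass_max` are hypothetical names:

```sas
/* Smallest and largest mass observed within each level of name */
proc means data=have noprint nway;
  class name;
  var mass;
  output out=mass_by_name (drop=_type_ _freq_)
         min=mass_min
         max=mass_max;
run;

/* One row per name, ordered by the start of its observed mass range */
proc sort data=mass_by_name;
  by mass_min;
run;

proc print data=mass_by_name noobs;
run;
```

If the observed ranges do not overlap, they can be typed directly into a `value` statement like the ones above. If they do overlap, no single format on Mass can assign every interval to exactly one Name, and the cut points have to be chosen some other way.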
