Hello,
I would like to convert the following character values into numeric.
My values are in character;
Interst
Freeway
Pri Art
Min Art
Maj Col
Min Col
Local
First I want to convert to numeric then convert to 1,2,3 in the following order.
'Interst' = 1
'Freeway' = 2
'Pri Art' = 3
'Min Art' = 4
'Maj Col' = 5
'Min Col' = 6
'Local' = 7
I initially created a format and i ran an input statement in the data step but I didn't get the expected result. My results were blank for funtional_num. Please help!!
proc format;
value $functional 'Interst' = 1
'Freeway' = 2
'Pri Art' = 3
'Min Art' = 4
'Maj Col' = 5
'Min Col' = 6
'Local' = 7
;
run;
data A;
length functional $10.;
input functional$ 1-10;
datalines;
Interst
Freeway
Pri Art
Min Art
Maj Col
Min Col
Local
; run;
data B;
set A;
functional_num = input(functional, 8.);
format functional_num functional.;
run;
Result obtained:
Obs functional functional_num
1 Interst .
2 Freeway .
3 Pri Art .
4 Min Art .
5 Maj Col .
6 Min Col .
7 Local .
Its not clear what you are trying to do here. Normally formats are used to make the data "more readable", converting numbers (such as 1, 2, 3, ...) into something more meaningful to the audience or SAS programmer, such as "Freeway", "Pri Art", ...
You may have a valid reason why you would want to use the reverse of what I just described and disguise "Freeway" as the number 2, but I am skeptical this is a good thing to do. (There are cases where you are working with somewhat sensitive data, such as ethnicity, and you might want to disguise "Hispanic" as the number 2, but this doesn't seem to apply here).
Nevertheless, this works:
data B;
set A;
functional_num = functional;
format functional_num $functional.;
run;
Note: Functional_num is still a character variable
If FUNCTIONAL_NUM really has to be numeric, then you need an INFORMAT to be used in the INPUT function, not a format.
proc format;
invalue functional 'Interst' = 1
'Freeway' = 2
'Pri Art' = 3
'Min Art' = 4
'Maj Col' = 5
'Min Col' = 6
'Local' = 7
;
run;
data A;
length functional $10.;
input functional$ 1-10;
datalines;
Interst
Freeway
Pri Art
Min Art
Maj Col
Min Col
Local
;
data B;
set A;
functional_num = input(functional,functional.);
run;
Its not clear what you are trying to do here. Normally formats are used to make the data "more readable", converting numbers (such as 1, 2, 3, ...) into something more meaningful to the audience or SAS programmer, such as "Freeway", "Pri Art", ...
You may have a valid reason why you would want to use the reverse of what I just described and disguise "Freeway" as the number 2, but I am skeptical this is a good thing to do. (There are cases where you are working with somewhat sensitive data, such as ethnicity, and you might want to disguise "Hispanic" as the number 2, but this doesn't seem to apply here).
Nevertheless, this works:
data B;
set A;
functional_num = functional;
format functional_num $functional.;
run;
Note: Functional_num is still a character variable
If FUNCTIONAL_NUM really has to be numeric, then you need an INFORMAT to be used in the INPUT function, not a format.
proc format;
invalue functional 'Interst' = 1
'Freeway' = 2
'Pri Art' = 3
'Min Art' = 4
'Maj Col' = 5
'Min Col' = 6
'Local' = 7
;
run;
data A;
length functional $10.;
input functional$ 1-10;
datalines;
Interst
Freeway
Pri Art
Min Art
Maj Col
Min Col
Local
;
data B;
set A;
functional_num = input(functional,functional.);
run;
@PaigeMiller You are right about keeping the initial meaning names e.g local, minor collector etc. It is more meaningful. So I want to have the functional road class in this order:
Interstate
Freeway
Pri Art
Min Art
Maj Col
Min Col
Local.
So when I run a proc freq, I want it to be in this order. How do I do that ? Thanks
PROC FREQ has an ORDER= option
If one of those doesn't work for you, then I would create the numeric value using the INFORMAT as I showed earlier, sort by this numeric value, and then run PROC FREQ on your character variable named FUNCTIONAL (not the numeric variable named FUNCTIONAL_NUM), using the ORDER=DATA option. The sorting causes Interstate to appear first, Freeway will appear second, and so on.
Side comment: ask the real question in your first post, in this case the real question is how to get PROC FREQ to put categories in a specific order; don't ask an indirect question about how to assign numbers to character string categories. Your original question is the XY Problem, and should be avoided.
@PaigeMiller Thanks for the advice. Well my main goal hasn't been achieved yet. My goal is to change the order of functional in my dataset so that when I run a logistic model or an emperical logit plot I will get the result in the order I want. So I initially thought of changing the variable to numeric and then running the model on the numeric version. However, I wanted the variables in their original names just as you suggested.
The freq is just for me to check if the order changed. The method you proposed in your latest post did change the order in the numeric but it didn't change it in the logistic model. Also, I have other variables in my model so functional is not the only one. Looking forward to your response. Thank you
In summary this is what I have in my dataset, in this order but this is what I want in my dataset.
What I have in the dataset:
FUNCTIONAL:
Freeway
Interst
Local
Maj Col
Min Art
Min Col
Pri Art
What I want in my dataset
FUNCTIONAL
Interst
Freeway
Pri Art
Min Art
Maj Col
Min Col
Local
So that is a different question.
How does the order the data sorts impact the result of the logistic regression? The only thing I can think of is that it could change which value it considers as the comparison group. Can't you just specify which value to use?
What procedure are you using to run logistic regression?
When you run a logistic regression, you can specify which level of a variable is the control. In addition, PROC LOGISTIC has the same ORDER= option that PROC FREQ has.
Is there a way to achieve this by not using the input function?
@Schalk_Burger wrote:
Is there a way to achieve this by not using the input function?
You should start your own thread an mention this one as related.
Likely a number of choices available:
In a data step use a bunch of If/then/else statements to assign values conditionally. Or Select/when.
Or use a custom informat when reading the data in the first place.
Actual example of values you have and the desired numeric values, and perhaps a description of why you need the numeric values, may get a more concrete answer.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.