Calcite | Level 5

## Converting Character variables to Numeric

Hello all,

I am having some trouble converting my character variables to numerics. I have about 13 variables to convert so I was trying to use an array.

I currently have this array set up:

Array X[13] \$ zipcode survdate q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 COVIDscore;
Array Y[13] Newzipcode Newsurvdate Newq1 Newq2 Newq3 Newq4 Newq5 Newq6 Newq7 Newq8 Newq9 Newq10 NewCOVIDscore;
Do I=1 to 13;
Y(I) = input(X(I),best32.);
End;
Drop I

But I keep getting warning statements in my log:

WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.

NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to
missing values.
Each place is given by: (Number of times) at (Line):(Column).
4052 at 686:18

686 Y(I) = input(X(I),best32.);

1 ACCEPTED SOLUTION

Accepted Solutions
Super User

## Re: Converting Character variables to Numeric

Let's look at some details of your existing data:

```zipcode=92365-4856 q1=1 q2=1 q3=1 q4=1 q5=1 q6=1 q7=0 q8=0 q9=0 q10=0 COVIDscore=86
survdate=11/02/2020 ```

Zipcode = 92365-4856   is not a number. So attempting to convert it using any of the numeric informats fails. The - either has to precede digits to indicate negative values, -12345 OR in scientific notation to precede the exponent to indicate decimal places  1.5E-5 for example.

f you want Zipcode to be numeric you need to make a decision of what you expect for a numeric value and write specific code for that. Possibly : newzip = input(scan(zipcode,1,'-'),f5.) would be one way to create a five-digit numeric zip.

Similarly  survdate is not a simple number because of the / characters and can't be dealt with the same numeric informat. That would take a date informat such as mmddyy10. (and should have a date format associated so people can understand what is stored).

I

8 REPLIES 8
Super User

## Re: Converting Character variables to Numeric

You need to look at that messages before that in the log.  Basically it is saying the string does not contain a number.

```28   data test;
29     array X \$32 string;
30     array Y number;
31     input string ;
32     y[1]=input(x[1],32.);
33   cards;

NOTE: Invalid argument to function INPUT at line 32 column 8.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
35         abc
_I_=. string=abc number=. _ERROR_=1 _N_=2
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 32:8
NOTE: The data set WORK.TEST has 2 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time           0.01 seconds
cpu time            0.01 seconds

36   ;```

If you don't care about the invalid strings and just want to create missing values without SAS writing notes into the SAS log you can use the ? or ?? modifier in front of the informat in the

```37   data test;
38     array X \$32 string;
39     array Y number;
40     input string ;
41     y[1]=input(x[1],?32.);
42   cards;

RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
44         abc
_I_=. string=abc number=. _ERROR_=1 _N_=2
NOTE: The data set WORK.TEST has 2 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time           0.00 seconds
cpu time            0.00 seconds

45   ;
46   data test;
47     array X \$32 string;
48     array Y number;
49     input string ;
50     y[1]=input(x[1],??32.);
51   cards;

NOTE: The data set WORK.TEST has 2 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time           0.00 seconds
cpu time            0.00 seconds

54   ;```

Note there is not a separate informat named BEST. If you ask for it you just get the normal numeric informat.

Calcite | Level 5

## Re: Converting Character variables to Numeric

The log messages above what I initially posted are:

id=3276 city=Otside La County age=50-59 gender= zipcode=92365-4856 q1=1 q2=1 q3=1 q4=1 q5=1 q6=1 q7=0 q8=0 q9=0 q10=0 COVIDscore=86
survdate=11/02/2020 Newgender= Newcity=Outside of Los Angeles Newq1=. Newq2=. Newq3=. Newq4=. Newq5=. Newq6=. Newq7=. Newq8=.
Newq9=. Newq10=. NewCOVIDscore=. Newsurvdate=. I=14 var1=. var2=. var3=1 var4=1 var5=1 var6=1 var7=1 var8=1 var9=0 var10=0 var11=0
var12=0 var13=86 _ERROR_=1 _N_=19
NOTE: Invalid argument to function INPUT at line 686 column 18.
NOTE: Invalid argument to function INPUT at line 686 column 18.

It does that a bunch of times. Do you think its because the format of my zipcode and survdate? (zipcode in the dataset is 9 digits but I was instructed to change it to 5, and survdate in the dataset is MMDDYY10, but I was instructed to change it to weekday.

Super User

## Re: Converting Character variables to Numeric

Let's look at some details of your existing data:

```zipcode=92365-4856 q1=1 q2=1 q3=1 q4=1 q5=1 q6=1 q7=0 q8=0 q9=0 q10=0 COVIDscore=86
survdate=11/02/2020 ```

Zipcode = 92365-4856   is not a number. So attempting to convert it using any of the numeric informats fails. The - either has to precede digits to indicate negative values, -12345 OR in scientific notation to precede the exponent to indicate decimal places  1.5E-5 for example.

f you want Zipcode to be numeric you need to make a decision of what you expect for a numeric value and write specific code for that. Possibly : newzip = input(scan(zipcode,1,'-'),f5.) would be one way to create a five-digit numeric zip.

Similarly  survdate is not a simple number because of the / characters and can't be dealt with the same numeric informat. That would take a date informat such as mmddyy10. (and should have a date format associated so people can understand what is stored).

I

Calcite | Level 5

## Re: Converting Character variables to Numeric

Ah I see, so for the variables zipcode and survdate I would have to use the input function to convert it to a new variable. I cannot use the array for those 2? But for q1-q10 I can use the array?

Super User

## Re: Converting Character variables to Numeric

@shortyofhb wrote:

Ah I see, so for the variables zipcode and survdate I would have to use the input function to convert it to a new variable. I cannot use the array for those 2? But for q1-q10 I can use the array?

Correct.

SAS is very nice when you have sequentially numbered variables to place in an array. You can use shorthand like: array <name> q1-q10. And for your new variables if you name an array like:

Array newq(10) that will create 10 numeric variables named newq1, newq2, newq3 etc. It is a good idea to understand the individual naming though to get used to thinking of order of definition is important.

Calcite | Level 5

## Re: Converting Character variables to Numeric

Thank you so much for the help!!

So I converted created a new variable for zipcode and survdate with the input function. That clarifies it up alot!

One more quick question (I hope). How would I go about using an array to recode a variable to saying something else? For example, in my Proc Format I have 1="Yes" and 0="No" my instructions are however are to "use an array to recode 0="No" to 2="No."

Would I create an array and have it so when 0 is shown it gets replaced with 2? And then use a format to change 2 to "No"?

## Re: Converting Character variables to Numeric

Maybe this is homework, if it is ignore the following.

There is just one way to solve problems like yours: fix the data import!

Calcite | Level 5

## Re: Converting Character variables to Numeric

It is a homework assignment unfortunately haha, the dataset is meant to be a mess!

Discussion stats
• 8 replies
• 1207 views
• 0 likes
• 4 in conversation