I'm using an array to convert character variables to numeric. The code executes without errors or warnings, but the log contains "NOTE: Invalid argument to function INPUT at line 80 column 16."
What's wrong w/ this syntax? Thanks for your help.
79 DO i=1 to dim(_char);
80 _num(i) = input(_char(i),6.);
81 END;
DATA conversion_subset;
SET dropping_strings;
Array _char(*) $
_100400
_100500
_100600
_200100
_200200
_200300
_200400
_200401
_400402
_400403
_400601
_400602
_500800
_500900
_501000;
Array _num(*) var1-var15;
DO i=1 to dim(_char);
_num(i) = input(_char(i),6.);
END;
DROP
_100400
_100500
_100600
_200100
_200200
_200300
_200400
_200401
_400402
_400403
_400601
_400602
_500800
_500900
_501000
i;
RENAME
VAR1 = _100400
VAR2 = _100500
VAR3 = _100600
VAR4 = _200100
VAR5 = _200200
VAR6 = _200300
VAR7 = _200400
VAR8 = _200401
VAR9 = _400402
VAR10 = _400403
VAR11 = _400601
VAR12 = _400602
VAR13 = _500800
VAR14 = _500900
VAR15 = _501000;
RUN;
Please post some example test data (in the form of a data step). I would also check the structure of your data: why have 15 variables that each get assigned to an _xxxxx variable name? Where do the _xxxxx names even come from? To make the data easier to work with, I would suggest you normalise it, i.e. have a long dataset rather than a wide one. This code is just a guess:
data conversion_subset (keep=variable char_result result);
set dropping_strings;
length variable char_result $100 result 8;
array var{15};
array lab{15} $6 ("100400","100500","100600","200100","200200","200300","200400","200401","400402","400403","400601","400602","500800","500900","501000"); /* $6 makes this a character array, needed for the quoted initial values */
do i=1 to dim(var);
variable=lab{i};
char_result=var{i};
result=input(var{i},best.);
output;
end;
run;
But what it should do is give you a dataset which looks something like:
VARIABLE  CHAR_RESULT  RESULT
100400    123          123
100500    abc          .
...
You will see it's far easier to convert a column of data than lots of columns, and you also get the benefit of BY-group processing. If you need a transposed output later on, run proc transpose at that point. Doing the above will also show you quite clearly where a value has not been converted, and what it contains: see the "abc" and missing result. You can then put data-cleaning IF statements around the result= step.
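For the "proc transpose at that point" step, here is a minimal sketch of going from long back to wide. It assumes the long dataset also carries a row identifier; row_id and the dataset names long and wide are made up for this illustration and are not in the code above.

```sas
/* Small stand-in for the long dataset; row_id is a hypothetical key */
data long;
  input row_id variable :$6. result;
  datalines;
1 100400 123
1 100500 .
2 100400 7
2 100500 45
;
run;

/* One output row per row_id; columns are named _100400, _100500, ... */
proc transpose data=long out=wide (drop=_name_) prefix=_;
  by row_id;      /* data must be sorted by row_id */
  id variable;    /* values of VARIABLE become the column names */
  var result;     /* numeric values to spread across the columns */
run;
```

The prefix=_ option is what turns an ID value such as 100400 into the valid variable name _100400.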
The syntax is fine. There may be something wrong with the data. This is saying that one (or more) of the incoming character variables contains text that can't legitimately be converted to numeric.
You can get rid of the message by adding ??:
_num(i) = input(_char(i), ??6.);
That doesn't fix the problem, just covers it up.
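As a small illustration of what ?? does and doesn't do, this sketch converts three made-up values and flags the ones that came back missing even though the text was not blank. The dataset name check and the variables txt, num and not_converted are invented for the example.

```sas
data check;
  length txt $6;
  do txt = '123', 'abc', '12 3';
    /* ?? suppresses the log NOTE and leaves _ERROR_ at 0 */
    num = input(txt, ?? 6.);
    /* 1 when there was text but no number came out of it */
    not_converted = (missing(num) and not missing(txt));
    output;
  end;
run;

proc print data=check;
run;
```

The log stays clean, but 'abc' and '12 3' still end up as missing values, which is why the flag (or some other check) is worth keeping.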
As @Astounding says it is likely a data issue.
If I were worried about missing an intended conversion I would do a proc freq on the text variables.
For instance, if the data has values displayed with accounting rules, like (1234) to indicate that the value is negative, you may not want those set to missing, which is what your current code would do. Other likely things would be currency symbols or commas as part of the values.
Of course if you have incoming values like NULL or N/A or such and those are the only suspect values then you're golden.
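If the text does turn out to contain parentheses, commas or currency symbols, one option is the COMMAw.d informat instead of the plain w. informat. This is only a sketch with made-up values; whether COMMA is right depends on what is actually in the data.

```sas
data _null_;
  length txt $12;
  do txt = '(1234)', '$1,234', '1234';
    as_plain = input(txt, ?? 12.);       /* plain numeric informat */
    as_comma = input(txt, ?? comma12.);  /* strips $ and commas, reads a leading ( as a minus sign */
    put txt= as_plain= as_comma=;
  end;
run;
```

With the plain informat the first two values come back missing, while COMMA12. should read them as -1234 and 1234.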
Inspect your data.
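One way to inspect the data, roughly along the lines of the proc freq suggestion, is to collect only the values that fail the conversion and tabulate those. In this sketch the array list is shortened to three of the 15 variables, and bad_values, varname and value are names made up for the example.

```sas
data bad_values (keep=varname value);
  set dropping_strings;
  array _char(*) $ _100400 _100500 _100600;  /* list all 15 in practice */
  length varname $32 value $100;
  do i = 1 to dim(_char);
    /* non-blank text that the 6. informat cannot read */
    if not missing(_char(i)) and missing(input(_char(i), ?? 6.)) then do;
      varname = vname(_char(i));
      value   = _char(i);
      output;
    end;
  end;
run;

/* Which unreadable values occur, and how often, per variable */
proc freq data=bad_values;
  tables varname*value / list missing nocum nopercent;
run;
```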
The NOTE will also supply the number (_N_) of the current iteration when the transformation error happened, so you know which observation(s) was(were) the culprit.
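If you do suppress the NOTE with ??, you lose that _N_ information, so you may want to write it to the log yourself. A rough sketch, again with a shortened variable list; badvar and badval are helper variables invented here.

```sas
data conversion_subset;
  set dropping_strings;
  array _char(*) $ _100400 _100500 _100600;  /* shortened list */
  array _num(*) var1-var3;
  length badvar $32 badval $100;
  do i = 1 to dim(_char);
    _num(i) = input(_char(i), ?? 6.);
    if missing(_num(i)) and not missing(_char(i)) then do;
      badvar = vname(_char(i));
      badval = _char(i);
      /* one log line per failed value: observation, variable, raw text */
      putlog 'Not converted: ' _n_= badvar= badval=;
    end;
  end;
  drop i badvar badval;
run;
```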
add to the loop
if prxmatch("/(?i)([a-z])/",_char(i))<=0
This will bypass any values with alphabetic characters in the string.
@timeless wrote:
add to the loop
if prxmatch("/(?i)([a-z])/",_char(i))<=0
This will bypass any values with alphabetic characters in the string.
Using ?? in the INPUT function, as already suggested, is much more efficient than using a RegEx. The ?? modifier also ALWAYS works when an informat doesn't apply to an input value, whereas I believe your RegEx wouldn't catch "invalid" strings made up of digits and blanks only, e.g. something like "999 999".
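To make the difference concrete, here is a small sketch with made-up values. The pattern is written as /[a-z]/i, which should behave the same as the (?i) form in the suggestion; txt and has_alpha are names invented for the example.

```sas
data _null_;
  length txt $7;
  do txt = '999 999', '12abc', '123';
    has_alpha = (prxmatch('/[a-z]/i', txt) > 0);  /* 1 if any letter is present */
    num       = input(txt, ?? 7.);                /* missing if the informat cannot read it */
    put txt= has_alpha= num=;
  end;
run;
```

For '999 999' the regex check finds no letters (has_alpha=0), yet the conversion still fails because of the embedded blank, so num is missing.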
For some reason ?? gives me an error
ERROR 22-322: Expecting a format name.
ERROR 200-322: The symbol is not recognized and will be ignored.
@timeless wrote:
For some reason ?? gives me an error
ERROR 22-322: Expecting a format name.
ERROR 200-322: The symbol is not recognized and will be ignored.
You still need to give it a format. Just add the ?? before the format specification.
input(xxx,??6.)
That was exactly what I was doing!
Please post the log of the whole step that produces the error.
@timeless wrote:
That was exactly what I was doing
Check your program and SAS log more carefully. The only way to get that message is to not include a format specification. If you include an invalid format specification you get a different error message.
718 data _null_;
719 input string $20.;
720 num1=input(string,20.);
721 num2=input(string,??20.);
722 num3=input(string,??);
-
22
76
723 num4=input(string,);
-
22
76
724 num5=input(string,1234);
----
85
76
ERROR 22-322: Expecting a format name.
ERROR 76-322: Syntax error, statement will be ignored.
ERROR 85-322: Expecting a format name.
725 cards;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds