I'm using an array to convert character variables to numeric. The code executes without errors or warnings, but the log contains the note "NOTE: Invalid argument to function INPUT at line 80 column 16."
What's wrong with this syntax? Thanks for your help.
79 DO i=1 to dim(_char);
80 _num(i) = input(_char(i),6.);
 81 END;
DATA conversion_subset;
	SET	dropping_strings;
 	Array _char(*) $ 
 	_100400
	_100500
 	_100600
 	_200100
 	_200200
	_200300
	_200400
	_200401
	_400402
	_400403
	_400601
	_400602
	
	_500800
	_500900
	_501000;
	
 	Array _num(*) var1-var15;
  	
  	DO i=1 to dim(_char);
     _num(i) = input(_char(i),6.); 
    END;
     
    DROP 
	_100400
	_100500
 	_100600
 	_200100
 	_200200
	_200300
	_200400
	_200401
	_400402
	_400403
	_400601
	_400602	
	_500800
	_500900
	_501000
	i; 
	
	RENAME
	VAR1	=	_100400
	VAR2	=	_100500
	VAR3	=	_100600
	VAR4	=	_200100
	VAR5	=	_200200
	VAR6	=	_200300
	VAR7	=	_200400
	VAR8	=	_200401
	VAR9	=	_400402
	VAR10	=	_400403
	VAR11	= 	_400601
	VAR12	=	_400602
	VAR13	=	_500800
	VAR14	=	_500900
	VAR15	=	_501000;
RUN;
Post some example test data (in the form of a data step). I would also check the structure of your data: why have 15 variables that end up assigned to _xxxxx variable names? Where do the _xxxxx names even come from? To make the data easier to work with, I would suggest you normalise it, i.e. have a long dataset rather than a wide one. Now this code is just a guess:
data conversion_subset (keep=variable char_result result);
  set dropping_strings;
  length variable char_result $100 result 8;
  array var{15};
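  /* $6 makes this a character array; character initial values are not allowed in a numeric array */
  array lab{15} $6 ("100400","100500","100600","200100","200200","200300","200400","200401","400402","400403","400601","400602","500800","500900","501000");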
  do i=1 to dim(var);
    variable=lab{i};
    char_result=var{i};
    result=input(var{i},best.);
    output;
  end;
run;
But what it should do is give you a dataset which looks something like:
VARIABLE CHAR_RESULT RESULT
100400 123 123
100500 abc .
...
You will see it's far easier to convert a column of data than lots of columns, and you also get the benefit of BY-group processing. If you need a transposed output later on, use proc transpose at that point. Doing the above will also show you quite clearly where a value has not been converted and what it contains: see the "abc" and the missing result. You can then put data-cleaning IF statements around the result= step.
The syntax is fine. There may be something wrong with the data. This is saying that one (or more) of the incoming character variables contains text that can't legitimately be converted to numeric.
You can get rid of the message by adding ??:
_num(i) = input(_char(i), ??6.);
That doesn't fix the problem, just covers it up.
As @Astounding says it is likely a data issue.
If I were worried about missing an intended conversion I would do a proc freq on the text variables.
For instance, if the data has values displayed with accounting conventions, like (1234) to indicate a negative value, you may not want them set to missing, as your current code would do. Other likely culprits are currency symbols or commas as part of the values.
Of course if you have incoming values like NULL or N/A or such and those are the only suspect values then you're golden.
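A rough sketch of that check, assuming the dataset and variable names from the original post (only two of the fifteen variables are listed, for brevity). The second step compares the plain w. informat with COMMAw., which as far as I know strips commas and dollar signs and reads a leading parenthesis as a minus sign; the three test strings are made up for illustration.

```sas
/* List the distinct raw values so unexpected text stands out */
proc freq data=dropping_strings;
  tables _100400 _100500 / missing nocum nopercent;
run;

/* Compare two informats on a few made-up test strings */
data _null_;
  length raw $12;
  do raw = '(1234)', '$1,234', 'N/A';
    plain   = input(raw, ?? 12.);       /* plain numeric informat */
    lenient = input(raw, ?? comma12.);  /* tolerates commas, $ and parentheses */
    put raw= plain= lenient=;
  end;
run;
```

If the frequency table shows only values like N/A or NULL among the non-numeric ones, the plain informat with ?? is probably enough; if it shows accounting-style values, an informat such as COMMAw. may be the better fit.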
Inspect your data.
Along with the NOTE, the log dumps the variable values for the offending observation, including _N_, so you know which observation(s) caused the conversion problem.
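If the log is too long to read through, something like the following sketch could collect the offending values into a dataset instead. It assumes the dataset and variable names from the original post; check_conversion, obs, bad_var and bad_value are names I made up, and only three of the fifteen variables are listed.

```sas
data check_conversion;
  set dropping_strings;
  array _char(*) $ _100400 _100500 _100600;  /* extend with the remaining variables */
  length bad_var $32 bad_value $200;
  do i = 1 to dim(_char);
    /* ?? keeps the log quiet; a non-blank value that converts to missing is suspect */
    if not missing(_char(i)) and missing(input(_char(i), ?? 6.)) then do;
      obs       = _n_;               /* observation number in the input dataset */
      bad_var   = vname(_char(i));   /* name of the variable holding the value */
      bad_value = _char(i);
      output;
    end;
  end;
  keep obs bad_var bad_value;
run;
```

Note that a value consisting of a single period would also be flagged here, even though it is a legitimate way to write a missing numeric value.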
Add this condition to the assignment in the loop:
if prxmatch("/(?i)([a-z])/",_char(i))<=0 then _num(i) = input(_char(i),6.);
This will bypass any value with alphabetic characters in the string.
Using ?? in the INPUT function, as already suggested, is much more efficient than using a RegEx. The ?? modifier will also ALWAYS work when an informat doesn't apply to an input value, whereas I believe your RegEx wouldn't catch "invalid" strings made up of digits and blanks only, i.e. something like "999 999".
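A small sketch of that difference, using three made-up test strings (the variable names s, passes_regex and converted are just for illustration):

```sas
data _null_;
  length s $10;
  do s = '123', 'abc', '999 999';
    /* 1 when the string contains no letters, i.e. the RegEx check lets it through */
    passes_regex = (prxmatch('/[a-z]/i', s) <= 0);
    /* ?? returns missing for anything the 6. informat cannot read */
    converted = input(s, ?? 6.);
    put s= passes_regex= converted=;
  end;
run;
```

The string "999 999" should pass the RegEx check because it has no letters, yet the 6. informat still cannot read it because of the embedded blank, so the RegEx guard alone would not prevent the NOTE for such values.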
For some reason ?? gives me an error
ERROR 22-322: Expecting a format name.
ERROR 200-322: The symbol is not recognized and will be ignored.
You still need to give it a format. Just add the ?? before the format specification.
input(xxx,??6.)
That was exactly what I was doing
Please post the log of the whole step that produces the error.
Check your program and SAS log more carefully. The only way to get that message is to not include a format specification. If you include an invalid format specification you get a different error message.
718   data _null_;
719     input string $20.;
720     num1=input(string,20.);
721     num2=input(string,??20.);
722     num3=input(string,??);
                            -
                            22
                            76
723     num4=input(string,);
                          -
                          22
                          76
724     num5=input(string,1234);
                          ----
                          85
                          76
ERROR 22-322: Expecting a format name.
ERROR 76-322: Syntax error, statement will be ignored.
ERROR 85-322: Expecting a format name.
725   cards;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.03 seconds