I am using a SAS array to define 40 consecutive 1-byte fields starting at position 11. These are character fields, hence the '--'. But I am getting this error when I try to run the program. What am I doing wrong? Thank you!
This isn't really a SAS Array question. I don't see a DO loop in your program, where you would be treating the 40 variables like an array, nor do I see an ARRAY statement which would set up the array referencing structure.
A SAS array is NOT a permanent data construct. It is a way to group variables together and reference them like they were in an array. If you create vars Ind1-Ind40, SAS stores those variables as 40 separate variables. If you have a program that references those variables in an ARRAY statement and you have a DO loop that does something in the array:
array ind $ ind1-ind40;
do i = 1 to 40;
  if ind(i) = 'a' then
    put ind(i)= i=;
end;
then you might use the reference IND(i) in your code, but SAS never stores IND(i); it only stores IND1-IND40. And you don't need an ARRAY to read the data into a SAS dataset.
However, to indicate a character variable in an INPUT statement, you need to use the $ to tell the INPUT statement that you are reading a character value (and not a number). In general, the '--' indicator is a way to reference all the variables positioned between two named variables, in the order they are stored in the descriptor portion of the SAS dataset. However, in your program, you don't -have- a descriptor portion yet. Your INPUT statement is just -building- one, so you'd have to use a different form of INPUT statement for reading a series of values into a bunch of related variable names:
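A minimal sketch of a DATA step using that form is below. The dataset name a1 matches the PROC PRINT step; the INFILE DATALINES setup and the two data lines are invented for illustration, since the original input file isn't shown.

```sas
data a1;
  /* assumption: inline data stands in for the real input file */
  infile datalines truncover;
  input @01 EID $char10.          /* 10-byte ID in columns 1-10 */
        @11 (ind1-ind40) ($1.);   /* 40 one-byte character fields from column 11 */
datalines;
EMP0000001YNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYN
EMP0000002ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMN
;
run;
```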
options nocenter linesize=200;

proc print data=a1;
  title 'what does the data look like';
run;

proc contents data=work.a1;
  title 'what variables were created in work.a1';
run;
Note how the INPUT statement has the (IND1-IND40) and ($1.). The way I think of describing how this INPUT statement works is:
Starting at position 11 in the input file, make me 40 variables named IND1-IND40 by using the $1. informat for each of the 40 variables. What happens is that SAS reuses the informat in the parentheses over and over again until every variable in the list has been read.
Or, if you didn't want numbered variables, then you'd have to do something like this:
INPUT @01 EID $CHAR10.
@11 (Indaba Indabb Indabc Indabd Indabe
Indabf Indabg Indabh Indabi Indabj
Indabk Indabl Indabm Indabn Indabo
Indabp Indabq Indabr Indabs Indabt
Indabu Indabv Indabw Indabx Indaby
Indabz Indbba Indbbb Indbbc Indbbd
Indbbe Indbbf Indbbg Indbbh Indbbi
Indbbj Indbbk Indbbl Indbbm Indbbn) ($1.);
But you can't just use the indaba--indbbn list method in the INPUT statement, because it is the INPUT statement that is putting those variable names into the descriptor of the SAS dataset. I suppose if you had a LENGTH statement that listed all the variables, then you could use the -- reference, but if you're going to need to list them anyway, you may as well do it in the INPUT statement.
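For what it's worth, here is a rough sketch of that LENGTH-first idea, cut down to five of the forty names to keep it short. The dataset name a3 and the single data line are made up for illustration, and this is only one way it might be written.

```sas
data a3;
  infile datalines truncover;
  /* LENGTH defines the names, and their order, before INPUT is compiled */
  length EID $10 Indaba Indabb Indabc Indabd Indabe $1;
  input @01 EID $char10.
        @11 (Indaba--Indabe) ($1.);  /* -- now has existing variables to span */
datalines;
EMP0000001YNYNY
;
run;
```

You still had to type every name once in the LENGTH statement, so nothing is really saved over listing them in the INPUT statement.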
Then LATER, if you need to reference these variables in a variable list, you could do:
array ind $ ind1-ind40;
array othr $ indaba--indbbn;
I'd like to suggest to KevinC that he should just use one variable, $40 wide.
My reasoning is that it is so very much simpler to load and pass around one item than 40.
If and when the individual indicators are required, they can be pulled out of the collection with a simple substring function.
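A minimal sketch of that single-variable approach, assuming the same layout (ID in columns 1-10, indicators in columns 11-50). The names a2, indicators and ind7 and the data line are hypothetical, chosen only for illustration.

```sas
data a2;
  infile datalines truncover;
  length ind7 $1;                      /* otherwise SUBSTR's result inherits length 40 */
  input @01 EID $char10.
        @11 indicators $char40.;       /* all 40 indicators held in one variable */
  ind7 = substr(indicators, 7, 1);     /* pull out the 7th indicator only when needed */
datalines;
EMP0000001YNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYNYN
;
run;
```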
Is there a counterargument that suggests the 40 are more convenient and informative when separate than as a collection?