Don't you mean you will have 5512860 variables or data values? That is what I get with my program that simulates the construction of your input data. When I use 82 I get 3321, as in 18F.txt.
Hi,
Sorry for the late reply. Here is the link to download my .txt file: http://dropbox.unl.edu/uploads/20141112/2cb2cc0a5d5e994b/18Fratio.txt
It does have 11,025,722 variables, not just 5512860, and only 1 observation right now.
I tried the code you posted the day before yesterday. SAS read all the variable names but raised an error when using PROC TRANSPOSE to create "name2" ("ERROR: The SAS System stopped processing this step because of insufficient memory.").
As for the code you posted yesterday, I do not quite understand it, and SAS didn't produce the right results.
Thank you!
If your variables are all numeric (as in your example), then you could use something like:
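/* Step 1: read the header row and write one variable name per observation */
data temp (keep=name);
  infile 'c:\18f.txt' dlm='09'x lrecl=50000 obs=1;
  informat var1-var3321 name $16.;
  array names(*) $16. var1-var3321;
  input var1-var3321;
  do i=1 to 3321;
    name=names(i);
    output;
  end;
run;

/* Step 2: collect the names into one space-separated macro variable */
proc sql noprint;
  select name into :names
    separated by ' '
    from temp
  ;
quit;

/* Step 3: read the data rows, using the collected names as the variable list */
data want;
  infile 'c:\18f.txt' dlm='09'x lrecl=32767 firstobs=2;
  input &names.;
run;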
Thank you so much. I tried your code and it did work!
Can you tell me what "lrecl=50000" means? Is it the length of a single line? The file I attached is just my test sample. Actually I have a .txt file with around 11 million variables (only one is a character variable, and the others are numeric) and 360 observations. Do you think SAS can load it successfully?
Thank you!
Elaine: LRECL is the maximum number of characters on a record. If the LRECL option isn't used, or isn't set high enough, longer records are truncated, so you risk your program never even seeing some of the data you are trying to import.
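One way to check whether LRECL is set high enough is to read only the header record and print its length. The sketch below assumes the file path from the example earlier in the thread; the variable name reclen is made up for illustration. If the reported length equals the LRECL value, the record was probably truncated and LRECL should be raised.

```sas
/* Sketch: report the length of the first record as SAS sees it.        */
/* reclen is a hypothetical name; LENGTH= stores the record length in it. */
data _null_;
  infile 'c:\18f.txt' lrecl=50000 obs=1 length=reclen;
  input;   /* load the record into the input buffer without creating variables */
  put 'Header record length: ' reclen;
run;
```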
As for your question about the maximum number of variables, the following is from a SAS note (8213 - Understanding the maximum allowable size of SAS® data sets in the Windows, UNIX, and z/OS ope...):
Assuming a single-byte character set, and that you use the maximum 352 bytes possible for name, label, and so on for each variable, you can have a maximum of about 4,050,000 variables. If the names, labels, and format names are shorter, you can have more than 200,000,000. There is a maximum of 1 GB to store all the variable names and other metadata (data set label, compression name, and so on).
Assuming the above limits are not exceeded, the maximum possible number of variables on either Windows or UNIX is 412,977,617 on 32-bit hosts and 2 GB on 64-bit hosts. For the z/OS platform, the maximum number of variables is primarily limited by the maximum size of a single observation (16 MB or 16,777,216 bytes).
Thank you for your reply. I use a 64-bit Windows system, and I am trying to load a large .txt file which has 11,025,720 variables. Each variable name has 21 characters. It seems that this doesn't exceed the SAS limit, so why does SAS still report "ERROR: The SAS System stopped processing this step because of insufficient memory."? How does it happen?
Thank you so much!
11 million variables need a lot of memory (each numerical variable consumes 8 bytes, but you also have to handle the metadata, so with 21 characters as variable names you need another 11M*21 bytes for the names alone!)
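As a rough illustration of that arithmetic, the following sketch computes the two figures for 11,025,720 variables. It assumes 8 bytes per numeric value and 21 bytes per name, as stated above, and ignores all other metadata, so it is a lower bound only. The variable names are made up for illustration. The products work out to 88,205,760 bytes of values per observation and 231,540,120 bytes of names.

```sas
/* Sketch: lower-bound memory estimate, ignoring labels, formats and other metadata */
data _null_;
  nvars       = 11025720;     /* variable count quoted in the thread */
  value_bytes = nvars * 8;    /* one 8-byte numeric value per variable, per observation */
  name_bytes  = nvars * 21;   /* 21 characters per variable name */
  put value_bytes= comma15. name_bytes= comma15.;
run;
```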
You need to adjust the memsize option for your SAS process. This can either be done on the commandline or in the sasv9.cfg (or sasv9_usermods.cfg in the configuration directory of your Workspace Server). Have your SAS admin do that for you.
Do you know how I can adjust the SAS option myself? I couldn't find the right person to figure it out. Thanks!
You need the SAS administrator for this, unless you start SAS (NOT the Enterprise Guide) from the command line or a link yourself.
He can either set -memsize XXXX as a command line option in the server definition in the Management Console, or he can use sasv9_usermods.cfg in the appropriate configuration directory to set the MEMSIZE option there.
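To see which value is currently in effect before asking for a change, something like the following can be run in the SAS session. MEMSIZE itself can only be set at startup, for example with -memsize 4G on the command line or -MEMSIZE 4G in the configuration file; the 4G figure is only an illustrative value, not a recommendation for this data.

```sas
/* Show the MEMSIZE value currently in effect for this session */
proc options option=memsize;
run;

/* Write the same value to the log from macro code */
%put MEMSIZE is %sysfunc(getoption(memsize));
```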
Thanks. I will try it later. Actually I did import my huge data into SAS without changing the memsize option. However, the data is stored by rows, not columns, and I didn't find a way to transpose it. Maybe that is still because of the insufficient memory.
That's just too many variables to be manageable. Even if you could make that data set, is that the form that will be needed for analysis? I'm guessing no.
11,025,720 observations, on the other hand, is quite manageable. Reading the data into the structure I suggested yesterday will work and be much more useful for analysis.
The cause is that SAS limits the number of characters allowed in the name of a variable/column. When your column headers are not unique within that limit, SAS can't assign a meaningful name and resorts to VAR plus the column's position in the data set.
Right. I just looked at the results, not the file.
Sorry, do you mean SAS has a limit (32 characters) for each variable name? In my file, each name has only 13 characters, but I still get the problem mentioned above if I use PROC IMPORT.