I could see the data for the top 10 observations. Some of the variables are continuous and should be numeric. However, SAS treats them as character variables, and I cannot get the mean with PROC MEANS because it gives me this error: "Variable VAR4680 in list does not match type prescribed for this list". I tried to convert the character variables to numeric with the code below. However, the mean and SD that I got are not what I expected for that variable. I cannot see the character data when I use PROC PRINT for 10 observations.
data kl;
set kk;
VAR4679_num=input(VAR4679, best32.);
run;
proc means data=kl;
var VAR4679_num;
run;
@Manije72 wrote:
Does it only prevent viewing the header? I mean, do the variables with generic names still contain the actual data in SAS?
Yes, but it might be that you accidentally forgot to specify GUESSINGROWS=MAX when you ran PROC IMPORT, so it only used the first few lines to guess how to define the variables.
Is this code correct to import the data in SAS:
proc import datafile='path_to_my_file.csv'
out=data
dbms=csv
replace;
getnames=yes;
run;
Only if you know the file has fewer than 20 observations.
Check out the GUESSINGROWS= statement of PROC IMPORT.
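As a rough sketch of what that looks like, the import step from the question can be extended with a GUESSINGROWS= statement. The file path is the placeholder from the question, and the output name WORK.MYDATA is an assumption for illustration. With GUESSINGROWS=MAX the procedure scans the whole file before choosing types and lengths, so expect the scan to take noticeably longer on a large file.

```sas
/* Sketch only: the path and the output name WORK.MYDATA are placeholders. */
proc import datafile='path_to_my_file.csv'
    out=work.mydata
    dbms=csv
    replace;
    getnames=yes;
    guessingrows=max;  /* scan every row before choosing each column's type and length */
run;

/* Check which variables came in as character and which as numeric. */
proc contents data=work.mydata;
run;
```

If a column that should be numeric still arrives as character after this, it probably contains at least one value somewhere in the file that is not a valid number.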
@Manije72 wrote:
Is this code correct to import the data in SAS:
proc import datafile='path_to_my_file.csv'
out=data
dbms=csv
replace;
getnames=yes;run;
Plain answer: a resounding NO.
NEVER use PROC IMPORT for CSV files. ALWAYS write the data step yourself, according to the file documentation.
Your issues are the consequence of letting the procedure make guesses.
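A minimal sketch of such a data step, assuming a comma-delimited file with one header row. The variable ID, its length, and the informats are hypothetical and only illustrate the pattern; the real names, types, and lengths have to come from the file documentation.

```sas
/* Sketch only: ID, the informats, and the LRECL value are assumptions. */
data work.mydata;
    infile 'path_to_my_file.csv'
        dsd              /* comma-delimited, honors quoted values and empty fields */
        firstobs=2       /* skip the header row */
        truncover        /* short lines give missing values instead of reading the next line */
        lrecl=1000000;   /* the default record length is too short for very wide rows */
    informat ID $20. VAR4679 best32. VAR4680 best32.;  /* declare each type explicitly */
    input ID VAR4679 VAR4680;
run;
```

Because every variable is declared up front, a value that cannot be read as a number produces an "Invalid data" note in the log and a missing value, rather than silently turning the whole column into character.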
@Manije72 wrote:
Hi,
I am working with a large dataset containing 500,000 rows and 4,000 columns, which requires 10 GB of memory. I imported the data into SAS, but only a portion of the data was successfully loaded, and many variables were not detected during analysis. My computer has limited memory, with only 16 GB available. Consequently, SAS is unable to open the entire dataset. How can I open the full dataset and ensure all variables are available for analysis in SAS?
This is what many would consider big data. At 10 GB, I'm guessing there are a lot of text fields? What is the analysis plan for those text fields?