I'm importing a csv file into SAS which contains 300+ variables, I realized some of my character variables are not imported fully. in the csv file, the variable value is -for example- "abc def ghi jkl" in SAS as it only import the first 9 digits (length = 9) so it looks something like this "abc def g". how can I import all the variables exactly the same length as the csv file?
I'm using this code to import my file:
proc impot out= work.mydata
datafile = "location"
dbms=csv replace;
getnames = yes;
datarow=2;
run;
When SAS guessed what length to use for that variable it was not long enough for later values.
You need to tell SAS to check the whole file (or at least more of) by using the GUESSINGROWS= statement. Set it to MAX to have it read the whole file before deciding how to define the variables. You don't have to add the GETNAMES=YES or DATAROW=2 statements are those are the default values for those settings.
proc import
out= work.mydata replace
datafile = "location"
dbms=csv
;
guessingrows=max;
run;
If you want more control just write your own data step to read the file and you will have complete control over what names to use for the variables, their type and length. Whether or not to attach a FORMAT or a LABEL to the variable. Whether or not you need to use a special INFORMAT to read the values from the CSV file. ETC.
data mydata ;
infile "location" dsd truncover firstobs=2;
length var1 $20 var2 8 .... varlast $40 ;
format var2 date9.;
informat var2 date9.;
input var1 var2 ... varlast;
run;
PROC IMPORT is a guessing procedure. It guesses the number of characters needed by looking at the first X row, and it uses that. If later there is a longer character string in that column, the string is truncated.
Especially for Excel files, it can guess wrong, and you cannot force it to make better guesses. If the file is a CSV or other delimited file, there are ways to force PROC IMPORT to change its guessing procedure, at the cost of spending more time creating the data set. See the GUESSINGROWS option for importing CSV and delimited files.
https://documentation.sas.com/doc/en/pgmmvacdc/9.4/proc/p13kvtl8ezj13in17i6m99jypcwi.htm
If the file is an Excel file, you could save it as a .CSV and then import the .CSV file.
By default SAS only examines a few rows of the file before GUESSING things like variable type or length. So if the first few rows only have 5 or 6 characters for a variable but at row 100 it has 500 characters the result is set to 6.
Use the GUESSINROWS=MAX option in proc import.
Or look at the DATA STEP code generated when proc import runs, copy that from the log to the editor and change the length in the INFORMAT statement for that variable. (remove line numbers and such). At the same time you can change variable names if you like. Save the code and the next time you read a file like this then the properties of the variables will stay the same. BEST is to read the documentation describing your data provided by the source so you can set the lengths and variable types correctly in the data step.
When SAS guessed what length to use for that variable it was not long enough for later values.
You need to tell SAS to check the whole file (or at least more of) by using the GUESSINGROWS= statement. Set it to MAX to have it read the whole file before deciding how to define the variables. You don't have to add the GETNAMES=YES or DATAROW=2 statements are those are the default values for those settings.
proc import
out= work.mydata replace
datafile = "location"
dbms=csv
;
guessingrows=max;
run;
If you want more control just write your own data step to read the file and you will have complete control over what names to use for the variables, their type and length. Whether or not to attach a FORMAT or a LABEL to the variable. Whether or not you need to use a special INFORMAT to read the values from the CSV file. ETC.
data mydata ;
infile "location" dsd truncover firstobs=2;
length var1 $20 var2 8 .... varlast $40 ;
format var2 date9.;
informat var2 date9.;
input var1 var2 ... varlast;
run;
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
For SAS newbies, this video is a great way to get started. James Harroun walks through the process using SAS Studio for SAS OnDemand for Academics, but the same steps apply to any analytics project.
Find more tutorials on the SAS Users YouTube channel.