BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
HijB
Fluorite | Level 6

I'm importing a csv file into SAS which contains 300+ variables, I realized some of my character variables are not imported fully. in the csv file, the variable value is -for example- "abc def ghi jkl" in SAS as it only import the first 9 digits (length = 9) so it looks something like this "abc def g". how can I import all the variables exactly the same length as the csv file?

 

I'm using this code to import my file:

proc impot out= work.mydata

datafile = "location"

dbms=csv replace;

getnames = yes;

datarow=2;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

When SAS guessed what length to use for that variable it was not long enough for later values.

You need to tell SAS to check the whole file (or at least more of) by using the GUESSINGROWS= statement.  Set it to MAX to have it read the whole file before deciding how to define the variables.   You don't have to add the GETNAMES=YES or DATAROW=2 statements are those are the default values for those settings.

proc import 
  out= work.mydata replace
  datafile = "location"
  dbms=csv 
;
  guessingrows=max;
run;

If you want more control just write your own data step to read the file and you will have complete control over what names to use for the variables, their type and length. Whether or not to attach a FORMAT or a LABEL to the variable.  Whether or not you need to use a special INFORMAT to read the values from the CSV file. ETC. 

data mydata ;
  infile "location" dsd truncover firstobs=2;
  length var1 $20 var2 8 .... varlast $40 ;
  format var2 date9.;
  informat var2 date9.;
  input var1 var2 ... varlast;
run;

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

PROC IMPORT is a guessing procedure. It guesses the number of characters needed by looking at the first X row, and it uses that. If later there is a longer character string in that column, the string is truncated.

 

Especially for Excel files, it can guess wrong, and you cannot force it to make better guesses. If the file is a CSV or other delimited file, there are ways to force PROC IMPORT to change its guessing procedure, at the cost of spending more time creating the data set. See the GUESSINGROWS option for importing CSV and delimited files. 

https://documentation.sas.com/doc/en/pgmmvacdc/9.4/proc/p13kvtl8ezj13in17i6m99jypcwi.htm

 

If the file is an Excel file, you could save it as a .CSV and then import the .CSV file.

--
Paige Miller
ballardw
Super User

By default SAS only examines a few rows of the file before GUESSING things like variable type or length. So if the first few rows only have 5 or 6 characters for a  variable but at row 100 it has 500 characters the result is set to 6.

Use the GUESSINROWS=MAX option in proc import.

 

Or look at the DATA STEP code generated when proc import runs, copy that from the log to the editor and change the length in the INFORMAT statement for that variable. (remove line numbers and such). At the same time you can change variable names if you like. Save the code and the next time you read a file like this then the properties of the variables will stay the same. BEST is to read the documentation describing your data provided by the source so you can set the lengths and variable types correctly in the data step.

Tom
Super User Tom
Super User

When SAS guessed what length to use for that variable it was not long enough for later values.

You need to tell SAS to check the whole file (or at least more of) by using the GUESSINGROWS= statement.  Set it to MAX to have it read the whole file before deciding how to define the variables.   You don't have to add the GETNAMES=YES or DATAROW=2 statements are those are the default values for those settings.

proc import 
  out= work.mydata replace
  datafile = "location"
  dbms=csv 
;
  guessingrows=max;
run;

If you want more control just write your own data step to read the file and you will have complete control over what names to use for the variables, their type and length. Whether or not to attach a FORMAT or a LABEL to the variable.  Whether or not you need to use a special INFORMAT to read the values from the CSV file. ETC. 

data mydata ;
  infile "location" dsd truncover firstobs=2;
  length var1 $20 var2 8 .... varlast $40 ;
  format var2 date9.;
  informat var2 date9.;
  input var1 var2 ... varlast;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 717 views
  • 2 likes
  • 4 in conversation