Hi ,
I am trying to import a file in SAS, when I use excel verson to import, it imports the variables as characters although they are all numeric and when I import it as a csv file, although it corrects this error but fails to import all the data.. I am unsure how to resolve tis issue.
Cheers
Anum
what data are missing when you use csv? proc import generates code that you can then modify to ensure all variables are imported
Well, I cannot identify as its a huge file but SAS log says import unsuccessful , also I can see it says "invalid data for var5560 in line 278 32767-32767"
I can see SAS log telling me "invalid data for var6677 in line 326 32767-32767" for various variables
This means that your CSV records are longer than 32767 bytes.
How do you import the CSV data?
You must lengthen LRECL.
Hi,
I have tried doing that but it isn't solving my problem. I can see that even though my original file contains data upto 4 digits but when I am importing it in SAS it is converting some values to large decimal places like below
I guess this is creating problem. Any idea how can I solve this issue?
Numerical precision is a whole different issue.
Fix the record length issue first.
Answer the question and show the code used and the log.
I have been using the following code for fixing record length issue,
filename AR 'C:\Users\amalik\Desktop\dailydata\AR\DATEAR.XLSX' lrecl=27000; proc import datafile=AR dbms=XLSX out=AR replace; run;
But it is still importing all variables as characters
Sorry, I did replace it with a value greater than 32767
@Amalik wrote:
I can see SAS log telling me "invalid data for var6677 in line 326 32767-32767" for various variables
Gads, you have at least 6677 variables, which is very questionable in the first place. The specific message is saying that AFTER proc import determined what type of data var6677 should be (after examining the first few rows of data) that the value encountered on line 326 did not match. Most likely is SAS guessed that the variable is numeric and encountered something in the cell that does not match the layout of previous cells such as special characters like ()*$, or previous values were treated as dates and the value in the cell was not recognizable as a valid date format.
If you are using Proc Import with csv add the option Guessingrows=max. This option will have SAS examine more rows before assigning variable types, informats and formats.
Proc import for csv will generate the data step code used to read the file. Look in the log above all of the invalid data messages. There you can see the informats assigned for reading the values. Though with over 6000 variables there is going to be a lot of lines of code since the procedure generated one informat and format statement for each variable.
If indeed ALL of your variables are numeric something similar to this should work:
data want; infile "path and filename goes here.csv" dlm=',' lrecl=50000 truncover; informat var1-var7000 best32.; /*replace 7000 with the expected number of columns*/ input var1-var7000; run;
though you may still have to increase the lrecl.
Note there is also a possible issue with CSV created from Excel files of "phantom" columns. In csv you sometimes get rows that end with ,,,,,,,, (the number of , varies) representing something in the history of the Excel file that no longer contains data due to deletion but Excel considers those columns "used" and will export them to csv.
DO NOT use proc import. It's guessing will only cause trouble.
Inspect your csv file with a solid text editor like notepad++.
It will show you the file type (DOS, UNIX or Mac), which then helps you in using the correct TERMSTR in the infile statement, which most probably will alleviate your lrecl problem.
If you still have problems, post some lines of your csv as an attachment here.
I was finally able to find out the issue in the file. Actually the excel and csv version of files both contain * in them.
The number is shown as a number say 1.12 but when I use ctrl+F and find *, it finds the numbered cell. When I replace * with a blank, it replaces the entire number say 1.12 will replace with a blank.
Iam unsure on how to handle this weird issue. Any suggestions please?
Do not use Excel.
What value do you see when you open the CSV file with a text editor?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.