I need to import a text file in csv format with text qualifiers. I tried PROC IMPORT but it didn't recognize the comma(,) as a delimiter in the file.
Can anyone help me with the code for sas import that will take care of both the comma (as delimited) and text qualifier? Please note my data has " " where there is a missing variable. I will highly appreciate any help.
Here is how my data looks like:
KeyInd,Addr1,city,ID1,ID2, Name, Country
Individual,"42nd Street","New York",09000033,04531-027-0031813," ", "United States"
Individual,"9 Glen Echo Crt",Buffalo,04531-079-0035670,04531-027-0031819, "Joe Smith"," "
Individual,"9 Glen Echo Crt",Buffalo,04531-079-0035670,04531-027-0031819, "Joe Smith",Canada
Why would you use "import" for such a simple data file?
Just write the data step to read it. SAS will automatically treat blanks as missing. Just pick reasonable values for the lengths of your variables. You can always check the data after you have read it and adjust the program and re-run it.
Note that none of your variables look like numbers to me.
data want;
infile datalines dsd firstobs=2 truncover;
length KeyInd $20 Addr1 $100 city $40 ID1 ID2 $20 Name $40 Country $40 ;
input KeyInd--Country;
datalines4;
KeyInd,Addr1,city,ID1,ID2, Name, Country
Individual,"42nd Street","New York",09000033,04531-027-0031813," ", "United States"
Individual,"9 Glen Echo Crt",Buffalo,04531-079-0035670,04531-027-0031819, "Joe Smith"," "
Individual,"9 Glen Echo Crt",Buffalo,04531-079-0035670,04531-027-0031819, "Joe Smith",Canada
;;;;
To read from a file instead replace the DATALINES keyword in the INFILE statement with the reference (simplest is quoted physical name of the file) to your source file and replace the in-line data with a RUN statement.
@mlogan wrote:
Hi Tom,
Thanks for your help, but I have 1000+ variables. So I really can't define them in data step, I mean it will be too much. Is there a easier way?
??
Your original post shows just 6 (six) columns.
More than 1000 columns points to a massive failure in process design, BTW.
More than 1,000 variables? That seems excessive.
Do you have data about the variables (metadata)?
If so then use the metadata to write the program to read the text file.
If not then you will not have much hope of making heads or tails of 1,000+ variables.
Also more than 1,000 variables means that you are in danger of having more lines longer than 32K, which might cause trouble for PROC IMPORT. Especially if the header row is too long such that some of the headers are not read.
You could use the import wizard (SAS EG or Studio) which will generate the data step for you.
do you have the sample of the SAS code you are using?
dlm=',';
Hope you tried the below code
proc import datafile='path' dbms=csv replace out=want dlm=',';
getnames=yes;
run;
dlm=',' gives an error, use instead:
; delimiter = ','
With PROC IMPORT you have to use the DELIMITER= statement to change the delimiter.
But if you use DBMS=CSV then the delimiter will already default to a comma so you don't need the DELIMITER= statement.
They were probably thinking of the DLM= option on the INFILE statement for when you write your own data step to read the text file. But again if you use the DSD option on the INFILE statement then the default delimiter is a comma so you don't have to add the DLM= option there either.
Opening and saving a text file with something like Excel will in essence destroy it. The software will have its own ideas (like converting the code 1.1 into a date) about the content.
There's only one way to deal with undocumented text files: open with a text editor, and write the code according to what you see, and what you can make of it out of your experience.
If the file is documented (as it should be), you write the code according to the doc.
I used data step to import with the following option, this resolved the issue with unwanted quotes I got for the last column. Hope this helps. Thanks.
TERMSTR= option—Specifies what end-of-line character to use for a file. This option is specific to ASCII
operating systems and is documented in the SAS 9.2 companion for your operating system. This option is useful
when you want to share data files that are created on one operating system with another operating system. For
example, if you are working in a UNIX environment and you need to read a file that was created under Windows,
use TERMSTR=CRLF. Similarly, if you are in a Windows environment and you need to read a file that was
created under UNIX, use TERMSTR=LF.
The following INFILE statement illustrates reading a MAC file from a PC:
infile 'C:\mydata\test.dat' dsd dlm='|' termstr=cr;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.