BookmarkSubscribeRSS Feed
mlogan
Lapis Lazuli | Level 10

I need to import a text file in csv format with text qualifiers. I tried PROC IMPORT but it didn't recognize the comma(,) as a delimiter in the file.
Can anyone help me with the code for sas import that will take care of both the comma (as delimited) and text qualifier? Please note my data has " " where there is a missing variable. I will highly appreciate any help.

 

Here is how my data looks like:

 

KeyInd,Addr1,city,ID1,ID2, Name, Country
Individual,"42nd Street","New York",09000033,04531-027-0031813," ", "United States"
Individual,"9 Glen Echo Crt",Buffalo,04531-079-0035670,04531-027-0031819, "Joe Smith"," "

Individual,"9 Glen Echo Crt",Buffalo,04531-079-0035670,04531-027-0031819, "Joe Smith",Canada

12 REPLIES 12
Tom
Super User Tom
Super User

Why would you use "import" for such a simple data file?

Just write the data step to read it. SAS will automatically treat blanks as missing.  Just pick reasonable values for the lengths of your variables. You can always check the data after you have read it and adjust the program and re-run it.

Note that none of your variables look like numbers to me.

data want;
  infile datalines dsd firstobs=2 truncover;
  length KeyInd $20 Addr1 $100 city $40 ID1 ID2 $20 Name $40 Country $40 ;
  input KeyInd--Country;
datalines4;
KeyInd,Addr1,city,ID1,ID2, Name, Country
Individual,"42nd Street","New York",09000033,04531-027-0031813," ", "United States"
Individual,"9 Glen Echo Crt",Buffalo,04531-079-0035670,04531-027-0031819, "Joe Smith"," "
Individual,"9 Glen Echo Crt",Buffalo,04531-079-0035670,04531-027-0031819, "Joe Smith",Canada
;;;;

To read from a file instead replace the DATALINES keyword in the INFILE statement with the reference (simplest is quoted physical name of the file) to your source file and replace the in-line data with a RUN statement.

mlogan
Lapis Lazuli | Level 10
Hi Tom,
Thanks for your help, but I have 1000+ variables. So I really can't define them in data step, I mean it will be too much. Is there a easier way?
Kurt_Bremser
Super User

@mlogan wrote:
Hi Tom,
Thanks for your help, but I have 1000+ variables. So I really can't define them in data step, I mean it will be too much. Is there a easier way?


??

 

Your original post shows just 6 (six) columns.

 

More than 1000 columns points to a massive failure in process design, BTW.

Tom
Super User Tom
Super User

More than 1,000 variables?  That seems excessive.

Do you have data about the variables (metadata)?  
If so then use the metadata to write the program to read the text file.

If not then you will not have much hope of making heads or tails of 1,000+ variables.

 

Also more than 1,000 variables means that you are in danger of having more lines longer than 32K, which might cause trouble for PROC IMPORT.  Especially if the header row is too long such that some of the headers are not read.

 

Patrick
Opal | Level 21

@mlogan 

You could use the import wizard (SAS EG or Studio) which will generate the data step for you.

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

do you have the sample of the SAS code you are using?

 

dlm=',';

 

Jagadishkatam
Amethyst | Level 16

Hope you tried the below code

 

proc import datafile='path' dbms=csv replace out=want dlm=',';
getnames=yes; run;
Thanks,
Jag
ezi
Calcite | Level 5 ezi
Calcite | Level 5

dlm=',' gives an error, use instead:

; delimiter = ',' 

Tom
Super User Tom
Super User

With PROC IMPORT you have to use the DELIMITER= statement to change the delimiter.  

But if you use DBMS=CSV then the delimiter will already default to a comma so you don't need the DELIMITER= statement.

 

They were probably thinking of the DLM= option on the INFILE statement for when you write your own data step to read the text file.  But again if you use the DSD option on the INFILE statement then the default delimiter is a comma so you don't have to add the DLM= option there either.

RacheLGomez123
Fluorite | Level 6
When importing a CSV file that contains text fields with special characters such as commas, you can use a text qualifier to indicate that the text should be treated as a single field. Here are the steps for importing a CSV file with a text qualifier:

Open the program you are using to import the CSV file (such as Microsoft Excel, Google Sheets, or a database program).
Go to the "File" menu and choose "Import" or "Open."
In the import wizard or dialog box, select "CSV" as the file type.
Choose the file you want to import and click "Next" or "Continue."
In the "Text Qualifier" field, enter the character that you want to use as the text qualifier. The default text qualifier is double quotes (").
Preview the data and make sure that the fields are being correctly separated and that the text qualifiers are being used to encapsulate text fields with special characters.
Complete the import process and save the data in your desired format.
Regards,
Rachel Gomez
Kurt_Bremser
Super User

Opening and saving a text file with something like Excel will in essence destroy it. The software will have its own ideas (like converting the code 1.1 into a date) about the content.

 

There's only one way to deal with undocumented text files: open with a text editor, and write the code according to what you see, and what you can make of it out of your experience.

If the file is documented (as it should  be), you write the code according to the doc.

arun_guru
Calcite | Level 5

I used data step to import with the following option, this resolved the issue with unwanted quotes I got for the last column. Hope this helps. Thanks.

 

TERMSTR= option—Specifies what end-of-line character to use for a file. This option is specific to ASCII
operating systems and is documented in the SAS 9.2 companion for your operating system. This option is useful
when you want to share data files that are created on one operating system with another operating system. For
example, if you are working in a UNIX environment and you need to read a file that was created under Windows,
use TERMSTR=CRLF. Similarly, if you are in a Windows environment and you need to read a file that was
created under UNIX, use TERMSTR=LF.
The following INFILE statement illustrates reading a MAC file from a PC:
infile 'C:\mydata\test.dat' dsd dlm='|' termstr=cr;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 4300 views
  • 3 likes
  • 9 in conversation