Hello All,
I am running a query against the database and writing results to a tab-delimited text file. I have data missing for some columns. When I do a proc import, some of the data seems to be contained in one column instead of separate columns.
Here is the script I am using below. I have also attached the test file. There are totally 23 columns in the original table.
Proc Import Script:
LIBNAME vertsas 'test/user';
PROC IMPORT DATAFILE="testFile.txt"
OUT=vertsas.testFile
DBMS=dlm
REPLACE;
DELIMITER=’09’x;
GETNAMES=yes;
RUN;
When I generate the sas dataset and open it in Enterprise guide, I see the issue mentioned above
Please let me know if this is an encoding issue or a delimiter issue.
Thanks!!
You don't have column headers in your source text file. Your code works for me once I change the syntax accordingly to "getnames=no".
Also: Opening your source text file with Notepad++ it tells me that the encoding is "UTF-8 without BOM". To be on the safe side in regards or character translation to single byte encoding you might want to explicitly tell SAS that the source is UTF-8 as done below.
LIBNAME vertsas 'C:\test';
filename source 'C:\test\testFile.txt' encoding="utf-8" lrecl=32767;
PROC IMPORT DATAFILE=source
OUT=vertsas.testFile
DBMS=dlm
REPLACE;
DELIMITER='09'x;
GETNAMES=no;
RUN;
proc contents data=vertsas.testfile;
run;quit;
There are 25 columns using a tab delimiter in the example file. There are 2 before the first variable which are possibly the unexpected ones. The example data also looks like there not any actual names on the columns.
Also the posted code DELIMITER=’09’x; has the "wrong" type of single quote, at least to execute as shown.
Sorry, I am beginner with SAS. what's the correct quote to be used?
Also, Does number of columns in text file matter?
Do I need to provide column names in the input text file for the script to work correctly?
Thanks again!
The quote may have been changed based on how you pasted the source code, if you went through a program such as Word it may have changed a ' to the ’. In this forum it is somewhat difficult to see but the one posted is the curly version not the straight version.
If you don't have actual column names then don't use the Getnames option. You can use proc datasets to rename the VAR1 to VARN and add descriptive labels after importing the data.
The number of columns is going to be defined by your delimiter when used. If you don't want VAR1 and VAR2 then they can be dropped later.
OR you can write a data step to read the text file using an INPUT statement and INFILE options. The data step approach lets you read the names as you might want, special informats such as for date or time values, assign labels and display formats as well as dropping specific variables or additional manipulation as wanted.
You don't have column headers in your source text file. Your code works for me once I change the syntax accordingly to "getnames=no".
Also: Opening your source text file with Notepad++ it tells me that the encoding is "UTF-8 without BOM". To be on the safe side in regards or character translation to single byte encoding you might want to explicitly tell SAS that the source is UTF-8 as done below.
LIBNAME vertsas 'C:\test';
filename source 'C:\test\testFile.txt' encoding="utf-8" lrecl=32767;
PROC IMPORT DATAFILE=source
OUT=vertsas.testFile
DBMS=dlm
REPLACE;
DELIMITER='09'x;
GETNAMES=no;
RUN;
proc contents data=vertsas.testfile;
run;quit;
Thanks Patrick. That helped.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.