Hmm...there are ways around that, but you have 100 files that don't have the same structure? The name isn't available anywhere else?
Each file has different data, with different variables and variable names. The last variable name in every file is always missing (i.e., it shows up as VAR#). The raw text files have the correct variable name, just as they do for the other variables that import properly. The variable is named "DESCRIPTION" in the raw input below. Notice there is no delimiter at the end of the line. I'm guessing that's why it doesn't read the name in, although later rows end the same way and read in correctly.
EFF_START_DATE|EFF_END_DATE|CODE|DESCRIPTION
9/01/2001 12:00:00 AM|12/31/9999 12:00:00 AM|AAA|EXAMPLE
9/01/2001 12:00:00 AM|12/31/9999 12:00:00 AM|BBB|EXAMPLE
9/01/2001 12:00:00 AM|12/31/9999 12:00:00 AM|CCC|EXAMPLE
There must be something in the file. Use a data step to read the first few lines and see what it is.
data _null_;
infile 'myfile' obs=3 ;
input;
list;
run;
Your example data works fine with PROC IMPORT.
If you paste the sample data above into a blank Notepad file and save it as a .txt, you can replicate the issue I'm having. This makes me doubt it's the file formatting, and I'm sort of at a loss for ideas.
I can replicate the error by appending a tab or other invalid character to the end of the header line.
filename tst temp;
data _null_;
file tst ;
input ;
_infile_=trim(_infile_);
put _infile_ '09'x;
cards;
EFF_START_DATE|EFF_END_DATE|CODE|DESCRIPTION
9/01/2001 12:00:00 AM|12/31/9999 12:00:00 AM|AAA|EXAMPLE
9/01/2001 12:00:00 AM|12/31/9999 12:00:00 AM|BBB|EXAMPLE
9/01/2001 12:00:00 AM|12/31/9999 12:00:00 AM|CCC|EXAMPLE
;
proc import datafile=tst out=tst replace
dbms=dlm
;
delimiter='|';
run;
Not sure why the LIST statement is no longer treating 'A0'x as a character that forces it to show the hex codes for the line. Perhaps you have the 'A0'x character after the last variable name? Microsoft treats this character as a non-breaking space and sticks it into a lot of unwanted places. Try this code to see the last character on the first line of the file.
data _null_;
infile '...file...' obs=1 ;
input;
char=substr(_infile_,length(_infile_),1);
put char= $hex2. ;
run;
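If the hex value alone is hard to interpret, a variation along these lines might help. It is only a sketch: it keeps the same placeholder path, adds TERMSTR=LF so that a trailing carriage return is not swallowed as part of the line terminator on Windows, and labels the three culprits mentioned in this thread (carriage return, tab, non-breaking space). The variable names LAST and MEANING are just illustrative.

```sas
data _null_;
  /* termstr=lf: split on LF only, so a CR (if any) stays in the buffer */
  infile '...file...' obs=1 termstr=lf;
  input;
  length last $1 meaning $24;
  /* last non-blank byte of the header line */
  last = substr(_infile_, length(_infile_), 1);
  select (last);
    when ('0D'x) meaning = 'carriage return (CRLF)';
    when ('09'x) meaning = 'tab';
    when ('A0'x) meaning = 'non-breaking space';
    otherwise    meaning = 'ordinary character';
  end;
  put last= $hex2. meaning=;
run;
```

If this reports a carriage return, the file most likely has CRLF line endings and is being read with TERMSTR=LF.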
It is due to the line terminator (TERMSTR) of the text file. This macro determines which one the file uses, so the import works properly for both CRLF and LF files.
%macro initial(file, handle_name, other_filename_options=) ;
/* if there is a carriage return at the end, then return 1 (stored in macro variable SYSRC) */
%sysexec head -n 1 "&file" | awk '/\r$/ { exit(1) }' ;
%if &SYsrc=1 %then %let termstr=crlf ;
%else %let termstr=lf ;
filename &handle_name "&file" termstr=&termstr &other_filename_options ;
options mprint;
%mend ;
%initial(file=data1, handle_name=A);
proc import DATAFILE=A DBMS=DLM REPLACE OUT=Imported_Data ;
DELIMITER='|' ; GUESSINGROWS=32767;
run ;
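The macro above relies on %SYSEXEC with head and awk, so it needs a Unix-like host with XCMD enabled. Where that is not available, the same check can probably be done in a data step. The following is a rough sketch under that assumption, not a tested replacement: the macro name DETECT_TERMSTR is made up for illustration, and it assumes the first line of the file is not empty.

```sas
%macro detect_termstr(file, handle_name);
  %local termstr;
  %let termstr=lf;  /* default if the file has no lines to inspect */
  data _null_;
    /* split on LF only so a trailing CR remains visible in _infile_ */
    infile "&file" obs=1 termstr=lf;
    input;
    if substr(_infile_, length(_infile_), 1) = '0D'x
      then call symputx('termstr', 'crlf');
  run;
  /* assign the fileref with the detected line terminator */
  filename &handle_name "&file" termstr=&termstr;
%mend detect_termstr;
```

It would be called the same way as %initial, with the resulting fileref passed to DATAFILE= in PROC IMPORT.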