Hello!
Since this is my first message, I would like to thank the whole SAS community, because I have learnt a lot from this forum (and I still have a lot to learn...).
I've been importing many delimited files successfully, but I'm struggling to import a .dat file without delimiters (fixed width). The file has almost 2,500,000 observations, but SAS only reads 1,431,879. If I set the option "firstobs=1431880" to start at the next observation, SAS doesn't read anything. When I read the first 1,431,879 obs, the data set looks good (all vars seem to be OK) but, you know, "a few" rows are missing.
I've also tried truncover, missover, etc., but nothing worked. I also changed the RECFM and LRECL options, but found no combination that worked for me (I'm not used to these options, though). As you can see, I'm a newbie (just switching to SAS); I've been reading a lot of information but haven't succeeded.
Any help, please?
data want;
  infile "C:\filename.DAT" truncover;
  input VAR1 $ 1-1
        VAR2 2-9
        VAR3 10-14
        VAR4 15-19
        VAR5 20-21
        VAR6 22-23
        VAR7 24-27
        VAR8 28-29;
run;
PS: The file has hundreds of vars (I just wrote an example) but, in case it helps, the original .dat file contains data like this (it's not a delimited file):
A123456789121
B125455479121
Thank you very much in advance.
Best regards,
Marc
EDIT: This is the log I get:
NOTE: The infile "C:\filename.DAT" is:
Filename=C:\filename.DAT,
RECFM=V,LRECL=32767,
File Size (bytes)=25841609304,
Last Modified=04 April 2021 21:44,
Create Time=04 February 2014 14:51
NOTE: 1431879 records were read from the infile "C:\filename.DAT".
The minimum record length was 665.
The maximum record length was 9000.
NOTE: The data set WORK.WANT has 1431879 observations and 8 variables.
NOTE: DATA statement used (Total process time):
real time 1:24.44
cpu time 25.87 seconds
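Based on those numbers, the job is stopping before the full file is read: the file size divided by the number of records read gives an average of about 18,047 bytes per record, roughly twice the maximum record length of 9,000.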
data check;
  bytes = 25841609304;
  records = 1431879;
  average = bytes / records;
  min = 665;
  max = 9000;
  put (_all_) (= comma20. /);
run;

bytes=25,841,609,304
records=1,431,879
average=18,047
min=665
max=9,000
So you probably have the DOS end-of-file character (Ctrl-Z, hex 1A) about halfway through the file, and SAS is treating it as the end of the file. Tell it to ignore the DOS end-of-file character.
infile "C:\filename.DAT" truncover ignoredoseof;
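If you want to confirm that a stray Ctrl-Z is really what stopped the read, something like the following might help. It is only a sketch, assuming the same file path as above and that no record is longer than 32,767 bytes; the variable names pos and hits are made up for illustration. It loads each record into _INFILE_ without creating any variables and reports the records that contain hex 1A.

```sas
data _null_;
  /* IGNOREDOSEOF lets the step read past any Ctrl-Z instead of stopping there */
  infile "C:\filename.DAT" ignoredoseof lrecl=32767 end=done;
  input;                            /* load the record into _INFILE_ only */
  pos = index(_infile_, '1A'x);     /* column of the first hex 1A, 0 if none */
  if pos > 0 then do;
    hits + 1;                       /* running count of affected records */
    /* print only the first 10 hits to keep the log short */
    if hits <= 10 then put 'Hex 1A found in record ' _n_ 'at column ' pos;
  end;
  if done then put 'Records read: ' _n_ 'Records containing hex 1A: ' hits;
run;
```

If the first hit is reported close to record 1431880, that would match the point where the original step stopped.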
Did the source of the file provide a description anywhere?
As long as your LRECL is at least 9000, it should not cause any problem. RECFM is unlikely to be the problem either.
What tells you that you have 2500000 observations in the source file?
You may want to post the entire data step from the log along with any other messages.
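One way to check the expected number of observations yourself is to count the records and add up their lengths, then compare the total with the file size shown in the log. This is a rough sketch, assuming each record ends with CR+LF (2 bytes) and using made-up variable names (reclen, bytes); if the file uses a different line terminator, the byte total will be off accordingly.

```sas
data _null_;
  /* LENGTH= names a variable that holds the length of the current record */
  infile "C:\filename.DAT" ignoredoseof lrecl=32767 length=reclen end=done;
  input;                      /* reclen is set once INPUT has executed */
  bytes + reclen + 2;         /* assumption: 2 terminator bytes (CR+LF) per record */
  if done then
    put 'Records: ' _n_ comma15. /
        'Bytes accounted for: ' bytes comma20.;
run;
```

Running the same step without ignoredoseof should show where the count stops when SAS honours the DOS end-of-file character.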
Thank you very much!!!! I used the ignoredoseof option, and it worked! SAS read the whole file! 🙂