1162
Calcite | Level 5
I'm trying to import files from a certain directory. There are many files in this directory, but I'm only interested in importing the ones whose names start with "Run2_08". The number of files changes, so I can't hard-code the pathnames. I'm trying the solution below, but I can't seem to actually import the data: I keep getting a LOST CARD note and only one file seems to get processed.

Any help?

[pre]
data tmp;
rc = filename("path", "C:\CA_TEMP");
did = dopen("path");
count = dnum(did);
do i = 1 to count;
name = dread(did,i);
if substr(name,1,7) = 'Run2_08' then do;
fc = filename("file", "C:\CA_TEMP\"||name);
do until (eof);
infile file dlm="," firstobs=2 obs=2 dsd missover end=eof;
input date :yymmdd6. Code :$6. Total :8. Neg :8. Pos :8.;
output;
end;
end;
end;
run;
[/pre]

This is my log:
[pre]
NOTE: The infile FILE is:
File Name=C:\CA_TEMP\Run2_080703.txt,
RECFM=V,LRECL=256

NOTE: LOST CARD.
rc=0 did=1 count=30 i=15 name=Run2_080626.txt fc=20036 eof=1 date=17716 Code=MT43 Total=227 Neg=0 Pos=227 _ERROR_=1 _N_=1
NOTE: 1 record was read from the infile FILE.
The minimum record length was 23.
The maximum record length was 23.
NOTE: The data set WORK.TMP has 1 observations and 11 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
[/pre]
5 REPLIES
Olivier
Pyrite | Level 9
The way I would do it is to start the DATA step as you did, but add the FILEVAR= option to the INFILE statement to tell SAS exactly which file it must read. And don't forget the final STOP statement; otherwise the DATA step will loop forever, re-reading your data again and again.
PS: I replaced the SUBSTR(...)= test with the simpler =: (begins with) operator, and used the SAS 9 CATS function.
[pre]
DATA work.test ;
rc = FILENAME("dir","c:\temp") ;
did = DOPEN("dir") ;
DO i = 1 TO DNUM(did) ;
name = CATS("c:\temp\",DREAD(did, i)) ;
IF DREAD(did, i) =: "Run2_08" THEN DO ;
INFILE dummy FILEVAR = name DLM="," FIRSTOBS=2 MISSOVER DSD END=eof ;
DO UNTIL(eof) ;
INPUT date :YYMMDD6. code :$6. Total :8. Neg :8. Pos :8. ;
OUTPUT ;
END ;
END ;
END ;
rc = DCLOSE(did) ; /* close the directory first: nothing after STOP is executed */
STOP ;
RUN ;
[/pre]
Regards.
Olivier
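As a small aside on the two helpers mentioned in the PS above, here is a minimal sketch of how =: and CATS behave on their own, with no files involved. Run2_080703.txt is a name taken from the log in the question; Run1_080703.txt is a made-up name used only to show a non-match.
[pre]
data _null_ ;
  length name $40 fullpath $260 ;
  /* Run1_080703.txt is hypothetical, included only to show a non-match */
  do name = "Run2_080703.txt", "Run1_080703.txt" ;
    /* =: compares only as many characters as the shorter operand has (7 here) */
    if name =: "Run2_08" then do ;
      /* CATS strips leading and trailing blanks before concatenating */
      fullpath = cats("c:\temp\", name) ;
      put "match: " fullpath ;
    end ;
    else put "skipped: " name ;
  end ;
run ;
[/pre]
Because =: truncates the longer operand before comparing, there is no need to count characters as with SUBSTR.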
Peter_C
Rhodochrosite | Level 12
I could explain the behaviour you experienced, but think you just want a result.
So...
Because you are only interested in files whose names follow a fixed pattern (and because you are not running SAS on z/OS), you can use wildcard characters, like *.txt, in your INFILE statement.
This makes your program very much simpler, except for one thing: you want to ignore row 1 of each file. This example shows how:

[pre]
* change current directory to the folder of files starting "Run2_08" ;
x cd 'this directory' ;

data tmp ;
* to simplify the input statement, pre-define your vars in the order in which they appear in the file ;
length date 6 code $6 total neg pos 8 ;
attrib date informat= yymmdd6. format= date9. ;
* predefining lengths and informats removes any need to define widths on the input statement, which matters when CSV-style data makes these widths vary ;

* I assume you would like to know from which file each data row has come, so define a var for the infile option and one to keep ;
length filen filename $250 ;
retain filename ;
infile "Run2_08*.*" dsd missover filename= filen ;
* load the input buffer, to discover which file is being read ;
input @ ;
* when the filename changes you are at the beginning of a file, so collect the name and drop the header ;
if lag(filen) ne filen then do;
filename = filen ;
delete ;
end;
* because your input columns are defined in order, you can use the -- style variable list syntax ;
input date -- pos ;
run ;
[/pre]
hope that helps

PeterC
Olivier
Pyrite | Level 9
Hi Peter.
The INFILE "*.txt" is a great trick. Thanks for sharing that.
When I use the FILENAME= option, I only get the directory, not the file name itself. And I thought of another way to detect the first record of each file without using the LAG function: the EOV= option in the INFILE statement.
[pre]
DATA work.import ;
INFILE "c:\temp\run2_08????.txt"
DLM=","
FIRSTOBS=2
MISSOVER
DSD
EOV=beginNewFile ;
INPUT @ ;
IF beginNewFile=1 THEN DO ;
beginNewFile=0 ;
DELETE ;
END ;
ELSE INPUT date :yymmdd6. code :$6. (total neg pos) (:8.) ;
FORMAT date DDMMYY10. ;
RUN ;
[/pre]
As the SAS documentation says, you have to reset the EOV= variable yourself, because SAS only ever sets it to 1 (when it starts reading a new file) and never sets it back to 0.
Regards & thanks again.
Olivier
deleted_user
Not applicable
I'm surprised you were not able to make my method work!
You need to pre-set the length of the FILENAME= variable, or it defaults to $8.
You also need to "load the buffer" before the variable is filled, hence the
[pre] input @ ;[/pre]
I don't use EOV= only because I expect to use the FILENAME= option value, and because EOV= is not set on the first file... although I guess you could use
[pre] RETAIN beginNewFile 1 ;[/pre]

Glad you've got a solution that suits you..

PeterC
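To illustrate the $8 point, here is a minimal sketch that mimics the truncation with a plain assignment. It assumes the FILENAME= variable is cut off the same way an ordinary $8 variable is. The names fn8 and fn250 are made up for the illustration, and the path combines the c:\temp folder with a file name from the log earlier in the thread.
[pre]
data _null_ ;
  length fn8 $8 fn250 $250 ;
  /* fn8 and fn250 are hypothetical names used only for this illustration */
  fn8   = "c:\temp\Run2_080703.txt" ;  /* only the first 8 characters fit */
  fn250 = "c:\temp\Run2_080703.txt" ;  /* long enough to hold the whole path */
  put fn8= / fn250= ;
run ;
[/pre]
Since "c:\temp\" is exactly eight characters long, an $8 variable would hold just the directory, which matches what was reported above.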
1162
Calcite | Level 5
These are all great suggestions. Thanks for your help.
