Greetings,
1. I would like to import 100 TXT files to 100 rows in one dataset.
2. To read 1 file to 1 row in 1 Dataset, I can use the following code:
data test;
length text $32767;
retain text '';
infile "C:\Test.txt" flowover dlmstr='//' end=last;
input;
text=cats(text,_infile_);
if last then output;
run;
3. However, I would like to run import 100 TXT files (from the same directory) to 100 rows in one dataset.
Thanks in advance!
D
This use wildcard in the INFILE statement.
I added a variable file to show where each record comes from. DLMSTR doesn't seem to do anything. Are you trying to keep the TERMSTR on the records in TEXT?
data file;
length file filenm $64;
retain file;
infile '~/t00*.dat' eov=eov end=end dlmstr='//' flowover filename=filenm;
length text $32767;
retain text;
input;
if _n_ eq 1 or eov then do;
if eov then output;
file=filenm;
eov=0;
call missing(text);
end;
text=cats(text,_infile_);
if end then output;
run;
Do your txt files have column headings?
Are you absolutely sure that 32767 characters will hold the entire contents of each file?
did you try?
infile "C:\path:\*.txt" flowover dlmstr='//' end=last;
where path is the path to the folder
This will attempt to read all of the files with txt as the extension in the folder. If there is a more specific pattern to the file names you need to extract then more details on the names is needed.
This use wildcard in the INFILE statement.
I added a variable file to show where each record comes from. DLMSTR doesn't seem to do anything. Are you trying to keep the TERMSTR on the records in TEXT?
data file;
length file filenm $64;
retain file;
infile '~/t00*.dat' eov=eov end=end dlmstr='//' flowover filename=filenm;
length text $32767;
retain text;
input;
if _n_ eq 1 or eov then do;
if eov then output;
file=filenm;
eov=0;
call missing(text);
end;
text=cats(text,_infile_);
if end then output;
run;
This version reads the file 32767 bytes at a time TERMSTRs and all into an array of TEXT variables. This will capture all text for files that may be larger than 32767. Adjust array diminsion as needed.
data file;
length file filenm $64;
retain file;
array text[10] $32767;
retain text;
infile '~/*.dat' eov=eov end=end filename=filenm recfm=F lrecl=32767;
input;
*list;
if _n_ eq 1 or eov then do;
if eov then output;
file=filenm;
eov=0;
call missing(of i text[*]);
end;
i + 1;
text[i]=_infile_;
if end then output;
run;
Thanks! That also working great. As you said, it is working for texts above 32K chars.
/* T1003210 load a file in one character variable(mystring) of length 128,004.
Associated problem. We really need to what the op wants to do with the
array of 32k chunks.
I load a file in one character variable(mystring) of length 128,004.
This type of problem is best solved with Perl or Python.
I present an R solution because I am a little more skilled in R.
inspired
https://goo.gl/eOu829
https://communities.sas.com/t5/SAS-Procedures/How-to-import-100-TXT-files-to-100-Rows-in-one-dataset/m-p/343429
HAVE (A file with two 64000 byte records)
=========================================
FILE: d:/txt/fyl1
RECORD 1: 64002 bytes first recrord (crlf once)
RECORD 2: 128004 bytes full file (crlf twice)
WANT ( load the the file both records into one character variable of length 128,004 bytes)
==========================================================================================
Pull some substrings out
1] "FILE LENGTH"
1] 128004
1] "substr(mystring,60001,60010)"
1] "1234567890"
1] "substr(mystring,30001,30010)"
1] "1234567890"
* _ _ _
_ __ ___ __ _| | _____ __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| | < __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___| \__,_|\__,_|\__\__,_|
;
data _null_;
length txt1 txt2 $32000;
txt1=repeat('1234567890',3199);
txt2=txt1;
do fyls=1 to 1;
fylvar=cats('d:/txt/fyl',put(fyls,1.));
file dummy filevar=fylvar lrecl=64000 recfm=v;
put txt1 +(-1) txt2;
put txt1 +(-1) txt2;
end;
run;quit;
NOTE: 2 records were written to the file DUMMY.
The minimum record length was 64000.
*____
| _ \
| |_) |
| _ <
|_| \_\
;
%utl_submit_r64('
library(readr);
mystring <- read_file("d:/txt/fyl1");
"FILE LENGTH";
nchar(mystring);
"substr(mystring,60001,60010)";
substr(mystring,60001,60010);
"substr(mystring,30001,30010)";
substr(mystring,30001,30010);
');
1] "FILE LENGTH"
1] 128004
1] "substr(mystring,60001,60010)"
1] "1234567890"
1] "substr(mystring,30001,30010)"
1] "1234567890"
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.