Greetings,
1. I would like to import 100 TXT files to 100 rows in one dataset.
2. To read 1 file to 1 row in 1 Dataset, I can use the following code:
data test;
  length text $32767;
  retain text '';
  infile "C:\Test.txt" flowover dlmstr='//' end=last;
  input;
  text=cats(text,_infile_);
  if last then output;
run;
3. However, I would like to import 100 TXT files (from the same directory) to 100 rows in one dataset.
Thanks in advance!
D
This uses a wildcard in the INFILE statement.
I added a variable, file, to show which file each record comes from. DLMSTR doesn't seem to do anything. Are you trying to keep the TERMSTR on the records in TEXT?
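data file;
  length file filenm $64;
  retain file;
  /* The wildcard reads every matching file; EOV flags the first record of each new file */
  infile '~/t00*.dat' eov=eov end=end dlmstr='//' flowover filename=filenm;
  length text $32767;
  retain text;
  input;
  if _n_ eq 1 or eov then do;
    if eov then output;   /* write the row for the file that just finished */
    file=filenm;          /* remember the name of the file now being read */
    eov=0;                /* EOV is not reset automatically */
    call missing(text);
  end;
  text=cats(text,_infile_);   /* append the current line to the accumulated text */
  if end then output;         /* write the row for the last file */
run;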
Do your txt files have column headings?
Are you absolutely sure that 32767 characters will hold the entire contents of each file?
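One way to answer the length question before loading anything is to total the characters that CATS would keep for each file. The step below is only a sketch that reuses the EOV pattern from the step above. The data set name file_sizes, the variable nchars, and the path "C:\path\*.txt" are placeholders, not names from the original post. CATS strips leading and trailing blanks from each line, so the sketch counts the stripped length of every line.

```sas
data file_sizes;
  length file filenm $260;
  retain file;
  /* placeholder path: point the wildcard at the real folder */
  infile "C:\path\*.txt" eov=eov end=end filename=filenm lrecl=32767;
  input;
  if _n_ eq 1 or eov then do;
    if eov then output;   /* total for the file that just finished */
    file=filenm;
    eov=0;
    nchars=0;
  end;
  /* sum statement: nchars is retained across records */
  nchars + lengthn(strip(_infile_));
  if end then output;     /* total for the last file */
  keep file nchars;
run;

/* list the files whose text would not fit in a $32767 variable */
proc print data=file_sizes;
  where nchars > 32767;
run;
```

Any single line longer than 32767 bytes is truncated at LRECL, so the total for such a file is a lower bound rather than an exact count.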
Did you try:
infile "C:\path\*.txt" flowover dlmstr='//' end=last;
where path is the path to the folder?
This will attempt to read all of the files in the folder that have txt as the extension. If the file names you need follow a more specific pattern, more details on the names are needed.
This version reads each file 32767 bytes at a time, TERMSTRs and all, into an array of TEXT variables. This will capture all of the text for files that may be larger than 32767 bytes. Adjust the array dimension as needed.
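data file;
  length file filenm $64;
  retain file;
  /* 10 chunks hold up to 327,670 bytes per file; raise the dimension for larger files */
  array text[10] $32767;
  retain text;
  /* RECFM=F with LRECL=32767 reads fixed 32767-byte blocks, line terminators included */
  infile '~/*.dat' eov=eov end=end filename=filenm recfm=F lrecl=32767;
  input;
  *list;
  if _n_ eq 1 or eov then do;
    if eov then output;            /* write the row for the file that just finished */
    file=filenm;
    eov=0;
    call missing(of i text[*]);    /* reset the chunk counter and every chunk */
  end;
  i + 1;                           /* the sum statement treats a missing i as 0 */
  text[i]=_infile_;
  if end then output;              /* write the row for the last file */
run;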
Thanks! That is also working great. As you said, it works for texts above 32K chars.
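One thing the chunk array allows is pulling a substring by its absolute byte position in the file, even when the substring straddles two chunks. The step below is only a sketch, assuming the data set file and the variables text1-text10 created by the array version above. The names peek, piece, start, p, c, and o are hypothetical, as is the start position of 60001.

```sas
data peek;
  set file;
  array text[10] $32767;   /* refers to the existing text1-text10 */
  length piece $10;
  start=60001;             /* hypothetical absolute position in the file */
  do p=start to start+9;
    c=ceil(p/32767);       /* chunk that holds byte p */
    o=mod(p-1,32767)+1;    /* offset of byte p inside that chunk */
    /* copy one byte at a time so a chunk boundary needs no special case */
    if c le dim(text) then substr(piece,p-start+1,1)=substr(text[c],o,1);
  end;
  keep file start piece;
run;
```

Positions past the end of a file come back as blanks, because unused chunks are set to missing when each file starts.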
/* T1003210 load a file in one character variable(mystring) of length 128,004.
Associated problem. We really need to know what the OP wants to do with the
array of 32k chunks.
I load a file in one character variable(mystring) of length 128,004.
This type of problem is best solved with Perl or Python.
I present an R solution because I am a little more skilled in R.
Inspired by
https://goo.gl/eOu829
https://communities.sas.com/t5/SAS-Procedures/How-to-import-100-TXT-files-to-100-Rows-in-one-dataset/m-p/343429
HAVE (A file with two 64000 byte records)
=========================================
FILE: d:/txt/fyl1
RECORD 1: 64002 bytes first record (crlf once)
RECORD 2: 128004 bytes full file (crlf twice)
WANT (load the file, both records, into one character variable of length 128,004 bytes)
==========================================================================================
Pull some substrings out
[1] "FILE LENGTH"
[1] 128004
[1] "substr(mystring,60001,60010)"
[1] "1234567890"
[1] "substr(mystring,30001,30010)"
[1] "1234567890"
* _ _ _
_ __ ___ __ _| | _____ __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| | < __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___| \__,_|\__,_|\__\__,_|
;
data _null_;
length txt1 txt2 $32000;
txt1=repeat('1234567890',3199);
txt2=txt1;
do fyls=1 to 1;
fylvar=cats('d:/txt/fyl',put(fyls,1.));
file dummy filevar=fylvar lrecl=64000 recfm=v;
put txt1 +(-1) txt2;
put txt1 +(-1) txt2;
end;
run;quit;
NOTE: 2 records were written to the file DUMMY.
The minimum record length was 64000.
*____
| _ \
| |_) |
| _ <
|_| \_\
;
%utl_submit_r64('
library(readr);
mystring <- read_file("d:/txt/fyl1");
"FILE LENGTH";
nchar(mystring);
"substr(mystring,60001,60010)";
substr(mystring,60001,60010);
"substr(mystring,30001,30010)";
substr(mystring,30001,30010);
');
[1] "FILE LENGTH"
[1] 128004
[1] "substr(mystring,60001,60010)"
[1] "1234567890"
[1] "substr(mystring,30001,30010)"
[1] "1234567890"