Help using Base SAS procedures

How to import 100 TXT files to 100 Rows in one dataset?

Accepted Solution Solved
Reply
Contributor
Posts: 23
Accepted Solution

How to import 100 TXT files to 100 Rows in one dataset?

Greetings,

 

1. I would like to import 100 TXT files to 100 rows in one dataset.

2. To read 1 file to 1 row in 1 Dataset, I can use the following code:

 

data test;
length text $32767;
retain text '';
infile "C:\Test.txt" flowover dlmstr='//' end=last;
input;
text=cats(text,_infile_);
if last then output;
run;

 

3. However, I would like to run import 100 TXT files (from the same directory) to 100 rows in one dataset.

 

Thanks in advance!

 

D


Accepted Solutions
Solution
‎03-22-2017 03:10 PM
Respected Advisor
Posts: 3,777

Re: How to import 100 TXT files to 100 Rows in one dataset?

This use wildcard in the INFILE statement.

 

I added a variable file to show where each record comes from.  DLMSTR doesn't seem to do anything.  Are you trying to keep the TERMSTR on the records in TEXT?

 

data file;
   length file filenm $64;
   retain file;
   infile '~/t00*.dat' eov=eov end=end dlmstr='//' flowover filename=filenm;
   length text $32767;
   retain text;
   input;
   if _n_ eq 1 or eov then do;      
      if eov then output;
      file=filenm;
      eov=0;
      call missing(text);
      end;
   text=cats(text,_infile_);
   if end then output;
   run;

View solution in original post


All Replies
Super User
Posts: 10,497

Re: How to import 100 TXT files to 100 Rows in one dataset?

Do your txt files have column headings?

Are you absolutely sure that 32767 characters will hold the entire contents of each file?

 

did you try?

 

infile "C:\path:\*.txt" flowover dlmstr='//' end=last;

 

where path is the path to the folder

 

This will attempt to read all of the files with txt as the extension in the folder. If there is a more specific pattern to the file names you need to extract then more details on the names is needed.

Contributor
Posts: 23

Re: How to import 100 TXT files to 100 Rows in one dataset?

Thanks a lot for your fast reply..Changing the script the way you suggested, imported all the text files to the final row of the dataset, so it is not exactly what I looked for. Thanks anyway for your great explanation Smiley Happy
Solution
‎03-22-2017 03:10 PM
Respected Advisor
Posts: 3,777

Re: How to import 100 TXT files to 100 Rows in one dataset?

This use wildcard in the INFILE statement.

 

I added a variable file to show where each record comes from.  DLMSTR doesn't seem to do anything.  Are you trying to keep the TERMSTR on the records in TEXT?

 

data file;
   length file filenm $64;
   retain file;
   infile '~/t00*.dat' eov=eov end=end dlmstr='//' flowover filename=filenm;
   length text $32767;
   retain text;
   input;
   if _n_ eq 1 or eov then do;      
      if eov then output;
      file=filenm;
      eov=0;
      call missing(text);
      end;
   text=cats(text,_infile_);
   if end then output;
   run;
Contributor
Posts: 23

Re: How to import 100 TXT files to 100 Rows in one dataset?

Thanks so much! This is working just great Smiley Happy Appreciate your fast reply Smiley Happy
Respected Advisor
Posts: 3,777

Re: How to import 100 TXT files to 100 Rows in one dataset?

[ Edited ]

This version reads the file 32767 bytes at a time TERMSTRs and all into an array of TEXT variables.  This will capture all text for files that may be larger than 32767.  Adjust array diminsion as needed.

 

data file;
   length file filenm $64;
   retain file;
   array text[10] $32767;
   retain text;
   infile '~/*.dat' eov=eov end=end filename=filenm recfm=F lrecl=32767;
   input;
   *list;
   if _n_ eq 1 or eov then do;      
      if eov then output;
      file=filenm;
      eov=0;
      call missing(of i text[*]);
      end;
   i + 1;
   text[i]=_infile_;
   if end then output;
   run;

 

Contributor
Posts: 23

Re: How to import 100 TXT files to 100 Rows in one dataset?

Thanks! That also working great. As you said, it is working for texts above 32K chars.

Valued Guide
Posts: 505

Re: How to import 100 TXT files to 100 Rows in one dataset?

/* T1003210 load a file in one character variable(mystring) of length 128,004.

Associated problem. We really need to what the op wants to do with the 
array of 32k chunks. 


I load a file in one character variable(mystring) of length 128,004.

This type of problem is best solved with Perl or Python.
I present an R solution because I am a little more skilled in R.

inspired
https://goo.gl/eOu829
https://communities.sas.com/t5/SAS-Procedures/How-to-import-100-TXT-files-to-100-Rows-in-one-dataset/m-p/343429

HAVE (A file with two 64000 byte records)
=========================================

FILE:  d:/txt/fyl1

RECORD 1:    64002   bytes  first recrord  (crlf once)
RECORD 2:   128004   bytes  full file (crlf twice)

WANT ( load the the file both records into one character variable of length 128,004 bytes)
==========================================================================================

Pull some substrings out

   1] "FILE LENGTH"
   1] 128004

   1] "substr(mystring,60001,60010)"
   1] "1234567890"

   1] "substr(mystring,30001,30010)"
   1] "1234567890"

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;

data _null_;
  length txt1 txt2 $32000;
  txt1=repeat('1234567890',3199);
  txt2=txt1;
  do fyls=1 to 1;
    fylvar=cats('d:/txt/fyl',put(fyls,1.));
    file dummy filevar=fylvar lrecl=64000 recfm=v;
    put txt1 +(-1) txt2;
    put txt1 +(-1) txt2;
  end;
run;quit;

NOTE: 2 records were written to the file DUMMY.
      The minimum record length was 64000.

*____
|  _ \
| |_) |
|  _ <
|_| \_\

;


%utl_submit_r64('
library(readr);
mystring <- read_file("d:/txt/fyl1");
"FILE LENGTH";
nchar(mystring);
"substr(mystring,60001,60010)";
substr(mystring,60001,60010);
"substr(mystring,30001,30010)";
substr(mystring,30001,30010);
');


1] "FILE LENGTH"
1] 128004

1] "substr(mystring,60001,60010)"
1] "1234567890"

1] "substr(mystring,30001,30010)"
1] "1234567890"
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 605 views
  • 1 like
  • 4 in conversation