Hello, I'm trying to read a directory of text files into a data set that holds each file name in one variable and the entire content of the respective text file in another variable. Let's consider this example:
1. A directory on my server, /data/textfiles/, has these contents:
textfile1.txt
textfile2.txt
textfile3.txt
2. I would like to create a dataset that looks like this:
  | fname         | content
1 | textfile1.txt | This is the entire text in this file. Line breaks might be deleted or replaced with special characters.
2 | textfile2.txt | This is the entire text in this file. Line breaks might be deleted or replaced with special characters.
3 | textfile3.txt | This is the entire text in this file. Line breaks might be deleted or replaced with special characters.
I've tried to program this with dread(), fread(), fget() and so on, but haven't been successful. This is my current attempt:
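%let directory=/data/textfiles/;
data files;
  /* declare the character variables up front; a blank fref lets filename() generate a fileref */
  length fref $8 fname $256 fpath $512 content $32767;
  error_dir = filename(fref,"&directory");
  dir_id = dopen(fref);
  do i = 1 to dnum(dir_id);
    fname = dread(dir_id,i);
    /* &directory already ends in a slash, so no extra separator is needed */
    fpath = cats("&directory",fname);
    error_file = filename("thefile",fpath);
    file_id = fopen("thefile");
    fread_error = fread(file_id);        /* loads one record into the file data buffer */
    fget_error = fget(file_id,content);  /* copies data from the buffer into content */
    fclose_error = fclose(file_id);
    output;
  end;
  dclose_error = dclose(dir_id);
  keep fname content;
run;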
However, what I'm getting is just the first few characters of each file. As far as I can tell it always comes from the first line, i.e. line breaks seem to be treated as separators and fget() only takes the first column from each opened file. The documentation for fget() is pretty thin, and I don't see how to change the way data from the file are written to the data set.
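For reference, here is a minimal sketch of the loop-based variant I would expect to need, based on my reading that fread() loads one record at a time into the file data buffer and that fget() without a length argument stops at the first delimiter (a blank by default). The helper variables line and rc, the 32767-byte lengths, the explicit fopen() arguments and the choice to join lines with a single blank are all my own assumptions, and I have not verified this on my server:

```sas
%let directory=/data/textfiles/;

data files;
  /* 32767 bytes is the maximum length of a SAS character variable,
     so files longer than that will not fit into content (assumption:
     the files are small enough). */
  length fref $8 fname $256 fpath $512 line content $32767;
  keep fname content;

  /* A blank fref lets filename() generate a fileref for the directory. */
  rc = filename(fref, "&directory");
  dir_id = dopen(fref);

  if dir_id > 0 then do;
    do i = 1 to dnum(dir_id);
      fname = dread(dir_id, i);
      fpath = cats("&directory", fname);
      rc = filename("thefile", fpath);

      /* Input mode, record length 32767, variable-length records. */
      file_id = fopen("thefile", "i", 32767, "v");
      content = "";

      if file_id > 0 then do;
        /* fread() returns 0 as long as another record could be read. */
        do while (fread(file_id) = 0);
          line = "";
          /* With an explicit length, blanks are treated as data
             instead of as delimiters. */
          rc = fget(file_id, line, 32767);
          /* Join the lines with one blank; empty lines are skipped. */
          content = catx(" ", content, line);
        end;
        rc = fclose(file_id);
      end;

      output;
    end;
    rc = dclose(dir_id);
  end;
run;
```

If the directory also contains subdirectories, I assume fopen() would fail for them and the row would be written with an empty content value.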