Hi,
I am running a SAS program in the cloud (which I believe is a Unix/Linux system). Basically, I want to import and clean every CSV file in a directory. All the file names start with log. I found this support note, which works on my local computer: http://support.sas.com/kb/41/880.html However, when I run it in the cloud, it does not work. My code in the cloud is as follows:
filename DIRLIST pipe 'dir "/scratch/dg/log/SAS/log*.csv" /b ';
data dirlist ;
infile dirlist lrecl=200 truncover;
input file_name $100.;
run;
data edgar.dirlist;set dirlist;run;
data _null_;
set dirlist end=end;
count+1;
call symputx('read'||put(count,4.-l),cats('/scratch/dg/log/SAS/',file_name));
call symputx('dset'||put(count,4.-l),scan(file_name,1,'.'));
if end then call symputx('max',count);
run;
options mprint symbolgen;
%macro readin;
%do i=1 %to &max;
data seclog;
infile "&&read&i" delimiter = ',' MISSOVER
DSD lrecl=32767 firstobs=2 ;
informat ip $15. ;
informat date yymmdd10. ;
informat time anydtdtm40. ;
format ip $15. ;
format date yymmdd10. ;
format time datetime. ;
input
ip $ date time ;
run;
%end;
%mend readin;
%readin;
run;
The error message is as follows:
MPRINT(READIN): infile "/scratch/dg/log/SAS/dir: cannot access '/scratch/dg/log/SAS/log*.csv': No such file
or directory" delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
When I opened dirlist, I found that the variable is named file_name, but its value is:
dir: cannot access '/scratch/dg/log/SAS/log*.csv': No such file or directory
I would appreciate it very much if someone can help here.
DIR is a Windows command, not Unix.
Do you have pipe access?
Filename filelist pipe "ls /scratch/dg/log/SAS/log*.csv ";
Data dirList;
Infile filelist truncover;
Input filename $100.;
Run;
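Note that ls prints each name exactly as the shell expanded it, so every value of filename already contains the full path. If you keep the macro loop, the step that builds the macro variables should not prepend the directory again. A minimal sketch, assuming the dirList data set and filename variable from the step above and the read/dset/max macro variable names from the original post:

```sas
data _null_;
  set dirList end=end;
  count + 1;
  /* filename already holds the full path returned by ls */
  call symputx('read' || put(count, 4.-l), filename);
  /* take the last path component, then drop the .csv extension */
  call symputx('dset' || put(count, 4.-l), scan(scan(filename, -1, '/'), 1, '.'));
  if end then call symputx('max', count);
run;
```

The %readin macro can then use "&&read&i" unchanged.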
However, if all the files have the same layout and you want a single file at the end this is much easier.
data sec_log;
*make sure variables to store file name are long enough;
length filename txt_file_name $256;
informat ip $15. ;
informat date yymmdd10. ;
informat time anydtdtm40. ;
format ip $15. ;
format date yymmdd10. ;
format time datetime. ;
*keep file name from record to record;
retain txt_file_name;
*Use wildcard in input;
infile "/scratch/dg/log/SAS/log*.csv" dsd eov=eov filename=filename truncover;
*Input first record and hold line;
input@;
*Check if this is the first record or the first record in a new file;
*If it is, replace the filename with the new file name and move to next line;
if _n_ eq 1 or eov then do;
txt_file_name = scan(filename, -1, "/");
eov=0;
delete;
end;
*Otherwise go to the import step and read the files;
else input ip $ date time ;
run;
If the "/scratch/" folder is not at the root of the path, you need to provide the full path.
The path has to be as the computer running the DIR command sees it.
Instead of a data _null_ with a bunch of CALL SYMPUTX calls, as in this step
data _null_;
set dirlist end=end;
count+1;
call symputx('read'||put(count,4.-l),cats('/scratch/dg/log/SAS/',file_name));
call symputx('dset'||put(count,4.-l),scan(file_name,1,'.'));
if end then call symputx('max',count);
run;
I would suggest assigning the values of put(count,4.-l) etc. to actual variables and then looking at the values of those variables. You might find some odd values, depending on which version of DIR is involved.
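As a rough sketch of that check, assuming the dirlist data set and file_name variable from the original post (check_names, read and dset are just illustrative names):

```sas
data check_names;
  set dirlist;
  length read $300 dset $100;
  count + 1;
  /* same expressions the CALL SYMPUTX step uses, kept as variables */
  read = cats('/scratch/dg/log/SAS/', file_name);
  dset = scan(file_name, 1, '.');
run;

proc print data=check_names;
run;
```

If read contains an error message instead of a path, the problem is in the directory listing, not in the import macro.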
/scratch/ is the root of the path. Several earlier steps in my program import data from a folder in /scratch, and they work.
Run this for a test:
data files;
length dref $8 name $200;
rc = filename(dref,"/scratch/dg/log/SAS");
did = dopen(dref);
if did
then do;
do i = 1 to dnum(did);
name = dread(did,i);
output;
end;
rc = dclose(did);
end;
else putlog "Directory can't be opened";
rc = filename(dref);
keep name;
run;
See which names, if any, you get, or if you get the error message in the log.
It sounds like the directory you are trying to search is not on the machine where your SAS code is running.
Also, DIR is not really a Unix command. Many Unix versions have implemented something like it, but it does not support the Windows-style /b option.
Example:
>dir test/*log /b
dir: cannot access /b: No such file or directory
test/aaabatch_test1.log test/endsas.log test/m6.log test/sasver.log test/where_in.log
Here is what you get if the directory does not exist (or you cannot read from it).
>dir /scratch/nosuchdir/*.log /b
dir: cannot access /scratch/nosuchdir/*.log: No such file or directory
dir: cannot access /b: No such file or directory
So take two immediate steps.
1) Remove the /b
2) Figure out whether the directory you are trying to read the files from is actually available on the machine where SAS is running and, if it is, what its actual name is.
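One way to check the second point from inside SAS is sketched below. It assumes the directory path from the original post, writes the host name and operating system of the SAS session to the log, and then tests whether that session can see the directory:

```sas
/* Show which machine and operating system the SAS session runs on */
%put NOTE: SAS is running on &syshostname (&sysscp &sysscpl);

data _null_;
  /* FILEEXIST returns 1 if the path is visible to this SAS session */
  exists = fileexist("/scratch/dg/log/SAS");
  if exists then putlog "NOTE: directory is visible to this SAS session";
  else putlog "WARNING: directory is not visible to this SAS session";
run;
```

If the warning appears, the path needs to be checked on the server where SAS runs.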