So the main error is in the approach. Trying to find the directory that a file lives in from its name alone is going to be impossible because the same filename can appear in more than one directory.
I doubt you need more file attributes than what the %DIRTREE() macro can get from the SAS functions.
But if you do, then the LS command WILL tell you the path. The lines that end with a colon give the path for the files that follow. You just need to read it and remember it.
Example:
>ls -lR | egrep -e ':$|csv'
.:
-r--r--r-- 1 xxxxxx group1 20502 Nov 16 2023 csv2ds.sas
-rw-r--r-- 1 xxxxxx group1 9395 Jul 2 2008 csvfile.sas
-rw-r--r-- 1 xxxxxx group1 3749 Oct 22 2021 csv_na.sas
-rw-rw-r-- 1 xxxxxx group1 2096 Jun 12 2021 csv_vnext.sas
-r--r--r-- 1 xxxxxx group1 4550 Nov 7 2023 safe_ds2csv.sas
./draft:
-rw-rw-r-- 1 xxxxxx group1 14953 Oct 7 2008 csvfile.sas
-rw-rw-r-- 1 xxxxxx group1 5964 Jul 24 2006 isam.csv
-rw-rw-r-- 1 xxxxxx group1 743 Jul 29 2014 sas2csv.sas
./draft/RCS:
./RCS:
-r--r--r-- 1 xxxxxx group1 31049 Nov 16 2023 csv2ds.sas,v
-r--r--r-- 1 xxxxxx group1 9545 Jul 2 2008 csvfile.sas,v
-r--r--r-- 1 xxxxxx group1 3897 Oct 22 2021 csv_na.sas,v
-r--r--r-- 1 xxxxxx group1 5770 Nov 7 2023 safe_ds2csv.sas,v
So something along these lines:
filename ls temp;
options parmcards=ls;
parmcards4;
.:
-r--r--r-- 1 user1 group1 20502 2023-11-16 10:38 csv2ds.sas
-rw-r--r-- 1 user1 group1 9395 2008-07-02 11:49 csvfile.sas
-rw-r--r-- 1 user1 group1 3749 2021-10-22 13:42 csv_na.sas
-rw-rw-r-- 1 user1 group1 2096 2021-06-12 11:45 csv_vnext.sas
-r--r--r-- 1 user1 group1 4550 2023-11-07 18:55 safe_ds2csv.sas
./draft:
-rw-rw-r-- 1 user1 group1 14953 2008-10-07 13:05 csvfile.sas
-rw-rw-r-- 1 user1 group1 5964 2006-07-24 12:17 isam.csv
-rw-rw-r-- 1 user1 group1 743 2014-07-29 20:42 sas2csv.sas
./draft/RCS:
./RCS:
-r--r--r-- 1 user1 group1 31049 2023-11-16 10:39 csv2ds.sas,v
-r--r--r-- 1 user1 group1 9545 2008-07-02 11:48 csvfile.sas,v
-r--r--r-- 1 user1 group1 3897 2021-10-22 13:42 csv_na.sas,v
-r--r--r-- 1 user1 group1 5770 2023-11-07 18:56 safe_ds2csv.sas,v
;;;;
data files;
length permission $11 links 8 user group $32 size date time 8 filename path $200;
retain path;
infile ls truncover;
input @;
if ':'=char(_infile_,length(_Infile_)) then do;
path = substr(_infile_,1,length(_infile_)-1);
delete;
end;
input permission links user group size date :yymmdd. time :time. filename $char200.;
format date yymmdd10. time tod5. ;
run;
proc print;
run;
When you mentioned that the directory was available from the ls command, you piqued my interest.
So instead of using a script from one of our SAS admins, I decided to read the contents of the infile myself.
And indeed, you were right.
So here's my SAS code:
%let dir=/.../sasdata/;
libname dest1 base "/.../data";
filename oscmd1 pipe "ls -Rla --time-style=long-iso ~ &dir. ";
Data dest1.sasfiles1 ;
length response $1000.;
infile oscmd1 ;
input ;
response= _infile_;
run;
data sasfile2 specialcases;
set dest1.sasfiles1;
if find(response,'total') > 0 then delete;
else if find(response,'cannot access') > 0 then output specialcases;
else if find(response,'?') > 0 then output specialcases;
else if response = '' then delete;
else output sasfile2;
run;
data sasfile2;
set sasfile2 /*(firstobs=1 obs=10)*/;
run;
data files (drop= pos pos2 response);
length permission $11 links pos 8 user group filetype $10 size 8 folderorfname $100 response $1000;
retain path;
set sasfile2;
if ':'=char(response,length(response)) then do;
path = substr(response,1,length(response)-1);
delete;
end;
if '.'=char(response,length(response)) then delete;
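pos=find(response,':'); /* first colon is the HH:MM separator of the long-iso time */
pos2=length(response);
/* keep the line only if a name follows "HH:MM " */
if pos > 0 and pos+4 <= pos2 then
do;
pos=pos+4; /* skip the minutes and the blank to reach the first character of the name */
folderorfname =substr(response,pos);
end;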
else
do;
delete;
end;
creation_time=scan(response,7, '');
creation_date=scan(response,6, '');
size=scan(response,5, '');
group=scan(response,4,'');
user=scan(response,3,'');
links=scan(response,2,'');
permission=scan(response,1,'');
if user eq 'root' then delete;
if find(folderorfname,'.') > 0 then filetype = 'file';
else if find(folderorfname,'.') = 0 then filetype = 'folder';
run;
Now that you have figured it out you can collapse your logic into a single data step.
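A minimal sketch of what such a single step could look like, assuming &dir is already set as in the code above and that ls is GNU ls (for --time-style=long-iso). The data set name FILES and the variable names are only illustrative. Two things differ from the code above and are my choices, not requirements: the name is located with CALL SCAN (everything from the 8th blank-delimited word onward) rather than by counting from the colon, and the file/folder flag is taken from the first character of the permission string rather than from the presence of a period in the name.

```sas
/* Sketch only: assumes &dir was defined earlier and GNU ls is available */
filename oscmd1 pipe "ls -Rla --time-style=long-iso &dir.";

data files (drop=pos len);
  length path $1000 permission $11 links 8 user group $32
         size date time 8 folderorfname $256 filetype $10;
  retain path;
  infile oscmd1 truncover;
  input;                                    /* load the line into _INFILE_ */

  /* skip blank lines, "total nnn" lines and entries ls could not stat */
  if _infile_ = ' ' then delete;
  if _infile_ =: 'total ' then delete;
  if find(_infile_,'cannot access') or find(_infile_,'?') then delete;

  /* a line ending in a colon names the directory for the lines that follow */
  if char(_infile_,length(_infile_)) = ':' then do;
    path = substr(_infile_,1,length(_infile_)-1);
    delete;
  end;

  permission = scan(_infile_,1,' ');
  links      = input(scan(_infile_,2,' '), ?? 32.);
  user       = scan(_infile_,3,' ');
  group      = scan(_infile_,4,' ');
  size       = input(scan(_infile_,5,' '), ?? 32.);
  date       = input(scan(_infile_,6,' '), ?? yymmdd10.);
  time       = input(scan(_infile_,7,' '), ?? time5.);

  /* the name is the rest of the line from the 8th word on (keeps embedded blanks) */
  call scan(_infile_, 8, pos, len, ' ');
  if pos = 0 then delete;
  folderorfname = substr(_infile_, pos);
  if folderorfname in ('.','..') then delete;

  if user = 'root' then delete;             /* same filter as in the code above */

  if permission =: 'd' then filetype = 'folder';
  else filetype = 'file';

  format date yymmdd10. time tod5.;
run;
```

Note that the date and time read this way are the last-modified values that ls reports.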
But how important was it to get the information that the %DIRTREE() could not retrieve for you?
Output dataset structure
--NAME-- Len   Format     Description
FILENAME $256             Name of file in directory
TYPE     $1               File or Directory? (F/D)
SIZE     8     COMMA20.   Filesize in bytes
DATE     4     YYMMDD10.  Date file last modified
TIME     4     TOD8.      Time of day file last modified
DEPTH    3                Tree depth
PATH     $256             Directory name
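For comparison, here is a rough sketch of the kind of SAS directory and file functions that can return such attributes without any operating system command. It is not the %DIRTREE() source, which I am not reproducing here; the path '.' and the data set and variable names are placeholders. It lists one directory (not recursive) and, for each plain file, writes every attribute that FINFO() exposes. The attribute names come from FOPTNAME() at run time because they vary by operating system and session locale.

```sas
/* Sketch only: '.' is a placeholder path, ATTRS and its variables are illustrative */
data attrs (keep=name type attr value);
  length dref fref $8 name $256 type $1 attr $64 value $256;
  dref = ' ';                                /* blank: let SAS generate a fileref */
  fref = ' ';
  rc  = filename(dref, '.');
  did = dopen(dref);
  if did > 0 then do;
    do i = 1 to dnum(did);
      name = dread(did, i);
      rc  = filename(fref, catx('/', '.', name));
      mid = dopen(fref);                     /* succeeds only for a directory */
      if mid > 0 then do;
        type = 'D';
        attr = ' ';
        value = ' ';
        output;
        rc = dclose(mid);
      end;
      else do;
        type = 'F';
        fid = fopen(fref);                   /* 0 if the file cannot be opened */
        if fid > 0 then do;
          do j = 1 to foptnum(fid);
            attr  = foptname(fid, j);        /* attribute name as SAS reports it */
            value = finfo(fid, attr);
            output;
          end;
          rc = fclose(fid);
        end;
      end;
      rc = filename(fref);                   /* clear the member fileref */
    end;
    rc = dclose(did);
  end;
  rc = filename(dref);                       /* clear the directory fileref */
run;
```

A recursive version would repeat the same loop for every member flagged as 'D'.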
Also keep in mind that UNIX traditionally does not keep creation timestamps; the date and time that ls -l shows are those of the last modification.
If the first creation of a file is important, the relevant software has to keep it in the file or a metadata repository.
You only process 200 files because of this:
data sasfile;
rownum = _n_;
set sasfile (firstobs=1 obs=200);
run;
Good Morning Mr Bremser,
Good point. However, how should I interpret the notes in the log below, which first list the pipe commands and then the number of records read and the record lengths? Does it mean that the last pipe command did not work properly?
NOTE: The infile DUMMY is:
Pipe command="find /dwh_actuariat/sasdata/ -iname "be_auto_prmaou2013.dpf.00010a49.7.1.spds9" -newermt 2016-07-19 !
-newermt 2016-07-20 2>&1"
20 The SAS System 14:12 Thursday, October 31, 2024
NOTE: The infile DUMMY is:
Pipe command="find /dwh_actuariat/sasdata/ -iname "be_auto_prmaou2013.dpf.00010a49.13.1.spds9" -newermt 2016-07-19 !
-newermt 2016-07-20 2>&1"
NOTE: The infile DUMMY is:
Pipe command="find /dwh_actuariat/sasdata/ -iname "be_auto_prmaou2013.dpf.00010a49.2.1.spds9" -newermt 2016-07-19 !
-newermt 2016-07-20 2>&1"
NOTE: The infile DUMMY is:
Pipe command="find /dwh_actuariat/sasdata/ -iname "be_auto_prmaou2013.dpf.00010a49.6.1.spds9" -newermt 2016-07-19 !
-newermt 2016-07-20 2>&1"
NOTE: The infile DUMMY is:
Pipe command="find /dwh_actuariat/sasdata/ -iname "be_auto_prmaou2013.dpf.00010a49.8.1.spds9" -newermt 2016-07-19 !
-newermt 2016-07-20 2>&1"
NOTE: The infile DUMMY is:
Pipe command="find /dwh_actuariat/sasdata/ -iname "be_auto_prmaou2013.dpf.00010a49.0.1.spds9" -newermt 2016-07-19 !
-newermt 2016-07-20 2>&1"
NOTE: 29 records were read from the infile DUMMY.
The minimum record length was 80.
The maximum record length was 111.
NOTE: 29 records were read from the infile DUMMY.
The minimum record length was 89.
The maximum record length was 111.
NOTE: 29 records were read from the infile DUMMY.
The minimum record length was 80.
The maximum record length was 111.
NOTE: 29 records were read from the infile DUMMY.
The DATA step will first list all different values of the FILEVAR to the log, and then report the statistics for each separate read, like this:
data files;
do i = 1 to 2;
fname = "~/import/x" !! put(i,z3.) !! ".txt";
output;
end;
run;
data _null_;
set files;
file dummy dsd filevar=fname;
do i = 1 to num;
set sashelp.class point=i nobs=num;
put name sex age;
end;
run;
data all;
set files;
infile dummy dsd filevar=fname end=eof;
do until (eof);
input name $ sex $ age;
output;
end;
run;
Log:
69 data files;
70 do i = 1 to 2;
71 fname = "~/import/x" !! put(i,z3.) !! ".txt";
72 output;
73 end;
74 run;
NOTE: The data set WORK.FILES has 2 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 662.46k
OS Memory 20132.00k
Timestamp 01.11.2024 02:07:44 PM
Step Count 38 Switch Count 2
Page Faults 0
Page Reclaims 120
Page Swaps 0
Voluntary Context Switches 10
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
75
76 data _null_;
77 set files;
78 file dummy dsd filevar=fname;
79 do i = 1 to num;
80 set sashelp.class point=i nobs=num;
81 put name sex age;
82 end;
83 run;
NOTE: The variable i exists on an input data set, but was also specified in an I/O statement option. The variable will not be included on any output data set.
NOTE: The variable fname exists on an input data set, but was also specified in an I/O statement option. The variable will not be included on any output data set.
NOTE: The file DUMMY is:
Filename=/home/kurt.bremser/import/x001.txt,
Owner Name=kurt.bremser,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=01 November 2024 15:07
NOTE: The file DUMMY is:
Filename=/home/kurt.bremser/import/x002.txt,
Owner Name=kurt.bremser,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=01 November 2024 15:07
NOTE: 19 records were written to the file DUMMY.
The minimum record length was 9.
The maximum record length was 12.
NOTE: 19 records were written to the file DUMMY.
The minimum record length was 9.
The maximum record length was 12.
NOTE: There were 2 observations read from the data set WORK.FILES.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 916.25k
OS Memory 20388.00k
Timestamp 01.11.2024 02:07:44 PM
Step Count 39 Switch Count 0
Page Faults 0
Page Reclaims 89
Page Swaps 0
Voluntary Context Switches 17
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 16
84
85 data all;
86 set files;
87 infile dummy dsd filevar=fname end=eof;
88 do until (eof);
89 input name $ sex $ age;
90 output;
91 end;
92 run;
NOTE: The variable fname exists on an input data set, but was also specified in an I/O statement option. The variable will not be included on any output data set.
NOTE: The infile DUMMY is:
Filename=/home/kurt.bremser/import/x001.txt,
Owner Name=kurt.bremser,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=01 November 2024 15:07,
File Size (bytes)=217
NOTE: The infile DUMMY is:
Filename=/home/kurt.bremser/import/x002.txt,
Owner Name=kurt.bremser,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=01 November 2024 15:07,
File Size (bytes)=217
NOTE: 19 records were read from the infile DUMMY.
The minimum record length was 9.
The maximum record length was 12.
NOTE: 19 records were read from the infile DUMMY.
The minimum record length was 9.
The maximum record length was 12.
NOTE: There were 2 observations read from the data set WORK.FILES.
NOTE: The data set WORK.ALL has 38 observations and 4 variables.