Hello All – I’m a new SAS user and I’d like to know if there is a way to loop through a folder on a network share to identify the files within that folder that meet specific criterion.
For example, the directory path is “S:\Del\ADHOC” and I want to identify the files in that folder which have a naming convention like adhoc*.txt. (* is a wildcard so as long as the file begins with adhoc and the type is .txt then the naming convention is good)
Also, I need to identify that the files are comma delimited and that there are 5 columns formatted as such: Account $20. FieldCode $6. FieldValue $40. CurRec $4. Current $5.
I want to create a dataset that contains the names of each file contained within the directory and set a value Y or N if it met the criterion.
Is there a way to do this and if so can someone please assist with the code?
Message was edited by: FlyTime88
Yes, look at using FILENAME statement with PIPES, where you can supply a suitable command in quote marks for the OS to execute, such as an "ls" or "dir".
This topic has been discussed in these forums and also you will find sample material / content at the SAS support http://support.sas.com/ website. Also, the SAS support website has SAS-hosted product documentation and there will be a useful "companion" guide for your particular OS environment.
Suggested Google advanced search argument, this topic / post:
filename pipes os command read directory folder site:sas.com
How far should validation of the criteria go?
Reject if there are fewer columns?
Warn if extra columns are present?
Column headers implying data present but out of order?
Column widths - too wide or narrow?
Should 10/100/1000/all rows be tested?
Thank you all for your feedback, I appreciate your time and responses!
I’ve read through the suggested links but I struggled with it since I’m beginner.
I also ran the code that KSharp posted and it sort of gave me the results that I wanted but I failed to mention that I need the code to loop through each record in the file to confirm that each record is formatted as I mentioned above.
However, rather than going down that road it may be easier if I elaborate a little more.
Today, the below code runs which creates the adhoc_imports dataset.
Is there a way to modify this script to also create a dataset that has the file name that was being imported and within that dataset have a value that sets to Y or N depending upon if the file was imported? I really just need an audit trail identifying which files were written to adhoc_imports and those that were not.
data WORK.adhoc_imports ;
%let _EFIERR_ = 0;
infile "C:\adhoc*.txt" delimiter = "," MISSOVER DSD LRECL=350 firstobs=2 ;
informat Account $20. ;
informat FieldCode $6. ;
informat FieldValue $40. ;
informat CurRec $4. ;
informat Current $5. ;
format Account $20. ;
format FieldCode $6. ;
format FieldValue $40. ;
format CurRec $4. ;
format Current $5. ;
Current $ ;
if _ERROR_ then call symputx('_EFIERR_',1);
> Thank you all for your feedback, I appreciate your
> time and responses!
> However, rather than going down that road it may be
> easier if I elaborate a little more.
> Is there a way to modify this script to also create a
> dataset that has the file name that was being
> imported and within that dataset have a value that
> sets to Y or N depending upon if the file was
> imported? I really just need an audit trail
> identifying which files were written to adhoc_imports
> and those that were not.
> Thank you
> data WORK.adhoc_imports( compress=yes ) ;
> %let _EFIERR_ = 0;
length fn filename $1024 ; * two holders. One for infile option and one to "keep" ;
retain filename ;
infile "C:\adhoc*.txt" MISSOVER DSD LRECL=350 firstobs=2 filename=fn eov=eov ;
* establish column lengths / types ;
length Account $20 FieldCode $6 FieldValue $40 CurRec $4 Current $5 ;
input @ ; * load a buffer to check for file change ;
if _n_ = 1 or eov then do ;
filename= fn ; * collect file name ;
**** and take any other action for the file name ;
eov = 0 ; * clear "file changed" flag at same time ;
error_count = 0 ;
input Account -- Current ; * without trailing @ this releases the input buffer;
if _ERROR_ then ERRor_count +1 ;
if length(scan(row,1,',')) ne 2 or
length(scan(row,2,',')) ne 6 or
length(scan(row,3,',')) ne 4 or
length(scan(row,4,',')) ne 4 or
length(scan(row,5,',')) ne 5
if last then do;
proc print noobs;run;
_filename row success
Adhoc_dsdq.txt 12 498983 duwr 49dw 98394 N
Adhoc_ffgfq.txt 22,438983,dfsr,efrw,92354 Y
Adhoc_wq.txt 222,438983,dfsr,efrw,92354 N
Adhoc_wsaa.txt 12 498983 duwr 49dw 98394 N