BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hello All – I’m a new SAS user and I’d like to know if there is a way to loop through a folder on a network share to identify the files within that folder that meet specific criterion.

For example, the directory path is “S:\Del\ADHOC” and I want to identify the files in that folder which have a naming convention like adhoc*.txt. (* is a wildcard so as long as the file begins with adhoc and the type is .txt then the naming convention is good)

Also, I need to identify that the files are comma delimited and that there are 5 columns formatted as such: Account $20. FieldCode $6. FieldValue $40. CurRec $4. Current $5.

I want to create a dataset that contains the names of each file contained within the directory and set a value Y or N if it met the criterion.

Is there a way to do this and if so can someone please assist with the code?

Thank you


Example Dataset:

File_Nm Success
Adhoc_Test.txt Y
Adhoc_test1.txt N
Adhoc_test3.xls N
Adhoc_Test4.txt Y Message was edited by: FlyTime88
8 REPLIES 8
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Yes, look at using FILENAME statement with PIPES, where you can supply a suitable command in quote marks for the OS to execute, such as an "ls" or "dir".

This topic has been discussed in these forums and also you will find sample material / content at the SAS support http://support.sas.com/ website. Also, the SAS support website has SAS-hosted product documentation and there will be a useful "companion" guide for your particular OS environment.

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic / post:

filename pipes os command read directory folder site:sas.com
Oleg_L
Obsidian | Level 7
Hello.
As Mr. Barry mentioned this question has been discussed many times.
There is one of these discussions:

http://support.sas.com/forums/message.jspa?messageID=51742#51742
>
http://support.sas.com/forums/message.jspa?messageID=51742#51742
Message was edited by: Oleg_L
Ksharp
Super User
Give you an example.



[pre]
%let path=c:\temp\;
filename fname pipe "dir &path.adhoc_*.txt /B";
data result;
infile fname length=len;
input filename $varying200. len;
filename=cats("&path",filename);

infile dummy filevar=filename length=len end=last;
_filename=scan(filename,-1,'\');
input row $varying200. len;

if length(scan(row,1,',')) eq 2 and
length(scan(row,2,',')) eq 6 and
length(scan(row,3,',')) eq 4 and
length(scan(row,4,',')) eq 4 and
length(scan(row,5,',')) eq 5
then success='Y';
else success='N';

run;

proc print noobs;run;











_filename row success

Adhoc_dsdq.txt 12 498983 duwr 49dw 98394 N
Adhoc_ffgfq.txt 12,498983,duwr,49dw,98394 Y
Adhoc_wq.txt 12,498983,duwr,49dw,98394 Y
Adhoc_wsaa.txt 12 498983 duwr 49dw 98394 N

[/pre]





Ksharp
Peter_C
Rhodochrosite | Level 12
FlyTime88

How far should validation of the criteria go?
Reject if there are fewer columns?
Warn if extra columns are present?
Column headers implying data present but out of order?
Column widths - too wide or narrow?
Should 10/100/1000/all rows be tested?

peterC
deleted_user
Not applicable
Thank you all for your feedback, I appreciate your time and responses!

I’ve read through the suggested links but I struggled with it since I’m beginner.

I also ran the code that KSharp posted and it sort of gave me the results that I wanted but I failed to mention that I need the code to loop through each record in the file to confirm that each record is formatted as I mentioned above.

However, rather than going down that road it may be easier if I elaborate a little more.

Today, the below code runs which creates the adhoc_imports dataset.

Is there a way to modify this script to also create a dataset that has the file name that was being imported and within that dataset have a value that sets to Y or N depending upon if the file was imported? I really just need an audit trail identifying which files were written to adhoc_imports and those that were not.

Thank you


data WORK.adhoc_imports ;
%let _EFIERR_ = 0;
infile "C:\adhoc*.txt" delimiter = "," MISSOVER DSD LRECL=350 firstobs=2 ;
informat Account $20. ;
informat FieldCode $6. ;
informat FieldValue $40. ;
informat CurRec $4. ;
informat Current $5. ;
format Account $20. ;
format FieldCode $6. ;
format FieldValue $40. ;
format CurRec $4. ;
format Current $5. ;
input
Account $
FieldCode $
FieldValue $
CurRec $
Current $ ;
if _ERROR_ then call symputx('_EFIERR_',1);
run;
Peter_C
Rhodochrosite | Level 12
> Thank you all for your feedback, I appreciate your
> time and responses!
>

> However, rather than going down that road it may be
> easier if I elaborate a little more.
>

>
> Is there a way to modify this script to also create a
> dataset that has the file name that was being
> imported and within that dataset have a value that
> sets to Y or N depending upon if the file was
> imported? I really just need an audit trail
> identifying which files were written to adhoc_imports
> and those that were not.
>
> Thank you
>
>
> data WORK.adhoc_imports( compress=yes ) ;
>
> %let _EFIERR_ = 0;
>
length fn filename $1024 ; * two holders. One for infile option and one to "keep" ;
retain filename ;
infile "C:\adhoc*.txt" MISSOVER DSD LRECL=350 firstobs=2 filename=fn eov=eov ;
* establish column lengths / types ;
length Account $20 FieldCode $6 FieldValue $40 CurRec $4 Current $5 ;
input @ ; * load a buffer to check for file change ;
if _n_ = 1 or eov then do ;
filename= fn ; * collect file name ;
**** and take any other action for the file name ;
eov = 0 ; * clear "file changed" flag at same time ;
error_count = 0 ;
end ;

input Account -- Current ; * without trailing @ this releases the input buffer;
if _ERROR_ then ERRor_count +1 ;
run ;
Ksharp
Super User
Ksharp

Message was edited by: Ksharp Message was edited by: Ksharp
Ksharp
Super User
Opps.
If you want all rows have the same length for each variable.Try the following, You need to
change it a little,That is easy.


[pre]
%let path=c:\temp\;
filename fname pipe "dir &path.adhoc_*.txt /B";
data result;
infile fname length=len;
input filename $varying200. len;
filename=cats("&path",filename);

infile dummy filevar=filename length=len end=last;
_filename=scan(filename,-1,'\');
do until(last);
input row $varying200. len;

if length(scan(row,1,',')) ne 2 or
length(scan(row,2,',')) ne 6 or
length(scan(row,3,',')) ne 4 or
length(scan(row,4,',')) ne 4 or
length(scan(row,5,',')) ne 5
then do;
success='N';
output;
leave;
end;

if last then do;
success='Y';
output;
end;
end;

run;

proc print noobs;run;








_filename row success

Adhoc_dsdq.txt 12 498983 duwr 49dw 98394 N
Adhoc_ffgfq.txt 22,438983,dfsr,efrw,92354 Y
Adhoc_wq.txt 222,438983,dfsr,efrw,92354 N
Adhoc_wsaa.txt 12 498983 duwr 49dw 98394 N

[/pre]



Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 902 views
  • 0 likes
  • 5 in conversation