DATA Step, Macro, Functions and more

How to collect files with a collection rule

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

How to collect files with a collection rule

I have three folders (say, folder1, folder2, and folder3) that contain some dbf format files.

Suppose that I have:

0345.dbf, 2345.dbf, 4444.dbf, 6767.dbf, 7623.dbf in folder1

0222.dbf, 0345.dbf in folder2

3423.dbf, 4827.dbf, 5635.dbf, 7623.dbf in folder3

So, for example, "0345.dbf"s are shown folder1 and folder2. Similarly, there are two "7623.dbf"s in folder1 and folder3. They all have the variable X, and Y. But they are all different files with different contents.

I have an additional information stored in a csv file (list.csv) with two columns as follows:

0222 f2

0345 f2

2345 f1

3423 f3

4444 f1

4827 f3

5635 f3

6767 f1

7623 f3

According to that rule, I have to pick "0345.dbf" from the folder2, not from the folder1. And "7623.dbf" from the folder3, not from the folder1.

I need to collect all the nine files and save them to a folder(say, "d:/collect") following the selection rule written in the csv file above. Then, I want to do a regression for each nine dbf file:

proc reg; model Y=X; run;

How can I implement this task in SAS? Thanks!


Accepted Solutions
Solution
‎05-17-2012 01:02 PM
Super Contributor
Posts: 1,636

Re: How to collect files with a collection rule


try:


data list;
input fname $ id $;
cards;
1122 bg
1620 bg
1660 bg
0520 tz
0240 tr
0280 tr
0560 tr
;run;

data have;
input o fname $   id $;
cards;

                        1     1122     bg   
                        2     1620     bg   
                        3     1660     bg   
                        4     0240     tr   
                        5     0280     tr   
                        6     0560     tr   
                        7     0520     tz   
                        8     1122     tz 
      ;
     

proc sql;
create table final_selection as
select a.*
from have a,list b
where a.fname=b.fname and a.id=b.id;
  quit;
proc print;
run;
                                       Obs    o    fname    id

                                        1     1    1122     bg
                                        2     2    1620     bg
                                        3     3    1660     bg
                                        4     4    0240     tr
                                        5     5    0280     tr
                                        6     6    0560     tr
                                        7     7    0520     tz

View solution in original post


All Replies
Super User
Posts: 19,855

Re: How to collect files with a collection rule

How many files do you have? You say all have X and Y, but are they the same formats? Ie is X the same type of variable in all?

Although it might be inefficient, you could just read all with the folder source and file name using filename or indsname options in the input or set statement.

Then filter through what you need with a merge file.

Contributor
Posts: 40

Re: How to collect files with a collection rule

The X, Y variables are numeric variables. There are lots of dbf files in many folders. I have to automate this process. Thank you.

Super User
Posts: 5,516

Re: How to collect files with a collection rule

I would suggest you create 3 SAS data sets:  folder1, folder2, and folder3.  If you have separate SAS data sets for each dbf file, they can still be combined.  For example:

data folder2;

   set f2_0222 (in=in1)

        f2_0345 (in=in2);

   if in1 then source='0222';

   else if in2 then source='0345';

run;

Having created 3 SAS data sets, I would transfer the control file to a SAS data set, and create a list of records to extract.  For example, assuming the SAS data set CONTROL_FILE contains DBF_FILE and FOLDER as the variable names:

proc sql noprint;

   select quote(dbf_file) into : f1 separated by ', '  from control_file where folder="f1";

   select quote(dbf_file) into : f2 separated by ', '  from control_file where folder="f2";

   select quote(dbf_file) into : f3 separated by ', '  from control_file where folder="f3";

quit;

This will create three macro variables where &F2, for example, will look like this:

"0222", "0345"

Finally, extract the proper records:

data analysis;

   set folder1 (where = (source in (&F1)))

        folder2 (where = (source in (&F2)))

        folder3 (where = (source in (&F3)));

run;

Good luck.  Check back if you get stuck at some point.

Super User
Posts: 5,516

Re: How to collect files with a collection rule

Do you know how to read a dbf file into SAS?  Once you take it that far, you'll get plenty of help with the rest of the code.

Super User
Posts: 10,041

Re: How to collect files with a collection rule

I would like to take the power of OS command.

Assuming c:\test\ contains floder1 floder2 ............

data list;
input fname $ dname $;
id=input(compress(dname,,'kd'),best8.);
cards;
0222 f2
0345 f2
2345 f1
3423 f3
4444 f1
4827 f3
5635 f3
6767 f1
7623 f3
;
run;
filename x pipe 'dir c:\test\*.txt /s /b';
data have;
infile x;
input path $100.;
fname=scan(path,-2,'.\');
id=input(compress(scan(path,-3,'.\'),,'kd'),best8.);
run;
proc sql;
 select path 
  from have 
   where fname in (select fname from list) and
         id in (select id from list);
quit;

Ksharp

Contributor
Posts: 40

Re: How to collect files with a collection rule

filename x pipe 'dir c:\test\*.txt /s /b';

This statement does not seem to collect all the path+files if files with same name are scattered across the folder1 - folder3.

Super Contributor
Posts: 1,636

Re: How to collect files with a collection rule

changing

filename x pipe 'dir c:\test\*.txt /s /b';

to

filename x pipe 'dir c:\test\*.dbf /s /b';

Contributor
Posts: 40

Re: How to collect files with a collection rule

Thank you for the reply. I already changed it. I check the Ksharp's pipe statement does not collect all the path+dbf information. If I have 0345.dbf both in folder1 and in folder3, the pipe command reads only one of two a.dbf files. What we have to do is to create the path variable, which has data values as follows:

...

c:\test\folder1\0345.dbf

...

c:\test\folder3\0345.dbf

...

But the pipe fails to collect both.

*update*

And I found that the pipe command does not collect all the dbf files in a single folder. For example, my folder1 has 13 dbf files. But after running the pipe command just for the folder1 (filename x pipe 'dir C:\test\folder1\*.dbf /s /b'Smiley Wink, I have only 6 records in the path variable.

Super Contributor
Posts: 1,636

Re: How to collect files with a collection rule

Hi,

I created three folders under c:\temp\forum and saved several .txt files.

%macro test;

%do i=1 %to 3;

filename x pipe "dir /b c:\temp\forum\folder&i.\*.txt " lrecl=100;

data folder&i ;

     folder=cats('f',&i);

     length fname $20;

     infile x length=reclen ;

     input fname $varying20. reclen ;

run;

%end;

%mend;

%test

data want;

  set folder:;

run;

proc print;run;

                            obs folder     fname

                              1       f1      123.txt

                              2       f1      345.txt

                              3       f2      123.txt

                              4       f2      888.txt

                              5       f3      123.txt

                              6       f3      888.txt

Super User
Posts: 10,041

Re: How to collect files with a collection rule

I tested it again. No problem.

c:\test\ is parent directory .

Using command

dir c:\test\*.dbf /s /b

can list all of dbf file under test directory ,include its sub-directory.

Ksharp

Super Contributor
Posts: 1,636

Re: How to collect files with a collection rule

Hi Ksharp,

when I ran the code below, it only gave me the last .txt file in a folder.

filename x pipe 'dir c:\temp\*.txt /s /b';

data have;

infile x;

input path $100.;

fname=scan(path,-2,'.\');

*id=input(compress(scan(path,-3,'.\'),,'kd'),best8.);

run;

Thanks - Linlin

Super User
Super User
Posts: 7,074

Re: How to collect files with a collection rule

Try just seeing what your command is generating.  If you are running interactive SAS just issue the command INCLUDE X from the command line.  Otherwise run this simple data step.

data _null_; infile x; input; list; run;

Super User
Posts: 10,041

Re: How to collect files with a collection rule

Ha. I make a mistake here , I must add Truncover option to avoid the wrong result.

data list;
input fname $ dname $;
id=input(compress(dname,,'kd'),best8.);
cards;
0222 f2
0345 f2
2345 f1
3423 f3
4444 f1
4827 f3
5635 f3
6767 f1
7623 f3
;
run;
filename x pipe 'dir c:\test\*.txt /s /b';
data have;
infile x truncover;
input path $100.;
fname=scan(path,-2,'.\');
id=input(compress(scan(path,-3,'.\'),,'kd'),best8.);
run;
proc sql;
 select path 
  from have 
   where fname in (select fname from list) and
         id in (select id from list);
quit;


Ksharp

Contributor
Posts: 40

Re: How to collect files with a collection rule

data list;
input fname $ id $;
cards;
1122 bg
1620 bg
1660 bg
0520 tz
0240 tr
0280 tr
0560 tr
;run;
proc contents data=list;run;

                             Alphabetic List of Variables and Attributes

                                    #    Variable    Type    Len

                                    1    fname       Char      8
                                    2    id          Char      8


filename x pipe 'dir D:\ex\sas_ex\Ksharp\*.dbf /s /b';
data have;
length fname $8. id $8.;
infile x truncover;
input path $100.;
fname=scan(path,-2,'.\');
id = scan(path,5,'\');
run;
proc contents data=have; run;

                             Alphabetic List of Variables and Attributes

                                    #    Variable    Type    Len

                                    1    fname       Char      8
                                    2    id          Char      8
                                    3    path        Char    100


proc print data=have; run;

                        Obs    fname    id                 path

                        1     1122     bg    D:\ex\sas_ex\Ksharp\bg\1122.dbf
                        2     1620     bg    D:\ex\sas_ex\Ksharp\bg\1620.dbf
                        3     1660     bg    D:\ex\sas_ex\Ksharp\bg\1660.dbf
                        4     0240     tr    D:\ex\sas_ex\Ksharp\tr\0240.dbf
                        5     0280     tr    D:\ex\sas_ex\Ksharp\tr\0280.dbf
                        6     0560     tr    D:\ex\sas_ex\Ksharp\tr\0560.dbf
                        7     0520     tz    D:\ex\sas_ex\Ksharp\tz\0520.dbf
                        8     1122     tz    D:\ex\sas_ex\Ksharp\tz\1122.dbf

proc sql;
create table final_selection as
select path
from have
where fname in (select fname from list) and
  id in (select id from list);
quit;

proc print data=final_selection; run;

                               Obs                 path

                                1     D:\ex\sas_ex\Ksharp\bg\1122.dbf
                                2     D:\ex\sas_ex\Ksharp\bg\1620.dbf
                                3     D:\ex\sas_ex\Ksharp\bg\1660.dbf
                                4     D:\ex\sas_ex\Ksharp\tr\0240.dbf
                                5     D:\ex\sas_ex\Ksharp\tr\0280.dbf
                                6     D:\ex\sas_ex\Ksharp\tr\0560.dbf
                                7     D:\ex\sas_ex\Ksharp\tz\0520.dbf
                                8     D:\ex\sas_ex\Ksharp\tz\1122.dbf


I still see the problem here. From the list, fname='1122' is of 'bg'. In the final_selection data set, look at the obs=8: D:\ex\sas_ex\Ksharp\tz\1122.dbf. The whole point of this project is not to include the obs=8. But I still see the '1122.dbf' again from the 'tz' folder!

Am I missing something here? Thanks.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 21 replies
  • 565 views
  • 6 likes
  • 6 in conversation