
Get the name of the file an observation came from when the filename has multiple files

Contributor
Posts: 25

I have a FILENAME statement that lists multiple files, like this example:

 

     filename filesin ("%CMPRES(&folder\First File.txt)" "%CMPRES(&folder\Second File.txt)" "%CMPRES(&folder\Third File.txt)");

 

Then my DATA step:

 

     data allset;

         infile filesin;

              input @1 intext $200.;

     run;

 

The question is: how can I set a variable in the DATA step that contains the name of the input file each observation came from?

 

* I found I needed the %CMPRES to allow Windows to get the correct name.

 

 

Grand Advisor
Posts: 17,396

Re: Get the name of the file an observation came from when the filename has multiple files

Have you considered using the FILEVAR= and/or FILENAME= options of the INFILE statement instead? You'd put the list of file names into a data set and pass that variable to the FILEVAR= option; the file name could then be stored on the output data set.
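As a minimal sketch of the FILENAME= route, assuming the multi-file fileref FILESIN from the question is already defined: the variable names FNAME and FILE_NAME and the $260 length are illustrative choices, and TRUNCOVER is an added assumption so that lines shorter than 200 characters are not joined with the following line.

```sas
/* Sketch only: FILESIN is the multi-file fileref defined in the question. */
data allset;
  /* FNAME is set by the FILENAME= option to the physical name of the   */
  /* file currently being read. It is not written to the output data    */
  /* set, so it is copied into FILE_NAME (both names are hypothetical). */
  length fname file_name $260;
  infile filesin filename=fname truncover;
  input @1 intext $200.;
  file_name = fname;
run;
```

This keeps the original FILENAME statement unchanged; the copy into FILE_NAME is what makes the name visible on each observation.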


Contributor
Posts: 25

Re: Get the name of the file an observation came from when the filename has multiple files

Correct me if I'm wrong, but doesn't processing stop at the end of the smallest input file? Some of mine are long and some are short; they are all different sizes.

Grand Advisor
Posts: 17,396

Re: Get the name of the file an observation came from when the filename has multiple files

You're wrong :)

 

PS. I was trying to find an example, not just trying to be snarky ;)

Contributor
Posts: 25

Re: Get the name of the file an observation came from when the filename has multiple files

I'm OK with being wrong. But I don't see how to do it.

Grand Advisor
Posts: 17,396

Re: Get the name of the file an observation came from when the filename has multiple files

Here's an example from the SAS Knowledge Base.

http://support.sas.com/kb/24/712.html

Add this line after the INFILE statement to keep the file name. Note that every record in every file is read; processing does not stop at the length of the shortest file.

file_name=fil2read;

 

EDIT: Full code here, tested on SAS 9.3

/* Create external file EXTFILE1 */  
 
data _null_;
  file 'c:\_localdata\temp\extfile1.txt';
  put "05JAN2001 6 W12301 1.59 9.54";                                           
  put "12JAN2001 3 P01219 2.99 8.97";                                            
  put "16JAN2001 1 A00101 3.00 3.00";                                            
  put "19JAN2001 3 A00101 3.00 9.00";                                            
  put "24JAN2001 2 B90035 2.59 5.18";                                            
run;                                                                        
                                                                      
                                                                        
/* Create external file EXTFILE2 */   

data _null_;
  file 'c:\_localdata\temp\extfile2.txt'; 
  put "02FEB2001 1 P01219 2.99 2.99";                                        
  put "05FEB2001 3 A00901 1.99 5.97";                                        
  put "07FEB2001 2 C21135 3.00 6.00";                                        
  put "14FEB2001 7 B90035 2.59 18.13";                                       
  put "20FEB2001 6 A00901 1.99 11.94";                                       
  put "27FEB2001 1 W12301 1.59 1.59";                                        
  put "27FEB2001 2 C00300 1.00 2.00";                                        
  put "28FEB2001 2 B90035 2.59 5.18";
run; 

/*  Create the external file EXTFILE3 */  
                                                                    
data _null_;
  file 'c:\_localdata\temp\extfile3.txt';
  put "06MAR2001 4 A00101 3.59 14.36";                                      
  put "12MAR2001 2 P01219 2.99 5.98";                                        
  put "13MAR2001 2 A00101 3.00 6.00";                                        
  put "16MAR2001 3 B90035 2.59 7.77";                                        
  put "16MAR2001 1 W99201 5.50 5.50";                                       
  put "21MAR2001 3 C30660 2.00 6.00";                                        
  put "29MAR2001 5 A00901 1.99 9.95";                                        
run;  

/* Path to files to be read are in the DATALINES.               */
/* Each file is read in turn with the same INPUT statement.     */
/* The END= variable is set to 1 each time the DATA step        */
/* comes to the end of a file.                                  */
/*                                                              */
/* Read the name of the file to be read from the DATALINES and  */
/* store it in FIL2READ.  The file is then read in the DO WHILE */
/* loop.  At the end of the file, the DO loop ends, control     */
/* passes back to the top of DATA step and the process starts   */
/* over again until all files have been read.                   */
/*                                                              */
/* The argument "dummy" in the INFILE statement is a place-     */
/* holder used in place of a file reference.                    */
           
data one;
  infile datalines;

  /* Ensure fully qualified path will fit in FIL2READ */
  length fil2read $40;

  /* Input path of file to be read from DATALINES */
  input fil2read $;

  infile dummy filevar=fil2read end=done;
  file_name=fil2read;
  do while(not done);

    /* Input statement for files to be read */
    input @1 date date9. @11 quantity item $ price totcost;
    output;
  end;      
datalines;
c:\_localdata\temp\extfile1.txt
c:\_localdata\temp\extfile2.txt
c:\_localdata\temp\extfile3.txt
;          
            
proc print data=one;
run;
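The earlier suggestion of keeping the file names in a data set can be sketched the same way. This is an illustration under assumptions, not part of the Knowledge Base sample: the data set name FILE_LIST is hypothetical, and it reuses the three test files created above.

```sas
/* Hypothetical data set FILE_LIST: one file path per observation. */
data file_list;
  length fil2read $40;
  infile datalines truncover;
  input fil2read $char40.;
datalines;
c:\_localdata\temp\extfile1.txt
c:\_localdata\temp\extfile2.txt
c:\_localdata\temp\extfile3.txt
;
run;

data one;
  /* Each observation of FILE_LIST supplies the next file to open. */
  set file_list;
  /* Copy the path: the FILEVAR= variable itself is not written out. */
  file_name = fil2read;
  infile dummy filevar=fil2read end=done;
  /* Read every record of the current file before moving on. */
  do while(not done);
    input @1 date date9. @11 quantity item $ price totcost;
    output;
  end;
run;
```

The DATA step ends when SET runs out of observations in FILE_LIST, so each listed file is read in full regardless of its length.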

 

Contributor
Posts: 25

Re: Get the name of the file an observation came from when the filename has multiple files

A complication is that the file names have spaces in them. So, for example, one of them is

 

      Empty Model Prelim

 

Should I enclose the names in quotes (") to start with, or how do I handle that?

I tried surrounding the names with quotes, but that didn't work. Other ideas?

Grand Advisor
Posts: 17,396

Re: Get the name of the file an observation came from when the filename has multiple files

You need to change the input statement for the file names then. This version reads 40 characters, including spaces:

input fil2read $40.;

Contributor
Posts: 25

Re: Get the name of the file an observation came from when the filename has multiple files

 

I got it to work like you see below. I get a lost card note in the log which I can't yet explain but I'm looking at it now.

If anyone sees a cleaner solution please let me know. And thank you everyone for the input.


%let folder = \\kcpublic\PUBLIC\Public\SAS\Costing Run Logs\Costing Run Logs 09-10\ ;
filename indsn "&folder";
data one;
  infile datalines truncover;
  length fil2read $150;
  input fil2read $char150.;
  fil2read = cats("&folder",fil2read);
  infile dummy filevar = fil2read end = done;
  file_name = fil2read;
  do while(not done);
    input @1 intext $char100.;
    output;
  end;
datalines;
Empty Model Prelim Project Log.txt
Empty Model Previous Project Log.txt
Profman Containers to Cars Marketing Prelim Ordered List Project Log.txt
Profman Containers to Cars Marketing Previous Ordered List Project Log.txt
Profman Containers to Cars Prelim Ordered List Project Log.txt
Profman Containers to Cars Previous Ordered List Project Log.txt
Profman Output Marketing Prelim Data to SAS Prm Stg Project Log.txt
Profman Output to SAS PRM STG Prelim Project Log.txt
Profman Output to SAS Prm Stg Previous Project Log.txt
;
run;
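One possible cause of the lost card note, offered as a guess rather than a diagnosis: the second INFILE statement has no TRUNCOVER, so when a log line is shorter than 100 characters the INPUT statement moves on to the next line to fill INTEXT, and if that happens on the last line of a file SAS reports a lost card. A minimal sketch of the same step with TRUNCOVER added, shortened to two of the files for illustration:

```sas
%let folder = \\kcpublic\PUBLIC\Public\SAS\Costing Run Logs\Costing Run Logs 09-10\ ;

data one;
  infile datalines truncover;
  length fil2read file_name $150;
  input fil2read $char150.;
  fil2read = cats("&folder", fil2read);
  /* Assumption: TRUNCOVER keeps short lines from flowing onto the */
  /* next record, which is the suspected source of the note.       */
  infile dummy filevar=fil2read end=done truncover;
  file_name = fil2read;
  do while(not done);
    input @1 intext $char100.;
    output;
  end;
datalines;
Empty Model Prelim Project Log.txt
Empty Model Previous Project Log.txt
;
run;
```

If the guess is right, the output should also gain observations, because short lines are no longer merged with the lines that follow them.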

Respected Advisor
Posts: 3,124

Re: Get the name of the file an observation came from when the filename has multiple files

Some answers from this thread may be relevant to you. @Reeza's suggestion was also mentioned there.

https://communities.sas.com/t5/Base-SAS-Programming/txt-files-I-need-to-pattern-match-contents-on-wi...
