DATA Step, Macro, Functions and more

Creating a data set using files with the same name in many directories

Accepted Solution Solved
Reply
Contributor
Posts: 37
Accepted Solution

Creating a data set using files with the same name in many directories

Hello,

I have a data set, call it myds.sas7bdat . I have a two-level directory structure /location/date , and each location/date subdirectory (e.g. ACY/Dec 30) has a myds data set in it. I would like to merge all these data sets. In addition, since there's no location info in the files, I'd like to add a location variable during the merge. Can anyone suggest a way to do this? Macro or data step programming is fine.

Thanks much, Bruce


Accepted Solutions
Solution
‎02-09-2015 08:20 PM
Super User
Super User
Posts: 7,039

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

You can ask DOS to generate the list of file names for you.

You can reference the dataset by using the file's name in quotes in a SET statement.

You can use the INDSNAME option to make the name available to your data step so that you can parse out the directory names.

%let path=c:\downloads ;

data files ;

  infile "dir /b/s &path\*.sas7bdat" pipe truncover ;

  input fname $256.;

run;

proc sql noprint ;

  select quote(trim(fname)) into :flist separated by ' '

    from files

  ;

quit;

data want ;

   length dsname $256 folder memname $32 ;

   set &flist indsname=dsname ;

   folder = scan(dsname,-2,'\.') ;

   memname = scan(dsname,-1,'\.');

run;

View solution in original post


All Replies
PROC Star
Posts: 7,471

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

What operating system are you on and which version of SAS are you using?

Contributor
Posts: 37

Re: Creating a data set using files with the same name in many directories

Windows 7 and SAS 9.2 . thanks.

Super User
Posts: 11,343

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

First, confirm that these are all SAS data sets.

PROC Star
Posts: 7,471

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

Bruce,

I dont have SAS 9.2, thus can only provide a partial solution. The following will get all of the files.

You will likely have to create and apply a format to convert the libname paths to their physical names, as well as strip the dsn the get the paths and apply the format to them.

x "cd c:\";

x "dir dec30.sas7bdat /s /b > c:\temp\temp.txt";

data temp (keep=lname path);

    infile "c:\temp\temp.txt";

    length lname $8;

    length path $150;

    input;

    x=find(_infile_,"\",-200);

    lname=catt('xx',_n_);

    path=substr(_infile_,1,x-1);

    ecmd=cat("libname ",strip(lname),'"',strip(path),'";');

    call execute(ecmd);

run;

proc sql noprint;

  select lname||".dec30"

    into :dpaths

      separated by " "

        from temp

  ;

quit;

data want;

  set &dpaths. indsname=dsn;

run;

Contributor
Posts: 37

Re: Creating a data set using files with the same name in many directories

Hi, Arthur.

Thanks for your suggestion. Sorry I did not respond earlier, but I did not see it. I think I can use part of your code, especially the simple use of DOS commands to create a file list and the SAS code beginning with ecmd. Can you please tell me what that line and the next line do? Ultimately, I want to merge a bunch of datasets with the same name.

I'll post a solution once I have one.

Thanks again, Bruce

Solution
‎02-09-2015 08:20 PM
Super User
Super User
Posts: 7,039

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

You can ask DOS to generate the list of file names for you.

You can reference the dataset by using the file's name in quotes in a SET statement.

You can use the INDSNAME option to make the name available to your data step so that you can parse out the directory names.

%let path=c:\downloads ;

data files ;

  infile "dir /b/s &path\*.sas7bdat" pipe truncover ;

  input fname $256.;

run;

proc sql noprint ;

  select quote(trim(fname)) into :flist separated by ' '

    from files

  ;

quit;

data want ;

   length dsname $256 folder memname $32 ;

   set &flist indsname=dsname ;

   folder = scan(dsname,-2,'\.') ;

   memname = scan(dsname,-1,'\.');

run;

Contributor
Posts: 37

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

Tom,

Thanks for the suggestion. When I run the first part of the code (data files ...), I get an error message: ERROR: Insufficient authorization to access PIPE.

Any idea how to correct this?

Thanks again, Bruce

Super User
Super User
Posts: 7,039

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

Your site admins have turned off the ability to call operating system commands from within SAS.

You could run the DOS command in another window and save the output and then just read the output as a simple text file.

Or you could use SAS functions to open the directory and retrieve the file names.

For example look at this old thread. 

Also look at this paper that has a nice program for recursively reading the names of the files in a directory.

http://www.wuss.org/proceedings12/55.pdf

Contributor
Posts: 37

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

Tom,

Thanks a lot for the suggestions. I'll try one (or more) and report here if one works. I think I like the paper.

Bruce

PROC Star
Posts: 7,471

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

Bruce: If you run the first two lines of code from your window's command prompt, it will create the list of files:

cd c:\

dir dec30.sas7bdat /s /b > c:\temp\temp.txt

The following datastep will create all of the libnames:

data temp (keep=lname path);

    infile "c:\temp\temp.txt";

    length lname $8;

    length path $150;

    input;

    x=find(_infile_,"\",-200);

    lname=catt('xx',_n_);

    path=substr(_infile_,1,x-1);

    ecmd=cat("libname ",strip(lname),'"',strip(path),'";');

    call execute(ecmd);

run;

The following proc sql call will create a macro variable that will contain the string that can be used to set all of the files:

proc sql noprint;

  select lname||".dec30"

    into :dpaths

      separated by " "

        from temp

  ;

quit;

Finally, the following step will concatenate all of the files together. The only part I can't help you with, since I don't have 9.2, is regarding what will be returned with the indsname option.

data want;

  set &dpaths. indsname=dsn;

run;

Contributor
Posts: 37

Re: Creating a data set using files with the same name in many directories

Arthur,

Thanks so much for sharing your knowledge and time. As soon as I've had a chance to run the code, I'll flag your response with Correct Answer. Looking forward to using this.

thanks again, Bruce

Respected Advisor
Posts: 3,799

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

Take another look at the suggestion from  dated Feb 9, 2015 7:22 PM

Contributor
Posts: 37

Re: Creating a data set using files with the same name in many directories

Posted in reply to data_null__

Hi,

I saw that solution and tried it. The PIPE command (or option) is disabled on my system, unfortunately, and it may not be simple to enable it. I think elements of that solution overlap with others, so I hope to use parts of it.

Thanks, Bruce

Respected Advisor
Posts: 3,799

Re: Creating a data set using files with the same name in many directories

Posted in reply to brucehughw

:smileyshocked:

In the reply to your complaint about PIPE told you how to fix it.  You have to create the path list as a file before you run SAS.  This nice part of using the path as the "data set name" is that INDSNAME then contains the path so you can create the origin variable you need directly.  No need for all those macro variables and LIBREFS.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 16 replies
  • 567 views
  • 6 likes
  • 5 in conversation