DATA Step, Macro, Functions and more

Proc Import

Reply
Occasional Contributor
Posts: 5

Proc Import

Hi All,

 

I need your help on below question.

 

I am getting input file in excel format where File name is Like BankData123456 where the last six digit is changing with every new file. How will I create a macro to import the latest file from the particular path.

 

Please help me 

 

Regards

Ravi

PROC Star
Posts: 307

Re: Proc Import

What dictates the pattern of the suffix, if anything?

PROC Star
Posts: 307

Re: Proc Import

Posted in reply to collinelliot

That is, will the latest file be the highest number, or are you going to need to use the modified date or some other criterion to identify what counts as the latest?

Occasional Contributor
Posts: 5

Re: Proc Import

Posted in reply to collinelliot

Thank you for your reply. Every time file number is getting changes and the only way to identify the latest file is to see the system date and time.

Super User
Super User
Posts: 7,060

Re: Proc Import

[ Edited ]

In that case you could just use the operating system to order the files.

data _null_;
  infile "dir ""&path\BankData*.xlsx"" /b/o-d" pipe ;
  input ;
  if _n_=1 then call symputx('lastfile',_infile_);
run;
Super User
Super User
Posts: 7,060

Re: Proc Import

[ Edited ]

You will need to read the list of file names.  That is not hard and has been post about many times on this forum.  Since you didn't supply details let's assume that SAS is running on Windows so that you can use the DIR command.

%let path=\\server\sharename\directory_name ;
data files ;
  infile "dir /b ""&path\BankData*.xlsx""" pipe truncover ;
  input filename $255. ;
  length basename $255 ;
  basename = scan(filename,-2,'\.');
  number = input(substr(basename,9),32.);
run;

Then once you have the list of file names and the NUMBER associated with them you can then just find the last one.

%let lastname= ;
proc sql noprint;
  select filename into :lastname trimmed 
  from files
  having number = max(number)
  ;
quit;

Which you can then use in whatever code you use to conver the Excel file into a SAS dataset.

libname in xlsx "&lastname";
proc copy inlib=in outlib=work;
run;
PROC Star
Posts: 307

Re: Proc Import

Given that the file date determines the latest, you'll need to modify the solution a bit. I think "/b" is going give you just the file name, so I think if you remove it you'll get the dates for the files. This means, however, that you'll need an input statement that will read that date with the appropriate informat to get a SAS date value. Then just that use date in the place of @Tom's "number" in his code. I'm not on a computer with SAS right now, but the modification should be straightforward.

Valued Guide
Posts: 505

Re: Proc Import

Posted in reply to collinelliot
Importing the most recent excel workbook in a directory

I have a unix version but if I remember, the solution has SAS call Perl.
Dates have mutiple formates in unis 'ls -l'

inspred by
https://goo.gl/koClI1
https://communities.sas.com/t5/Base-SAS-Programming/Proc-Import/m-p/348515

If you find the code below too complex just extract the datasteps and
remove the DOSUBLs and the parent dataset

HAVE  (directory of d:/xls has following files)
====

4/09/2017   01:41 PM             7,827 BankData123400.xlsx  ==> most recent

04/09/2017  12:37 PM            47,673 BankData123456.xlsx
04/09/2017  11:20 AM             8,143 BankData123478.xlsx
04/09/2017  11:22 AM             8,144 BankData123488.xlsx

Here is what is in the most recent workbook

d:/xls/BankData123400.xlsx

 +------+-----------+
 |      |    A      |
 +------+-----------+
 |      |           |
 |    1 |  ELIGIBLE |
 |    2 |   TRUE    |
 |    3 |   FALSE   |
 |    4 |   TRUE    |
 |    5 |   FALSE   |
 |    6 |   TRUE    |
 | ...  |   ...     |
 +------------------+

 [SHEET1]

WANT (dataset work.BankData123400)

d:\xls\BankData123400.xlsx   09APR2017:13:41:00

Up to 40 obs from BankData123400 total obs=5

Obs    ELIGIBLE

 1       TRUE
 2       FALS
 3       TRUE
 4       FALS
 5       TRUE


WORKING CODE
============

     libname xel '&pth.&names.';
     data %scan(&names,1,'.');
         set xel.'sheet1$'n;
     run;quit;

FULL SOLUTION
=============


%symdel creation names pth;
proc datasets lib=work kill;
run;quit;
libname xel clear;

%let pth=d:\xls\;

data _null_;

  if _n_=0 then do;
     rc=%sysfunc(dosubl('
         filename pipes pipe "DIR /TW /Q /S &pth.";
         data __dir(where=(size>0 and index(name,'.xls')>0));
            infile pipes truncover end=eof;
            do until(eof);
               input @;
               if index(_infile_,"Directory of")>0 then input @ "Directory of " dir $128.;
               else do;
                  input creation ?? mdyampm18. @;
                  if not missing(creation) then do;
                     input @26 size:??comma16.  @40 owner:$32. @63 name:$64.;
                     output;
                  end;
                  else input;
               end;
            end;
            stop;
         run;quit;
         data _null_;
           length nam $64;
           retain dtemke . nam;
           do until (dne);
              set __dir end=dne;
              if creation > dtemke then do;
                   put _all_;
                   dtemke=creation;
                   nam   = name;
              end;
           end;
           call symputx("creation",put(dtemke,datetime23.));
           call symputx("names",nam);
           stop;
         run;quit;
      '));
  end;

  put "&pth &creation &names ";

  /*
     d:\xls\BankData123400.xlsx   09APR2017:13:41:00
  */

  rc=dosubl("
     libname xel '&pth.&names.';
     data %scan(&names,1,'.');
         set xel.'sheet1$'n;
     run;quit;
  ");

run;quit;

Libref         WORK
Engine         V9
Physical Name  e:\saswork\wrk\_TD4456_BEAST_
Filename       e:\saswork\wrk\_TD4456_BEAST_


                   Member   Obs, Entries                  File
#  Name            Type      or Indexes   Vars  Label     Size  Last Modified

1  BANKDATA123400  DATA           5        1            131072  04/09/2017 13:51:13
2  SASGOPT         CATALOG        0                       5120  04/09/2017 12:24:01
3  SASMACR         CATALOG       63                       5120  04/09/2017 12:20:41
4  __DIR           DATA           4        5            131072  04/09/2017 13:51:13
NOTE: Deleting WORK.BANKDATA123400 (memtype=DATA).
NOTE: Deleting WORK.SASGOPT (memtype=CATALOG).
NOTE: File WORK.SASGOPT (memtype=CATALOG) cannot be deleted because it is in use.
NOTE: Deleting WORK.SASMACR (memtype=CATALOG).
NOTE: File WORK.SASMACR (memtype=CATALOG) cannot be deleted because it is in use.
NOTE: Deleting WORK.__DIR (memtype=DATA).
2041  run;

2041!     quit;

NOTE: PROCEDURE DATASETS used (Total process time):
      real time           0.02 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              302.46k
      OS Memory           14316.00k
      Timestamp           04/09/2017 01:52:11 PM
      Step Count                        100  Switch Count  0


2042  libname xel clear;
WARNING: Libref XEL is not assigned.
2043  %let pth=d:\xls\;
2044  data _null_;
2045    if _n_=0 then do;
2046       rc=%sysfunc(dosubl('
2047           filename pipes pipe "DIR /TW /Q /S &pth.";
2048           data __dir(where=(size>0 and index(name,'.xls')>0));
2049              infile pipes truncover end=eof;
2050              do until(eof);
2051                 input @;
2052                 if index(_infile_,"Directory of")>0 then input @ "Directory of " dir $128.;
2053                 else do;
2054                    input creation ?? mdyampm18. @;
2055                    if not missing(creation) then do;
2056                       input @26 size:??comma16.  @40 owner:$32. @63 name:$64.;
2057                       output;
2058                    end;
2059                    else input;
2060                 end;
2061              end;
2062              stop;
2063           run;quit;
2064           data _null_;
2065             length nam $64;
2066             retain dtemke . nam;
2067             do until (dne);
2068                set __dir end=dne;
2069                if creation > dtemke then do;
2070                     put _all_;
2071                     dtemke=creation;
2072                     nam   = name;
2073                end;
2074             end;
SYMBOLGEN:  Macro variable PTH resolves to d:\xls\
NOTE: The infile PIPES is:
      Unnamed Pipe Access Device,
      PROCESS=DIR /TW /Q /S d:\xls\,RECFM=V,
      LRECL=384

NOTE: 16 records were read from the infile PIPES.
      The minimum record length was 0.
      The maximum record length was 81.
NOTE: The data set WORK.__DIR has 4 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      user cpu time       0.00 seconds
      system cpu time     0.01 seconds
      memory              874.25k
      OS Memory           14316.00k
      Timestamp           04/09/2017 01:52:12 PM
      Step Count                        101  Switch Count  0


NAM=  DTEMKE=. DNE=0 DIR=d:\xls CREATION=1807364460 SIZE=7827 OWNER=BEAST\beast
NAME=BankData123400.xlsx _ERROR_=0 _N_=1
NOTE: There were 4 observations read from the data set WORK.__DIR.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              874.25k
      OS Memory           14316.00k
      Timestamp           04/09/2017 01:52:12 PM
      Step Count                        101  Switch Count  0


2075             call symputx("creation",put(dtemke,datetime23.));
2076             call symputx("names",nam);
2077             stop;
2078           run;quit;
2079        '));
2080    end;
SYMBOLGEN:  Macro variable PTH resolves to d:\xls\
SYMBOLGEN:  Macro variable CREATION resolves to 09APR2017:13:41:00
SYMBOLGEN:  Macro variable NAMES resolves to BankData123400.xlsx
2081    put "&pth &creation &names ";
2082    /*
2083       d:\xls\BankData123400.xlsx   09APR2017:13:41:00
2084    */
2085    rc=dosubl("
2086       libname xel '&pth.&names.';
SYMBOLGEN:  Macro variable PTH resolves to d:\xls\
SYMBOLGEN:  Macro variable NAMES resolves to BankData123400.xlsx
2087       data %scan(&names,1,'.');
SYMBOLGEN:  Macro variable NAMES resolves to BankData123400.xlsx
2088           set xel.'sheet1$'n;
2089       run;quit;
2090    ");
2091  run;

d:\xls\ 09APR2017:13:41:00 BankData123400.xlsx
NOTE: Libref XEL was successfully assigned as follows:
      Engine:        EXCEL
      Physical Name: d:\xls\BankData123400.xlsx
NOTE: There were 5 observations read from the data set XEL.'sheet1$'n.
NOTE: The data set WORK.BANKDATA123400 has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              1067.18k
      OS Memory           14316.00k
      Timestamp           04/09/2017 01:52:12 PM
      Step Count                        101  Switch Count  0


NOTE: DATA statement used (Total process time):
      real time           0.40 seconds
      user cpu time       0.09 seconds
      system cpu time     0.20 seconds
      memory              1067.18k
      OS Memory           14316.00k
      Timestamp           04/09/2017 01:52:12 PM
      Step Count                        101  Switch Count  3

2091!     quit;

SYMBOLGEN:  Macro variable PGM resolves to utl_select_most_recent_file
SYMBOLGEN:  Macro variable PGM resolves to utl_select_most_recent_file
SYMBOLGEN:  Macro variable _Q resolves to 44441
SYMBOLGEN:  Macro variable _Q resolves to 44441



Ask a Question
Discussion stats
  • 7 replies
  • 238 views
  • 1 like
  • 4 in conversation