<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Read in CSV from folder with newest date in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709419#M218139</link>
    <description>&lt;P&gt;Just test if the beginning of the name matches that pattern.&amp;nbsp; For example you could only write the observation if the filename matches.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;if files =: 'TestFile_' then output;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Watch out for the case of the letters used in the filenames. SAS string comparisons are case sensitive, and on UNIX filesystems TestFile, TESTFILE and testfile are three different filenames.&lt;/P&gt;</description>
    <pubDate>Tue, 05 Jan 2021 14:11:17 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2021-01-05T14:11:17Z</dc:date>
    <item>
      <title>Read in CSV from folder with newest date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709250#M218059</link>
      <description>&lt;P&gt;Hi Everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've seen some posts about reading in txt files and excel files from a folder based on a date, so this may seem like a repeat, however, I'm not familiar enough with it to adapt to my current requirements, so I'm hoping to get a solid example that's applicable.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm looking for a data step that can scan a specific folder and pull in the csv with the newest date in the file name.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've adapted my file names to read TestFile_20201230 as I've read on other threads that YYYYMMDD is the best approach (I can change this if someone has a better suggestion).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Example file path could be 'stage/documents/testreports/TestFile_20201230.csv'&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any and all help would be greatly appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 04 Jan 2021 17:16:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709250#M218059</guid>
      <dc:creator>BlayLay</dc:creator>
      <dc:date>2021-01-04T17:16:34Z</dc:date>
    </item>
    <item>
      <title>Re: Read in CSV from folder with newest date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709252#M218061</link>
      <description>&lt;P&gt;This is where you the programmer interacts with the operating system&amp;nbsp; So this is for a UNIX shell (IDK which).&amp;nbsp; I assume by your slash orientation that you are using UNIX, am I correct?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Taking from&amp;nbsp;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Reading-filenames-date-and-time-from-a-directory/m-p/452297" target="_blank" rel="noopener"&gt;Reading filenames, date and time from a directory - SAS Support Communities&lt;/A&gt;.&amp;nbsp; I don't use UNIX, so I can't verify that this works on my system.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;%let subdir =/stage/documents/testreports/;
filename dir pipe "ls -1 &amp;amp;subdir | grep '\.csv$'";

data Pfile;
  length filename $ 300 size $ 20 time $ 20;
  infile dir truncover expandtabs ;
  input filename size time$;
run;&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But basically you are reading a text file that is the output of a UNIX command.&amp;nbsp; If one wants the file date of certain files, one uses the UNIX command LS.&amp;nbsp; The GREP command then filters that output down to the lines matching a pattern, with the aim of keeping only the ".csv" files.&amp;nbsp; The same filtering could be done with a WHERE statement, but that could get messy in SAS.&amp;nbsp; GREP is our friend here.&lt;/P&gt;
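&lt;P&gt;For comparison, here is a rough sketch of doing the same filtering inside SAS rather than with GREP. It assumes the directory from the example above; the fileref DIRLIST and the dataset name CSVFILES are made up for illustration, and I have not run this on a UNIX system.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical fileref: list every name in the directory, one per line */
filename dirlist pipe "ls -1 /stage/documents/testreports/";

data csvfiles;
  length filename $ 300;
  infile dirlist truncover;
  input filename $300.;  /* read the whole line, embedded blanks included */
  /* subsetting IF: keep only names that end in .csv, ignoring case */
  if prxmatch('/\.csv$/i', trim(filename));
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The subsetting IF does the job that GREP does in the pipe, at the cost of reading every name into the data step first.&lt;/P&gt;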
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once you read the text file and assign the columns of text to variables in your data step, you need to select your filename and find the maximum date.&amp;nbsp; Since your file names contain dates, you can add a date variable.&amp;nbsp; You need to check whether my guess of "@10" in the INPUT statement works for your file names, and change it if it does not.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data Pfile;
  length filename $ 300 size $ 20 time $ 20;
  infile dir truncover expandtabs ;
  input filename 
        @10 date yymmdd8.;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 04 Jan 2021 17:44:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709252#M218061</guid>
      <dc:creator>PhilC</dc:creator>
      <dc:date>2021-01-04T17:44:54Z</dc:date>
    </item>
    <item>
      <title>Re: Read in CSV from folder with newest date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709267#M218069</link>
      <description>&lt;P&gt;Do it in steps.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;First figure out how to get the list of filenames.&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;If your SAS session allows you to run operating system commands, then use the PIPE engine to read the output of the ls command.&amp;nbsp; Note that on Unix a path that does not start with the root node (that is, a slash) is considered relative to the current working directory, so for this example I have added the leading slash to your path.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data files ;
  infile "ls /stage/documents/testreports/TestFile_*.csv" pipe truncover;
  input filename $256. ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Otherwise, look in the SAS documentation for the method of using the DOPEN() and DREAD() functions to get the list of files in a directory.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Then you can figure out how to parse out the DATE from the name.&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;So if the names always look exactly like your example, simple SCAN() and INPUT() function calls should do that.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data file_dates;
  set files;
  date = input(scan(filename,-2,'._'),yymmdd10.);
  format date yymmdd10.;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Note that if the names always use the same prefix and suffix then you can skip extracting the DATE and just sort by the NAME. That is the advantage of using YMD order for the date strings.&lt;/P&gt;
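&lt;P&gt;A minimal sketch of that shortcut, assuming the FILES dataset from the first step and that every name really does share the same prefix and suffix (FILES_BY_NAME is just an illustrative name):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* With YYYYMMDD in the name, descending name order puts the newest file first */
proc sort data=files out=files_by_name;
  by descending filename;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The first observation of FILES_BY_NAME would then be the newest file.&lt;/P&gt;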
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And then how to find the latest date.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=file_dates;
  by descending date ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Once you have done that then you can use that value to drive reading the file.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
  if _n_=1 then set file_dates(obs=1);
   infile csv filevar=filename dsd truncover firstobs=2;
  input var1 var2 .... ;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 04 Jan 2021 18:30:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709267#M218069</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-01-04T18:30:25Z</dc:date>
    </item>
    <item>
      <title>Re: Read in CSV from folder with newest date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709410#M218135</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I was able to build a list of filenames from the directory, using the below code&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data file(keep=files);
     rc=filename("mydir","/stage/documents/testreports");
     did = dopen("mydir");
     if did &amp;gt; 0 then
        do i = 1 to dnum(did);
        files=dread(did,i);
        output;
      end;
    rc=dclose(did);
run;&lt;/PRE&gt;&lt;P&gt;and then followed your remaining steps to parse out and find the latest date. The only remaining question I have is how to only open the file IF the prefix is TestFile_&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jan 2021 13:13:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709410#M218135</guid>
      <dc:creator>BlayLay</dc:creator>
      <dc:date>2021-01-05T13:13:55Z</dc:date>
    </item>
    <item>
      <title>Re: Read in CSV from folder with newest date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709419#M218139</link>
      <description>&lt;P&gt;Just test if the beginning of the name matches that pattern.&amp;nbsp; For example you could only write the observation if the filename matches.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;if files =: 'TestFile_' then output;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Watch out for the case of the letters used in the filenames. SAS string comparisons are case sensitive, and on UNIX filesystems TestFile, TESTFILE and testfile are three different filenames.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jan 2021 14:11:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709419#M218139</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-01-05T14:11:17Z</dc:date>
    </item>
    <item>
      <title>Re: Read in CSV from folder with newest date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709420#M218140</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/349857"&gt;@BlayLay&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I was able to build a list of filenames from the directory, using the below code&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data file(keep=files);
     rc=filename("mydir","/stage/documents/testreports");
     did = dopen("mydir");
     if did &amp;gt; 0 then
        do i = 1 to dnum(did);
        files=dread(did,i);
        output;
      end;
    rc=dclose(did);
run;&lt;/PRE&gt;
&lt;P&gt;and then followed your remaining steps to parse out and find the latest date. The only remaining question I have is how to only open the file IF the prefix is TestFile_&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Check the filename before writing it to the dataset.&lt;/P&gt;
&lt;P&gt;Maybe like this:&lt;/P&gt;
&lt;PRE&gt;data file(keep=files);
  rc = filename("mydir", "/stage/documents/testreports");
  did = dopen("mydir");
  
  if did &amp;gt; 0 then do;
    do i = 1 to dnum(did);
      files = dread(did, i);

      if files =: 'TestFile_' then output;
    end;
  end;
  
  rc = dclose(did);
run;&lt;/PRE&gt;</description>
      <pubDate>Tue, 05 Jan 2021 14:06:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Read-in-CSV-from-folder-with-newest-date/m-p/709420#M218140</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2021-01-05T14:06:35Z</dc:date>
    </item>
  </channel>
</rss>

