BookmarkSubscribeRSS Feed
Coding4you
Obsidian | Level 7

Hi,

 

I'm trying to write a SAS code where I can count how many files exist in that folder. However, I'm struggling to execute the logic behind the code. In the scenario the output name =" dps_cais ". the output name is base on  different names, which is why it's in a macro variable. I'm trying to work out a solution to flag up as fail if it does equal the filename (see the code below) which should be always be 3 per output_name. 

 

data dps_file_count;
set WORK.FILENAME;
WHERE filename  like "%DPSCAIS%" ;

data dps_file_count;
set DPS_FILE_COUNT;
WHERE filename NOT like "%Rossen%";


/*Creating a output name loop unitl it matches the filename */
data check_filenames;
set DPS_FILE_COUNT;
flag = filename;
do until (flag = filename or eof);
set DPS_FILE_COUNT end = eof;
if filename = "&output_name._DPSCAIS_Interactive_Workbook_&date..xlsm"
or filename = "&output_name. DPSCAIS append_&date..csv"
or filename = "&output_name._DPSCAIS_Summary_Report_&date..xlsx" then flag = 1;
end;
run;


/*check what files is missing*/
PROC SQL;
create table file_compare as 
SELECT b.filename from check_filenames b
where filename not in (select filename from dps_file_count a where a.filename = b.filename);
quit;

PROC SQL;
create table COUNT_FILES AS SELECT COUNT(filename) AS COUNTS  FROM file_compare
;

DATA CHECK_NUM_FILES;
SET file_compare;
IF COUNTS >= 1 THEN FILE_COUNT = "FAIL" ;ELSE FILE_COUNT = "PASS";
9 REPLIES 9
PaigeMiller
Diamond | Level 26

There are plenty of threads in this forum which show how to read the names of the files in a folder into a SAS data set. Then, the number of records in the resulting data set is what you want.

 

For most situations, you need to write the above code without macros and without macro variables FIRST, before trying to turn it into something the requires macros and requires macro variables. So, please show us code for one folder that works without macros and without macro variables, before trying to turn this into a more general piece of code.

--
Paige Miller
Coding4you
Obsidian | Level 7

these are the files I managed to read from the folder.

 

 

DPS_File_count table

Filename 
DPS_cais_DPSCAIS_Interactive_Workbook_20230629.xlsm
DPS_cais_DPSCAIS_Summary_Workbook_20230629.xlsm

 

DPS_Check_filenames table 

Filename  Filename  Flag 
DPS_cais_DPSCAIS_Interactive_Workbook_20230629.xlsm DPS_cais_DPSCAIS_Interactive_Workbook_20230629.xlsm 1
DPS_cais_DPSCAIS_Summary_Workbook_20230629.xlsm DPS_cais_DPSCAIS_Summary_Workbook_20230629.xlsm 1

 

File_compare table (blank)

   

 

 

In the loop part of the codes the filename DPS_cais_DPSCAIS_append_20230629.csv is missing in the table which meant that the result will flag as pass instead of fail if that make sense 

 

 

 

 

 

 

PaigeMiller
Diamond | Level 26

Okay, so first you talk about "count" the number of files. You said "I'm trying to write a SAS code where I can count how many files exist in that folder." 

 

Now you write about something that isn't about counting, and in fact I'm not sure what it is you have just explained or how it relates to counting files, but there is a flag being computed and I don't think that has been explained.

 

Furthermore, I asked to see code without macros and without macro variables that works on one folder and does what you want. I still would like to see that code before I feel comfortable moving forward. In fact, I still consider such code a mandatory first step to write more generic code with macros and with macro variables.

--
Paige Miller
Tom
Super User Tom
Super User

I do not follow what you are trying to do.

 

You seem to be saying you found 2 files, both of which are XLSM files where the only difference is one of the many "words" in the named.  Summary vs Interactive.

 

What is the source of the information that let's you know that there is a missing file with a CSV extension?  It looks like perhaps you want a CSV file with a similar name, only with the strings 'Summary_Workbook' or 'Interactive_Workbook' in the name changed to the string 'appended' instead.

 

Is that what you are trying to test?

 

Do you expect all of the files to follow those naming conventions of:

A_B_C_X_Y_D.xlsm

A_B_C_Z_D.csv

 

Where A is DPS, B is cais, C is DPSCAIS, X is Interactive or Summary, Y is Workbook and D is 20230629 and Z is appended?

 

Which parts of ABCDXYZ vary?

Coding4you
Obsidian | Level 7

So the purpose of the checks is to identify if the correct number of  excel files has been outputted in the folder.

The output name always changes depending on the input name. I’ve already built this macro variable in so that it scan the input name. For every input name e.g. A we expect to see three files outputted:

The only thing that changes is the macro variable output name and date . If there is two Input file with input name A&B we should see six files. 

"&output_name._DPSCAIS_Interactive_Workbook_&date..xlsm"
or filename = "&output_name. DPSCAIS append_&date..csv"
or filename = "&output_name._DPSCAIS_Summary_Report_&date..xlsx"

 I think I managed to find the solution where I can create a table base of all the output filenames at that particular time and compare it with the what’s in the folder. 

ballardw
Super User

@Coding4you wrote:

So the purpose of the checks is to identify if the correct number of  excel files has been outputted in the folder.

The output name always changes depending on the input name. I’ve already built this macro variable in so that it scan the input name. For every input name e.g. A we expect to see three files outputted:

The only thing that changes is the macro variable output name and date . If there is two Input file with input name A&B we should see six files. 

"&output_name._DPSCAIS_Interactive_Workbook_&date..xlsm"
or filename = "&output_name. DPSCAIS append_&date..csv"
or filename = "&output_name._DPSCAIS_Summary_Report_&date..xlsx"

 I think I managed to find the solution where I can create a table base of all the output filenames at that particular time and compare it with the what’s in the folder. 


Look very closely at that file name for the CSV. Do you see the space between the . and DPSCAIS? and before append. The = comparison means that unless the actual file name has that space then the code won't (and likely shouldn't spaces in filenames are a bad idea) match.

Coding4you
Obsidian | Level 7

Thank you, I managed to compress the file so it’s like this now:
DPS_cais_DPSCAIS_Interactive_Workbook_20230629.xlsm DPS_cais_DPSCAIS_Summary_Workbook_20230629.xlsm DPS_cais_DPSCAISappend_20230629.csv

 

would a loop like this work?

data check_filenames;
    set DPS_FILE_COUNT;
    flag = 0; /* Initialize flag to 0 */
    
    /* Loop over the records */
    do until (eof);
        set DPS_FILE_COUNT end = eof;
        
        /* Check if the filename matches all conditions */
        if filename = "DPS_cais_DPSCAIS_Interactive_Workbook_&date..xlsm" and
           filename = "DPS_cais_DPSCAIS_append_&date..csv" and
           filename = "DPS_cais_DPSCAIS_Summary_Report_&date..xlsx" then
        do;
            flag = 1; /* Set flag to 1 to indicate a "pass" */
            output; /* Output the record to the check_filenames dataset */
        end;
    end;
run;

 

Tom
Super User Tom
Super User

So if the idea is that there should be all three of those filenames then you probably need three flags (or one flag that counts).

data _null_;
  set DPS_FILE_COUNT end=eof;
  retain flag 0;
  if filename in
     ("DPS_cais_DPSCAIS_Interactive_Workbook_&date..xlsm"
      "DPS_cais_DPSCAIS_append_&date..csv"
      "DPS_cais_DPSCAIS_Summary_Report_&date..xlsx"
     ) then flag+1
  ;
  if eof then do;
    put 'NOTE: Found ' flag "files for &date..";
    call symputx('nfiles',flag);
  end;
run;

%put NOTE: Found &nfiles files for &date.. ;

Now you can use the value of the macro variable NFILES to decide whether or not to do something else.

 

ballardw
Super User

@Coding4you wrote:

Thank you, I managed to compress the file so it’s like this now:
DPS_cais_DPSCAIS_Interactive_Workbook_20230629.xlsm DPS_cais_DPSCAIS_Summary_Workbook_20230629.xlsm DPS_cais_DPSCAISappend_20230629.csv

 

would a loop like this work?

data check_filenames;
    set DPS_FILE_COUNT;
    flag = 0; /* Initialize flag to 0 */
    
    /* Loop over the records */
    do until (eof);
        set DPS_FILE_COUNT end = eof;
        
        /* Check if the filename matches all conditions */
        if filename = "DPS_cais_DPSCAIS_Interactive_Workbook_&date..xlsm" and
           filename = "DPS_cais_DPSCAIS_append_&date..csv" and
           filename = "DPS_cais_DPSCAIS_Summary_Report_&date..xlsx" then
        do;
            flag = 1; /* Set flag to 1 to indicate a "pass" */
            output; /* Output the record to the check_filenames dataset */
        end;
    end;
run;

 


Not with AND. When you check a single variable for multiple values it will never be true for AND condition. A value cannot be 123 AND ABC at the same time.

 

Personally I would use

        flag = filename in(     
                   "DPS_cais_DPSCAIS_Interactive_Workbook_&date..xlsm",
                   "DPS_cais_DPSCAIS_append_&date..csv"               ,
                   "DPS_cais_DPSCAIS_Summary_Report_&date..xlsx"
                  );
       

IN tests if a variable is one of multiple values. The result of any logical comparison in SAS is 1 or 0 so you can assign values this way skipping the if/then to assign 1/0 when true/false.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1847 views
  • 1 like
  • 4 in conversation