LeinadBCN
Calcite | Level 5

Hello everyone,

 

I am new to this forum and pretty much to SAS in general, so if there are any beginner mistakes from my side, I apologise in advance!

 

My task is to get the file size and the MD5 checksum of some ZIP files I created with SAS.

 

To get the file size I used the fopen and finfo functions, which worked very well.

 

In this SAS blog I read that you can access the checksum of a zip file with finfo(fid, 'CR-32').

Sadly, it does not work on my files. I get '202020202020202020202020' as a result, so I assume it is just blanks displayed in the $hex32. format. I have to say, I didn't find this option in the official SAS documentation.

 

Why does this option not work in my code? Is it because it does not work in general, or because the checksum information is not saved in the ZIP file?

If getting the checksum with finfo is not possible, is there another simple way to get the checksum of an external ZIP file with SAS? I created the ZIP files with the ODS option in SAS, so maybe it is simpler to get the checksum while the ZIP file is being created? Any suggestions?

 

Thanks in advance!

* Assign the fileref '_FILES' to the directory location;
FILENAME _FILES "XXXPATH";

data XXXX;
    FORMAT md5 $hex32.;

    * Open the directory, dirid is now the directory id;
    dirid = dopen('_FILES');

    * Numfiles is the number of files in the directory;
    numfiles = dnum(dirid);

    * Loop over each file;
    do i = 1 to numfiles;
        * Read the name of the i-th file;
        Filename = dread(dirid, i);
        * Build the full path (XXXPATH must end with the path separator);
        file = "XXXPATH" || Filename;
        * Assign the fileref '_FNAMES' to this file;
        sysrc = filename('_FNAMES', file);
        * Open the file;
        fid = fopen('_FNAMES');

        * If the file is open;
        if fid ^= 0 then do;
            * Get the file information;
            Bytes = finfo(fid, 'File Size (Bytes)');
            md5   = finfo(fid, 'md5');
            * Output one row for each file in the directory;
            output;
            * Close the file;
            sysrc = fclose(fid);
        end;
    end;

    * Close the directory;
    rc = dclose(dirid);
run;

 

 

4 REPLIES
Kurt_Bremser
Super User

You can extract the possible information items by running a loop from 1 to FOPTNUM() and use the FOPTNAME() function, but I don't know of any checksums provided by operating systems.
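
As a rough sketch of that loop (the fileref name zf and the path test1.zip are placeholders, not taken from the original post), something along these lines should list every information item that finfo() can return for a file:

```sas
data file_info;
  length infoname $64 infoval $256;
  /* 'zf' and 'test1.zip' are hypothetical; point this at your own file */
  rc  = filename('zf', 'test1.zip');
  fid = fopen('zf');
  if fid > 0 then do;
    /* foptnum() returns how many information items are available */
    do i = 1 to foptnum(fid);
      infoname = foptname(fid, i);      /* name of the i-th item */
      infoval  = finfo(fid, infoname);  /* its value, as a character string */
      output;
    end;
    rc = fclose(fid);
  end;
  keep infoname infoval;
run;
```

If no checksum item appears in the infoname column, finfo() has nothing to return for that name and you get blanks back.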

LeinadBCN
Calcite | Level 5

Thank you for your answer. If I understand correctly, what you propose is the same as what I did with fopen() and finfo(), just with another function.

Do you know any simple way to get the checksum of a zip file that was created in SAS itself? Maybe some code that runs during the zipping process itself?

 

Thank you!

Kurt_Bremser
Super User

Checksums (for the purpose of verification) are created with utilities like md5 and not stored in the files themselves.

The ZIP file format contains checksums of the uncompressed data for each individual file within the archive, according to https://en.wikipedia.org/wiki/Zip_(file_format). I see no reference to a "whole file" checksum there, either.

 

So, if you want a checksum for a zip file, your best option is a checksumming utility like md5.
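
One possible way to call such a utility from SAS is through a pipe. This is only a sketch under assumptions: it presumes a Unix-like host with GNU md5sum on the PATH, that the XCMD system option is enabled for your session, and that archive.zip stands in for your own file name.

```sas
/* Assumptions: GNU md5sum is installed, XCMD is enabled,
   and archive.zip is a placeholder for the real file name. */
data zip_md5;
  infile 'md5sum archive.zip' pipe truncover;
  /* md5sum prints the 32-character hex digest followed by the file name */
  input md5 :$32. name $256.;
run;
```

On other platforms the utility name and its output layout differ, so the INPUT statement would need to be adapted.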

Tom
Super User

You probably will need to use some external program to extract that information.

For example, if you have the unzip command available, you can use the -v option to get output like:

Archive:  test1.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
    2954  Defl:N      415  86% 01-10-2017 23:15 2d31d6fc  dir1/test1.xml
   68096  Defl:N     9441  86% 02-19-2009 13:28 d368efe9  ../cepsusr.doc
   68096  Defl:N     9441  86% 02-19-2009 13:28 d368efe9  cepsusr.doc
--------          -------  ---                            -------
  139146            19297  86%                            3 files

So you can then just use the PIPE engine to run it and read the output into a dataset.

data files ;
  infile 'unzip -v test1.zip' pipe firstobs=4 truncover ;
  input @;
  if _infile_=:'------' then stop;
  input length method :$20. size cmpr :percent. date :mmddyy. time :time. crc_32 :$8. name $256. ;
  format date yymmdd10. time time5. ;
run;
proc print width=min;
run;

Results:

Obs    length    method    size    cmpr       date       time      crc_32          name

 1       2954    Defl:N     415    0.86    2017-01-10    23:15    2d31d6fc    dir1/test1.xml
 2      68096    Defl:N    9441    0.86    2009-02-19    13:28    d368efe9    ../cepsusr.doc
 3      68096    Defl:N    9441    0.86    2009-02-19    13:28    d368efe9    cepsusr.doc
