Hello everyone,
I am new to this forum and pretty much to SAS in general, so if there are any beginner mistakes from my side, I apologise in advance!
My task is to get the file size and the MD5 checksum of some ZIP files I created with SAS.
To get the file size I used the fopen and finfo functions, which worked very well.
In this SAS blog I read that you can access the checksum of a ZIP file with finfo(fid, 'CRC-32').
Sadly, it does not work on my files. I get '202020202020202020202020' as a result, so I assume it is just blanks displayed in the $hex32. format. I have to say, I didn't find this option in the official SAS documentation.
Why does this option not work in my code? Is it because it does not work in general, or because the checksum information is not saved in the ZIP file?
If getting the checksum with finfo is not possible, is there another simple way to get the checksum of an external ZIP file with SAS? I created the ZIP files with the ODS option in SAS, so maybe it is simpler to get the checksum while the ZIP file is being created? Any suggestions about that?
Thanks in advance!
data XXXX;
  * Assign the fileref '_FILES' to the directory location;
  FILENAME _FILES "XXXPATH";
  FORMAT md5 $hex32.;
  * Opens up the directory, dirid is now the directory id;
  dirid=dopen('_FILES');
  * Numfiles is equivalent to the number of files in the directory;
  numfiles=dnum(dirid);
  * Loop around each file;
  do i=1 to numfiles;
    * Identify the file as Filename;
    Filename=dread(dirid,i);
    * Build the full path (XXXPATH must end with a path separator);
    file="XXXPATH"||Filename;
    * Assign the fileref '_FNAMES' to this file;
    sysrc=filename('_FNAMES', file);
    * Open the file;
    fid=fopen('_FNAMES');
    * If the file is open;
    if fid ^= 0 then do;
      * Get all of the file information;
      Bytes=finfo(fid,'File Size (Bytes)');
      * Try to read a checksum (this is the call that returns blanks);
      md5=finfo(fid,'md5');
      * Output for each file in the directory;
      output;
      * Close the file;
      sysrc=fclose(fid);
    end;
  end;
  * Close the directory;
  rc=dclose(dirid);
run;
You can list the available information items by looping from 1 to FOPTNUM() and calling FOPTNAME() for each index, but I don't know of any operating system that provides checksums this way.
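A minimal sketch of that loop, in case it helps. The fileref name _chk, the data set name info_items and the archive path are hypothetical placeholders, not anything from the original post. It only lists whatever items the operating system exposes for the file, so it is a way to check whether a checksum item exists at all, not a way to compute one.

```sas
data info_items;
  length item $64 value $256;
  /* Hypothetical path: replace with one of your own ZIP files */
  rc  = filename('_chk', '/path/to/archive.zip');
  fid = fopen('_chk');
  if fid > 0 then do;
    /* FOPTNUM returns how many information items exist for this file */
    do i = 1 to foptnum(fid);
      item  = foptname(fid, i);   /* name of the i-th item */
      value = finfo(fid, item);   /* its value for this file */
      output;
    end;
    rc = fclose(fid);
  end;
  /* Deassign the fileref */
  rc = filename('_chk');
  keep item value;
run;

proc print data=info_items;
run;
```

Any name passed to finfo that does not show up in this listing (such as 'md5' in the code above) should be expected to come back blank.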
Thank you for your answer. If I understand it correctly, what you propose is the same as what I did with fopen() and finfo(), just with other functions.
Do you know any simple way to get the checksum of a ZIP file that was created in SAS itself? Maybe some code that runs during the zipping process itself?
Thank you!
Checksums (for the purpose of verification) are created with utilities like md5 and not stored in the files themselves.
The ZIP file format contains checksums of the uncompressed data for each individual file within the archive, according to https://en.wikipedia.org/wiki/Zip_(file_format). I see no reference to a "whole file" checksum there, either.
So, if you want a checksum for a ZIP file, your best option is a checksumming utility like md5.
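One way to call such a utility from SAS is to read its output through a pipe. This is only a sketch under assumptions: it assumes the md5sum command (GNU coreutils) is installed on the machine running SAS and that the session is allowed to run operating system commands (XCMD). The archive path and the data set name zip_md5 are hypothetical placeholders.

```sas
data zip_md5;
  length md5 $32 name $256;
  /* Assumption: md5sum is on the PATH and OS commands are allowed */
  /* Hypothetical path: replace with your own ZIP file */
  infile 'md5sum /path/to/archive.zip' pipe truncover;
  /* md5sum prints the 32-character hex digest, then the file name */
  input md5 :$32. name $256.;
run;

proc print data=zip_md5;
run;
```

On other systems the utility has a different name and output layout, so the command string and the input statement would need to be adjusted accordingly.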
You probably will need to use some external program to extract that information.
For example, if you have the unzip command available, you can use the -v option to get output like:
Archive:  test1.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
    2954  Defl:N      415  86% 01-10-2017 23:15 2d31d6fc  dir1/test1.xml
   68096  Defl:N     9441  86% 02-19-2009 13:28 d368efe9  ../cepsusr.doc
   68096  Defl:N     9441  86% 02-19-2009 13:28 d368efe9  cepsusr.doc
--------          -------  ---                            -------
  139146            19297  86%                            3 files
So you can then just use the PIPE engine to run it and read the output into a dataset.
data files ;
  infile 'unzip -v test1.zip' pipe firstobs=4 truncover ;
  input @;
  if _infile_=:'------' then stop;
  input length method :$20. size cmpr :percent. date :mmddyy. time :time. crc_32 :$8. name $256. ;
  format date yymmdd10. time time5. ;
run;
proc print width=min;
run;
Results:
Obs    length    method    size    cmpr          date     time     crc_32     name
  1      2954    Defl:N     415    0.86    2017-01-10    23:15    2d31d6fc    dir1/test1.xml
  2     68096    Defl:N    9441    0.86    2009-02-19    13:28    d368efe9    ../cepsusr.doc
  3     68096    Defl:N    9441    0.86    2009-02-19    13:28    d368efe9    cepsusr.doc