03-13-2015 08:01 AM
I have a general question about archiving SAS data sets.
In short, I'm working on a daily backup process. The main functionality is finished; now I need to archive each of the backed-up (and, of course, compressed) SAS data sets.
I did a small investigation and wrote two VBS scripts, executed from SAS with the X command.
The first VBS script makes a ZIP archive from a SAS data set.
The VBS code for this variant is small (just a few lines) and archiving is quick, but the downside is that big data sets (>100 MB) are compressed by only about 50%, so the resulting .zip archive is only about 2 times smaller.
The second VBS script uses the 7z archiver. The VBS code is a little bigger than in the first case and it requires the 7z executable to exist on the machine, but it compresses much better, to ~10% of the original size, so the resulting archive is ~10 times smaller than the initial SAS data set.
But the archiving time is obviously a few times longer than with ZIP.
So the question is: what is the best, or perhaps the most efficient, archiver for SAS data set files (both data set and index files)?
In other words, maybe someone has had a similar task before and done deep research showing that archiver XXX is the best for compressing SAS data sets?
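For anyone who wants to reproduce the size versus time trade-off described above on their own files, here is a minimal sketch in Python (not the VBS from this post). It assumes Deflate via `zlib` as a stand-in for ZIP and LZMA via the `lzma` module as a stand-in for 7z. The module writes .xz streams, not .7z archives, so the numbers are only indicative. The function names are made up for illustration.

```python
import lzma
import time
import zlib


def compare_compressors(data: bytes) -> dict:
    """Return {name: (compressed_size_in_bytes, seconds)} per compressor."""
    compressors = {
        # Deflate is the method ZIP archives normally use.
        "deflate (zlib)": lambda raw: zlib.compress(raw, 6),
        # LZMA is the algorithm family 7z uses by default.
        "lzma (xz)": lambda raw: lzma.compress(raw, preset=6),
    }
    results = {}
    for name, compress in compressors.items():
        start = time.perf_counter()
        packed = compress(data)
        results[name] = (len(packed), time.perf_counter() - start)
    return results


def read_sample(path: str, limit: int = 50 * 1024 * 1024) -> bytes:
    """Read at most `limit` bytes of a file to use as a benchmark sample."""
    with open(path, "rb") as handle:
        return handle.read(limit)
```

Calling `compare_compressors(read_sample("mydata.sas7bdat"))` (the file name is hypothetical) gives one size and one timing per method, which is enough to see whether the extra compression is worth the extra time for a given data set.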
03-13-2015 09:17 AM
The question arises: do you just want to keep old files, or do you want to use them in a more meaningful way? If it's just keeping old files, then compression is probably OK. However, bear in mind that each file takes up room, so space can become an issue, as can finding things, making comparisons, etc. If you want a backup, then maybe just buy a tape backup machine and run tape backups once in a while.
If, however, you want to use the data or keep audit trails etc., then you would be best served by looking into databases/warehouses. These provide audit trails from which you can rebuild the data step by step (i.e. you only keep the changes from the baseline), so only the necessary change information is stored, which is far smaller than compressing the full item each time. The data is all in one place and accessible. However, there are downsides: you need to administer the system, provide some kind of ETL, etc.
03-16-2015 05:32 AM
Yep, I just need to keep old files, so compression looks OK.
Of course, the company has a daily file backup system with tape backups.
Thanks for the good explanations. I also googled 7z before, which is actually why I chose it, but I thought that since all SAS data sets have more or less the same structure, maybe some other archiver is better specifically for .sas7bdat and .sas7bndx files.
But it looks like 7z is the best anyway.
03-16-2015 05:59 AM
You might also look at gzip (Gzip for Windows), which is available for all common platforms (making future migrations easier) and very easy to script.
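To show how little scripting gzip needs, here is a minimal sketch using Python's standard `gzip` module rather than the gzip command-line tool itself. The function name `gzip_file` and the choice of compression level 9 are assumptions for illustration only.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(source: str) -> Path:
    """Compress `source` into `<source>.gz` next to it and return the new path."""
    src = Path(source)
    target = src.with_name(src.name + ".gz")
    # Stream the file in chunks so large data sets are not loaded into memory.
    with open(src, "rb") as fin, gzip.open(target, "wb", compresslevel=9) as fout:
        shutil.copyfileobj(fin, fout)
    return target
```

The original file is left in place, so it can be deleted only after the archive has been verified.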
03-16-2015 08:12 AM
Yes, but the archiving process can fail for various reasons, so just running 7z.exe with the needed parameters is probably a little unsafe.
I used something like:
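' Run the archiver and capture its exit code.
' Note: Run returns the program's exit code only when bWaitOnReturn (its third argument) is True.
Set objectShell = CreateObject("WScript.Shell")
retVal = objectShell.Run(...)
If retVal <> 0 Then
    ' write a warning to the log that archiving failed
End If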
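The same exit-code check can be sketched in Python for comparison. This is not the VBS above, just an illustration of the idea under some assumptions: `run_archiver` is a hypothetical helper name, and the command to run is passed in by the caller instead of being hard-coded.

```python
import logging
import subprocess
from typing import Sequence


def run_archiver(command: Sequence[str]) -> bool:
    """Run `command`; return True on exit code 0, otherwise log a warning."""
    try:
        completed = subprocess.run(command, capture_output=True, text=True)
    except OSError as error:
        # Raised, for example, when the executable is not found on the machine.
        logging.warning("Could not start archiver: %s", error)
        return False
    if completed.returncode != 0:
        logging.warning(
            "Archiving failed (exit code %d): %s",
            completed.returncode,
            completed.stderr.strip(),
        )
        return False
    return True
```

A call such as `run_archiver(["7z", "a", "backup.7z", "mydata.sas7bdat"])` would then return False both when 7z is missing and when it exits with a non-zero code. The archive and data set names here are hypothetical.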