Hello guys,
I need some big help.
Here at work, we identified some ways to optimize our disk space.
Below I have listed what has been identified and what needs to be done. I would like to know from you how to do it.
1)
a) Identified: SAS data files not accessed by users
b) Needed: A list identifying the SAS data files with no access
2)
a) Identified: Unused filesystems
b) Needed: A list identifying the unused filesystems
3)
a) Identified: Existence of compacted SAS data files
b) Needed: A list identifying the compacted SAS data files and the processes that use them
4)
a) Identified: Existence of SAS data files with the same name
b) Needed: A list identifying the SAS data files with the same name but in different directories
5)
a) Identified: Compression of SAS data files
b) Needed: A list of SAS data files which may be compressed
6)
a) Identified: Shared filesystems
b) Needed: A list of shared filesystems
If you have any suggestions for best practices, please feel free to share.
Thanks in advance.
Augusto, your list looks like a set of checkpoints chosen by an auditor.
1) Data without access by users
*) Needed: A list identifying the SAS data files with no access.
This requires monitoring all activity on all data at the OS level. With ARM log4j active, SAS can do a lot of this.
You can only get a list of data that is used; unused data causes no events. Compare that list with the list of all data that is present (see 3, 4, 5).
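The comparison described above amounts to an anti-join. A minimal sketch, assuming you already have two tables: all_files (every dataset found on disk) and used_files (paths that showed up in the access log). Both table names and the sample paths are hypothetical and only there to make the example runnable.

```sas
/* Hypothetical input: every SAS dataset found on disk */
data all_files;
  length path $200;
  infile datalines truncover;
  input path $200.;
  datalines;
/data/a/sales.sas7bdat
/data/b/sales.sas7bdat
/data/c/old.sas7bdat
;

/* Hypothetical input: paths that appeared in the access log */
data used_files;
  length path $200;
  infile datalines truncover;
  input path $200.;
  datalines;
/data/a/sales.sas7bdat
;

/* Files that are present on disk but never appear in the log */
proc sql;
  create table never_accessed as
  select a.path
  from all_files as a
  where a.path not in (select path from used_files)
  order by a.path;
quit;
```

With the sample rows, never_accessed keeps the two paths that are missing from used_files. How long the log has to cover before "no events" really means "unused" is a decision for your site.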
2) Unused space / filesystems
*) Needed: A list identifying the unused filesystems.
This is an OS task (XCMD usage). The df command will show all filesystems. With a decent setup you should see what is stored where and how much is free.
The du command will list the space in use.
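As a rough sketch of the df approach from within SAS: this assumes a UNIX host, that XCMD is allowed for your session, and that df supports the POSIX -P and -k options (one line per filesystem, sizes in 1024-byte blocks). The dataset and variable names are made up for illustration.

```sas
/* Assumption: UNIX host with XCMD enabled */
filename dfcmd pipe 'df -Pk';

data filesystems;
  infile dfcmd firstobs=2 truncover;      /* firstobs=2 skips the header line */
  length filesystem $200 capacity $8 mount $200;
  input filesystem $ kb_total kb_used kb_free capacity $ mount $;
  /* capacity arrives as text such as 42%; ?? suppresses notes for values like "-" */
  pct_used = input(compress(capacity, '%'), ?? 8.);
run;

/* Least-used filesystems first */
proc sort data=filesystems;
  by pct_used;
run;
```

Mount points that contain blanks would be cut off by the list input used here; that is rare, but worth checking against the raw df output.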
3) Existence of compacted SAS data files
*) Needed: A list identifying the compacted SAS data files and the processes that use them.
Make a list of all directories with SAS datasets. The attributes of all datasets can be gathered into a new dataset. One of the elements is compress.
This is also available as a SASHELP.v----- view member.
Files that are compacted using zip or another tool have to be analyzed at OS level. Make a list at OS level (ls command using XCMD) of all OS files (a recursive option is available).
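A sketch of the attribute approach, assuming the view meant here is SASHELP.VTABLE, whose SQL counterpart is DICTIONARY.TABLES. Only libraries that are currently assigned show up, and the exclusion list of libraries is just an example.

```sas
/* Datasets stored with SAS compression (COMPRESS is NO, CHAR or BINARY) */
proc sql;
  create table compressed_ds as
  select libname, memname, compress, pcompress, filesize, nobs
  from dictionary.tables
  where memtype = 'DATA'
    and libname not in ('SASHELP', 'SASUSER', 'WORK')   /* example exclusions */
    and upcase(compress) ne 'NO'
  order by filesize desc;
quit;
```

This only covers SAS-level compression. It says nothing about which processes use the datasets, and zip or gzip archives still have to be found at OS level.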
4) Existence of SAS data files with the same name
*) Needed: A list identifying the SAS data files with the same name but in different directories.
With the list of all OS files, split each entry into path name and file name. Simply sorting by file name will give you the answer.
5) Compression of SAS data files
*) Needed: A list of SAS data files which may be compressed.
Take the list of all SAS datasets with all their attributes. Look for the ones that are not compressed and have long records, many variables, and many observations.
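One way to turn that rule of thumb into a query, as a sketch only: the two thresholds below are hypothetical and should be tuned to your data, and DICTIONARY.TABLES only sees libraries that are assigned in the current session.

```sas
%let min_obslen = 500;       /* hypothetical: minimum record length in bytes */
%let min_nobs   = 1000000;   /* hypothetical: minimum number of observations */

/* Uncompressed datasets with long records and many observations */
proc sql;
  create table compress_candidates as
  select libname, memname, nobs, nvar, obslen, filesize
  from dictionary.tables
  where memtype = 'DATA'
    and upcase(compress) = 'NO'
    and obslen >= &min_obslen
    and nobs >= &min_nobs
  order by filesize desc;
quit;
```

Whether compression actually pays off depends on the content (long, mostly blank character variables compress well), so it is worth testing on a copy of a few of the largest candidates first.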
6) Shared filesystems
*) Needed: A list of shared filesystems.
Shared filesystems are an OS task. On Windows it is very common to share them out to a desktop. On a *nix system it is a "not done" situation because of too many possible security holes (mostly forbidden).
With a SAS grid approach they could be there. Check with the storage admin / system admin.
4) (UNIX)
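/* List every SAS dataset file on the host; find error messages */
/* (e.g. permission denied) are discarded                       */
filename oscmd pipe 'find / -name \*.sas7bdat -print 2>/dev/null';

data files;
  infile oscmd lrecl=500 truncover;
  length
    path $500
    fname $200
  ;
  input path $500.;   /* formatted input keeps paths that contain blanks */
  fname = lowcase(scan(path,-1,'/'));
run;

proc sort data=files;
  by fname;
run;

/* Keep the file names that occur more than once */
data int (keep=fname);
  set files;
  by fname;   /* required for first.fname / last.fname */
  retain count;
  if first.fname then count=1; else count+1;
  if last.fname and count > 1 then output;
run;

/* All paths for the duplicated file names */
data result;
  merge
    files
    int (in=_int)
  ;
  by fname;
  if _int;
run;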
5)
Get the attributes from sashelp.vtable. Make sure that ALL directories containing SAS files have a libname assigned, as sashelp.vtable is dynamically created from all currently defined and assigned libraries.
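If there are many directories, the librefs could be assigned in a loop instead of by hand. A minimal sketch, assuming a table dirs with one row per directory; the table, the sample paths and the D0001-style librefs are all hypothetical.

```sas
/* Hypothetical input: one row per directory that holds SAS datasets */
data dirs;
  length dir $500;
  infile datalines truncover;
  input dir $500.;
  datalines;
/data/project_a
/data/project_b
;

/* Assign one libref per directory: D0001, D0002, ... */
data _null_;
  set dirs;
  length libref $8 msg $200;
  libref = cats('D', put(_n_, z4.));
  rc = libname(libref, strip(dir));
  if rc ne 0 then do;            /* nonzero can be an error or just a note */
    msg = sysmsg();
    put 'NOTE: libname() returned ' rc= libref= dir= msg=;
  end;
run;
```

Once the librefs exist, the datasets in those directories show up in the dictionary views for the rest of the session.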
BTW: please refrain from double-posting. It only makes life harder for all involved.
I think that Jaap covered the bullets quite extensively.
A general thought: it sounds like your users have quite some freedom to create data. Instead of chasing ghosts, you could centralise your data management and only allow data to be stored in specific locations.