check dataset compression without resetting last access date

Contributor
Posts: 39

Hi,

I need to check whether a dataset is compressed. I can do this by querying DICTIONARY.TABLES for the name of the SAS data set (memname) and its compression setting (compress). However, this resets the file's last access date, which I am trying to avoid. (We run file cleanup reports for users based on last access date.)

So, my question: does anyone know of another way to check whether a dataset is compressed without resetting the access date?

- Alan


Accepted Solutions
Solution
‎12-19-2017 10:06 AM
Super User
Posts: 9,880

Re: check dataset compression without resetting last access date

Any good operating system lets you determine the current last access date and set it with a utility; you will need to do some shell scripting for that.

The idea would be to work through a list of files, record each file's access date (e.g. with ls -ul) and, after checking the compression, set it back with touch -a -d
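
The save-and-restore idea above might look roughly like the sketch below. It assumes GNU coreutils (stat -c and touch -d @EPOCH); with_saved_atime is a hypothetical helper name, not an existing utility, and the command it wraps is whatever check you run against the file.

```shell
#!/bin/sh
# Sketch only: run a command against a file, then put the access time back.
# Assumes GNU stat and GNU touch; with_saved_atime is a made-up helper name.
with_saved_atime() {
    file=$1
    shift
    # Remember the current access time as whole seconds since the epoch.
    saved=$(stat -c %X "$file") || return 1
    # Run the check that may leave a fresh access time behind.
    "$@"
    status=$?
    # Restore only the access time (-a); the modification time is not touched.
    touch -a -d "@$saved" "$file" || return 1
    # Hand back the exit status of the wrapped command.
    return $status
}
```

Two caveats: the restore is accurate only to the second, and setting a timestamp to an arbitrary value generally requires owning the file or having elevated privileges.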

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code



All Replies
Super User
Posts: 7,932

Re: check dataset compression without resetting last access date

What last access date?

Metadata like that is stored only in the dataset itself, so to read it you need to read the dataset, unless you set up your own method to save a copy of that metadata and keep it up to date.

Super User
Posts: 3,856

Re: check dataset compression without resetting last access date

What about PROC CONTENTS or DATASETS with the CONTENTS statement?

Contributor
Posts: 39

Re: check dataset compression without resetting last access date

Thanks for the responses. I was trying to keep from resetting the last access date. It appears that any of the three (DICTIONARY.TABLES, PROC CONTENTS, PROC DATASETS) will reset the access date, which is problematic for what we are trying to accomplish.

The idea here was to check whether users were compressing their large datasets (compress=yes or compress=binary), with the aim of reducing space usage, and at the same time not reset the last access date, since users get a report of files they have not recently accessed. I need to step back and rethink; I may have to rely on the last modified date instead.

Thanks again.

- Alan
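
For the fallback on last modified date mentioned above, a cleanup listing could be sketched with find, since reading a dataset's header does not change its modification time. stale_datasets is a hypothetical helper name and the 180-day default is an arbitrary example value, not something from this thread.

```shell
# Sketch only: list SAS data sets not modified for more than a given number of days.
# stale_datasets and the 180-day default are illustrative assumptions.
stale_datasets() {
    dir=$1
    days=${2:-180}
    # -mtime +N matches files whose contents were last modified more than
    # N days ago (find counts whole 24-hour periods).
    find "$dir" -type f -name '*.sas7bdat' -mtime +"$days" -print
}
```

Called as stale_datasets /some/library/path 365, it would print one path per line for data sets untouched by writes for over a year.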

Super User
Posts: 5,849

Re: check dataset compression without resetting last access date

Shouldn't your last access report be able to distinguish between users?
Super User
Posts: 13,300

Re: check dataset compression without resetting last access date

I just ran this code on my Windows 7 system:

proc sql;
   select *
   from dictionary.tables
   where libname='IPP' and memname='CY1995UNDUP';
QUIT;

And the file system still shows the last accessed date as 10/14/2011. So if the Compression Routine and/or Percent Compression columns have the info you need, that seems to work in at least one environment.

Super User
Posts: 9,880

Re: check dataset compression without resetting last access date

To do a counter-check, I first created this macro:

%macro ls(dir=,out=,tim=);
/* tim=A (or a) lists access times; any other value lists modification times */
data _null_;
if "&tim" not in ("A","a")
then call symputx('tim','','l');
else call symputx('tim','--time=atime','l');
run;

/* read the output of GNU ls through a pipe; stderr goes into the same pipe */
filename ls_oscmd pipe "/usr/linux/bin/ls -l &tim --time-style=full-iso &dir 2>&1";

/* without out=, no dataset is created and the listing is written to the log */
%if "&out" = "" %then %let out=_null_;
data &out;
infile ls_oscmd truncover;
length
  perms $10
  links 3
  user $8
  group $8
  size 8
  date 4
  time 5
  tz $5
  name $200
;
informat
  date yymmdd10.
  time time19.
;
format
  date yymmdd10.
  time time8.
;
input
  perms
  links
  user
  group
  size
  date
  time
  tz
  name $200.
;
/* drop the "total" summary line that ls -l prints first */
if perms ne 'total';
if upcase("&out") = "_NULL_" then put _all_;
run;

filename ls_oscmd clear;
%mend;

which uses the GNU ls on AIX to retrieve a directory listing with a long ISO timestamp.

Then I ran this code:

data work.test;
x1 = 1;
run;

%ls(dir=%sysfunc(pathname(WORK)),out=dirlist1,tim=a);

data _null_;
x = sleep(5,1);
run;

proc sql;
create table memlist as
select * from dictionary.tables where libname = 'WORK' and memname = 'TEST';
quit;

%ls(dir=%sysfunc(pathname(WORK)),out=dirlist2,tim=a);

proc print data=dirlist1 noobs;
where name = 'test.sas7bdat';
run;

proc print data=dirlist2 noobs;
where name = 'test.sas7bdat';
run;

which produced this output:

  perms       links    user      group       size           date        time     tz          name

-rw-r--r--      1      e9782    sasadmin    131072    2017-12-19     9:06:33    +0100    test.sas7bdat

  perms       links    user      group       size           date        time     tz          name

-rw-r--r--      1      e9782    sasadmin    131072    2017-12-19     9:06:38    +0100    test.sas7bdat

As you can see, the access done by the proc sql is dutifully recorded by the operating system. I assume that other UNIXen will behave the same.

Contributor
Posts: 39

Re: check dataset compression without resetting last access date

Thanks everyone for the input.

I think the accepted solution hits on it: first, for a given user, run a check to see if a file is compressed, then reset the last access date back to what it was before the check was run. However, I find that I do not have the permissions needed to reset the last access date (going back in time). Since I am not one of the official SAS admins at my site, it would prove too much of a battle to try to get those permissions. I will have to find a plan B.

Thanks again,

Alan

Super User
Posts: 9,880

Re: check dataset compression without resetting last access date

Talk to your admins. They should be able to install a script that does the atime reset via sudo, so you get root permissions only for this special action. I have done the same (on AIX) so that the batch processing user can query the TSM database or kill processes that get in the way of production jobs.
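
As a rough illustration of what such a narrowly scoped script could contain, here is a sketch. The name reset_atime, the restriction to .sas7bdat files, and the use of GNU touch are all assumptions for the example; the sudo rule that exposes it would be written by the admins and is not shown.

```shell
#!/bin/sh
# Sketch only: a deliberately limited atime reset that admins could expose via sudo.
# reset_atime is a hypothetical name; assumes GNU touch (-d @EPOCH).
# Usage: reset_atime EPOCH_SECONDS FILE
reset_atime() {
    epoch=$1
    file=$2
    # Accept only a plain non-negative integer as the timestamp.
    case $epoch in
        ''|*[!0-9]*) echo "bad timestamp: $epoch" >&2; return 2 ;;
    esac
    # Refuse anything that is not named like a SAS data set.
    case $file in
        *.sas7bdat) ;;
        *) echo "not a SAS data set: $file" >&2; return 2 ;;
    esac
    # Refuse missing files, directories and symbolic links.
    if [ ! -f "$file" ] || [ -L "$file" ]; then
        echo "not a regular file: $file" >&2
        return 2
    fi
    # Change the access time only; -- stops option parsing at the file name.
    touch -a -d "@$epoch" -- "$file"
}
```

An installed version would end with a call such as reset_atime "$@". Keeping the script this strict is the point of the design: the caller gains root only for one validated action, not a general-purpose touch.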
