BookmarkSubscribeRSS Feed
mmajorza52
Calcite | Level 5

We are running SAS 9.3 on a Linux server and we have about 250 DataMarts spread across multiple file systems/directories. We have approx 3,000 users running adhoc queries against these DataMarts.  These DataMarts contain multiple tables and views.  Is there a way to determine what tables are being used and what columns are being used?  We are in the process of "consolidating/house-cleaning".  We want to identify "what" is being used and "what" is NOT being used at the COLUMN level.

Thanks

14 REPLIES 14
RW9
Diamond | Level 26 RW9
Diamond | Level 26

What do you mean by 'what is/not being used at the column level'?  Do you mean is being accessed at any specific time point, or is utilized in the queries?  If it is file access I haven't got a clue I am afraid, especially at the column level.  At a file level you could check the file handles to see if it is being accessed.

If it is at the query level, then those queries are will be stored somewhere, text files, metadata server etc.  You could read those and process the 'select' statements to find out what tables/columns are used in the query.

mmajorza52
Calcite | Level 5

To simplify, let's say i have 3 SAS Datasets under 3 differently directory paths called CUSTOMER.  One has 50 columns, one has 60 columns, and one has 90 columns.  Is there anyway to identify, who is accessing these tables via SAS Data or SAS PROCs and identify which columns are being requested (read).  We are in the early stages of trying to consolidate "liked" data and remove  columns not being used  I know in NETEZZA, there is a system table that contains all SQL calls to a table.  Was wondering if anything existed for SAS.

JBailey
Barite | Level 11

I am by no means an expert on this, but I think you would have to save all the SAS logs and post process them. I don't know of a magic switch that you can turn on to monitor a SAS data set in this fashion.

My comment is from a SAS Foundation, not a SAS solution, perspective. I assume the question is regarding SAS Foundation.

Message was edited by: Jeff Bailey

TomKari
Onyx | Level 15

Technically, what facilities are being used to host your datamarts? (Are they SAS datasets, OLAP cubes, RDBMS tables...)?

Tom

mmajorza52
Calcite | Level 5

- They are SAS Datasets and SAS DataViews.  They are NOT defined in Metadata.

- Users are using BATCH SAS, Enterprise Guide, SAS/Connect (rsubmit/endrsubmit).

- Parsing logs would be considered, but not from logs from 3000 users.  Is there a more GLOBAL approach to turn on "logging"  on?  The Audit package seems to capture access rights, and capture when tables are updated (tables/columns), not so much from a READ perspective (I may be wrong).  I came across RTRACE last night and still researching.  I'm I wasting my time with this?  Thanks for all the comments.


Kurt_Bremser
Super User

Without metadata, you could only audit table (file) access via the audit subsystem of the operating system (if you have one, I come from AIX) or with the RTRACE/RTRACELOC option. Be aware that (at least in 9.2) the RTRACELOC option in the sasv9.cfg does not support dynamically generated filenames as the LOG option does, so you may have to do that in the script that invokes sas and use the -rtraceloc command line option.

Access of columns could only be audited by working through the logs, IMO.

Tom
Super User Tom
Super User

Not much way to capture VARIABLE (aka column) usage without analyzing the SAS code (or logs). When you write SAS code to read variable X from dataset Y at the operating system level SAS will need to access the whole dataset and read the complete data vector into memory to find variable Y.

AnnaBrown
Community Manager

Great discussion, thanks to everyone for chiming in with ideas.

, you can open a track with Technical Support for more help. And for information on other ways to contact Tech Support, refer to: http://support.sas.com/techsup/contact/index.html.

-Anna


Join us for SAS Community Trivia
SAS Bowl XXIX, The SAS Hackathon
Wednesday, March 8, 2023, at 10 AM ET | #SASBowl

MikeFrost
SAS Employee

Just to add one other comment on this, SAS is currently developing capabilities that will make this sort of distributed auditing of access to SAS data sets simple and automated. This is coming in a future release.

Also, I'd like to mention that our Federation Server product can be leveraged to deliver access control to SAS data sets if security is a concern as well.

TomKari
Onyx | Level 15

That's going to make things very tricky. If only SQL were used to access the data, logging might be able to give you some insight. But if I'm accessing a dataset with a SET statement, SAS has no way of knowing which variables in the dataset that I want to look at.

My feeling is that there's no good solution to this problem. I might suggest opening a case with Tech Support. They have the expertise to give you a definitive answer.

Good luck,

  Tom

jakarman
Barite | Level 11

SAS 9.3 Audit, Performance and Measurement package

Audit stands for knowing what is used......

My comment is from an Eip perspective (metadataserver workspaceserver etc)

---->-- ja karman --<-----
jakarman
Barite | Level 11

As long you are using the EIP approach Workspace servers Connect-services Batch-services you have the option to activate all that logging on the server-side.

It is lesser known area of configuration. See SAS(R) 9.3 Logging: Configuration and Programming Reference (Reports library access audit information.)

The rtrace option could be helpful in determining you have licensed parts that are not uses: http://www2.sas.com/proceedings/sugi30/216-30.pdfhttp://support.sas.com/resources/papers/proceedings14/1877-2014.pdf

I you have control on the autoexec/sas-macro/libname behavior you could build your own logging using arm-statements.

ARM - Frequently Asked Questions - SAS(R) 9.3 Interface to Application Response Measurement (ARM): Reference

The most dirty one of approaches is to replace the libname statement by your own autocall macro statement style. 

---->-- ja karman --<-----
jakarman
Barite | Level 11

@Mike when you are aiming at the APM and Eventmanager evolvements than sorry..

SAS is missing a holistic view on normal market requirement on auditing/monitoring.

If your only object-type would be SAS datasets, that is very limited view on all the information (possible sensitive). Flat files HTML XML any type of unstructured data having metadata.

This kind of limitation would imply that SAS is not being applicable for a Statistical Analyses System. With the big-data hype at his top that would be big misalignment.

For the holistic view I can refer to http://csrc.nist.gov/publications/nistpubs/800-137/SP800-137-Final.pdf.

A difficult situation as this a federal guideline, how would tools that are not able to align with this kind of requirement being used in that kind environments.  

---->-- ja karman --<-----
MikeFrost
SAS Employee

, your point is well taken. I was addressing what I saw the original question on this thread rather than responding about SAS strategy as a whole.

Work is underway to develop the kind of auditing and monitoring that you mention at SAS. It is not work that is limited to SAS datasets and is certainly inclusive of big data environments as well as many others.

Thank you for taking the time to make this comment and allow me to respond more fully.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 14 replies
  • 1985 views
  • 0 likes
  • 9 in conversation