BookmarkSubscribeRSS Feed
Jeff_DOC
Pyrite | Level 9

Good afternoon.

 

I have a folder with a lot of individual SAS code files. I also have multiple libraries with hundreds of tables. What I need to do is to somehow query each individual SAS code file and determine which SAS tables each code file references. Each code file will have multiple referenced tables I'll need to go through each code file to compile an inventory of accessed tables.  If possible I'd love to be able to have the code step through all code members in the folder one after the other.

 

So in the code I have something similar to the following:

 

PROC SQL;

...

FROM XXSOURCE.TABLENAME

...

QUIT;

 

or

 

DATA WANT;

SET XXSOURCE.TABLENAME;

...

RUN;

 

I need to find everything after XXSOURCE. in order to create an inventory of accessed tables for each code file. I've never heard of a way to do this before.

 

Does anyone have a suggestion? Thanks for your help.

6 REPLIES 6
Reeza
Super User

Can you run the code? If you can, PROC SCAPROC can help out. 

 

Is any of the code macros? 

If so, are any of the data sets used there dynamic?

 

Does any of the code use short cut lists, like below?

set data1-data4;

set data:;

What happens when you have data set options used?

set sashelp.class (where sex='F') sashelp.class(where sex='M');

 

The issues I've raised above are why this isn't usually done...you're essentially documenting  a program after the fact.

 

To solve a problem like this, you're writing a program to parse a text file. So look for the word SET and use SCAN() or PRX to pull everything after SET but before the semicolon or open parenthesis. But you'll at minimum have to consider the use cases I've mentioned and I haven't mentioned hash tables or SQL in line queries. 

 

TLDR; This is a non-trivial problem to solve. 


@Jeff_DOC wrote:

Good afternoon.

 

I have a folder with a lot of individual SAS code files. I also have multiple libraries with hundreds of tables. What I need to do is to somehow query each individual SAS code file and determine which SAS tables each code file references. Each code file will have multiple referenced tables I'll need to go through each code file to compile an inventory of accessed tables.  If possible I'd love to be able to have the code step through all code members in the folder one after the other.

 

So in the code I have something similar to the following:

 

PROC SQL;

...

FROM XXSOURCE.TABLENAME

...

QUIT;

 

or

 

DATA WANT;

SET XXSOURCE.TABLENAME;

...

RUN;

 

I need to find everything after XXSOURCE. in order to create an inventory of accessed tables for each code file. I've never heard of a way to do this before.

 

Does anyone have a suggestion? Thanks for your help.


 

andreas_lds
Jade | Level 19
Do you need the datasets in xxsource only, or all datasets? Dou you know all libraries used? Are all libraries assigned using libname-statement? Are you familiar with regular expressions?

As @Reeza said: the solution won't be trivial.
Jeff_DOC
Pyrite | Level 9

Good morning.

 

Thanks for all of the involvement on this from you all.

 

Most (if not all) of the code is pretty straight-forward. (SQL) "from XX" or "set XXX". No macros, etc.

ballardw
Super User

If one is very consistent in coding and doesn't use single level data set names this likely gets a tad easier.

 

 

Subqueries such as:  from (select a, b, from dataset where condition) or similar?

How about use of OUT= dataset creation?

ODS Output?

Temporary libraries that may not be currently defined but use two level names?

 

%Include from files in other directories?

 

Is the code grouped in such a way to expect certain libraries?

 

Jeff_DOC
Pyrite | Level 9

Thanks to all who've taken the time to reply.

 

We have several source library members that are called from within hundreds of code jobs. In essence what we are looking to do is to determine how many times a particular source library member is called over all current code. An inventory of use for members within the library. I can clean up the temporary datasets that are created from each bit of code or I can figure out a way to pull anything with XXX.XXX since we don't generally preface our temporary datasets with anything like WORK. in the call.

 

This is a tough problem especially given I'm a bit new to this. I'm really glad there are people out there who are willing to try to help.

 

Thank you.

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 3565 views
  • 6 likes
  • 5 in conversation