
analyse SAS programs programmatically

N/A
Posts: 0

analyse SAS programs programmatically

Hi there. If you have a SAS program (macro instructions + data steps + procedures), you may want to analyse it programmatically. In other words, does anybody know of any tools (preferably SAS-based) that translate SAS code into well-defined data structures? Thanks in advance.
N/A
Posts: 0

Re: analyse SAS programs programmatically

Posted in reply to deleted_user
none free
N/A
Posts: 0

Re: analyse SAS programs programmatically

Posted in reply to deleted_user
Hi, Peter. I have found only SAS-Parser by Michael Friendly. Which tools would you recommend?
N/A
Posts: 0

Re: analyse SAS programs programmatically

Posted in reply to deleted_user
gent999

With some success, I wrote my own SAS log analyser.
As it has taken several years of development and many occasional improvements, I consider that too much time has been invested in it to release it as freeware. Each time it is appropriate to use, I find I need to adapt it to the customer site.
It has been acclaimed in one-off situations as well as where it captured process information from more than 100 jobs of an ETL job stream (that was not a macro-based solution, just one data step to analyse all the logs in the stream). It has been adapted for my three main SAS platforms: Windows, Unix and mainframe.
With very few exceptions, it collects process information only from NOTEs in the SAS log, not from SAS syntax. The timing information reports not only how long a step runs, but also when it runs. That "time-when" made it possible to identify the exact step in a SAS process that coincided with a major performance problem on its SAS server. It was a proc sort on that occasion. Each step type is identified by its proc name or "DATA". Elapsed "time-when" does not (yet) account for parallel processing through SAS mpConnect.
As a SAS log does not report NOTEs for SQL in the way it does for a data step, my log analyser is unable to report how many rows are read from each of the sources contributing to SQL joins. However, both data steps and proc sql report the number of rows and columns written to each table or data set. For each data set read in a data step, the number of observations read is reported.
The numbers of records read from, and written to, external files are reported. I was stunned by how complex that became when filename concatenation and the FILEVAR= infile option refer to many external files.
From the NOTE information provided by SOURCE2, the %include-ed code is identified along with its nesting level.
All of these items of information are associated with the line of the SAS log at which the information is found.
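
As an illustration of the idea only (this is not PeterC's analyser, which is a SAS data step and is not public), a minimal Python sketch of collecting step information from log NOTEs might look like the following. The function name `parse_log` and the record layout are hypothetical, and the NOTE wording in the patterns is assumed from typical SAS 9 logs. Only real times printed in plain seconds are handled, and "time-when" is not covered.

```python
import re

# NOTE wording assumed from typical SAS 9 logs; adjust for your site.
OBS_READ = re.compile(
    r"NOTE: There were (\d+) observations read from the data set (\S+?)\.\s*$")
OBS_WRITTEN = re.compile(
    r"NOTE: The data set (\S+) has (\d+) observations and (\d+) variables\.")
STEP_USED = re.compile(r"NOTE: (DATA statement|PROCEDURE (\w+)) used")
REAL_TIME = re.compile(r"^\s*real time\s+([\d.]+) seconds")


def parse_log(lines):
    """Return one record per step, keyed to the log line of its 'used' NOTE."""
    steps = []
    reads, writes = [], []          # NOTEs seen since the previous step ended
    for lineno, line in enumerate(lines, start=1):
        if m := OBS_READ.search(line):
            reads.append((m.group(2), int(m.group(1))))
        elif m := OBS_WRITTEN.search(line):
            writes.append((m.group(1), int(m.group(2))))
        elif m := STEP_USED.search(line):
            # The step type is the proc name, or "DATA" for a data step.
            steps.append({
                "step": m.group(2) or "DATA",
                "log_line": lineno,
                "reads": reads,
                "writes": writes,
                "real_seconds": None,
            })
            reads, writes = [], []
        elif (m := REAL_TIME.match(line)) and steps \
                and steps[-1]["real_seconds"] is None:
            steps[-1]["real_seconds"] = float(m.group(1))
    return steps


# Hypothetical log fragment used only to exercise the sketch.
sample = """\
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.KIDS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds
NOTE: There were 19 observations read from the data set WORK.KIDS.
NOTE: The data set WORK.SORTED has 19 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.02 seconds
""".splitlines()

steps = parse_log(sample)
for step in steps:
    print(step)
```

Keeping the log line number in each record mirrors the point above that every item is associated with the line of the log where it was found.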

The extract provides a wealth of information for SAS process re-engineering.

Perhaps my log analyser will become redundant with the information flowing out of the SAS 9.2 SCAPROC feature.


PeterC
"a good consultant is worth even more than his fee"
Frequent Contributor
Posts: 126

Re: analyse SAS programs programmatically

Posted in reply to deleted_user

Hey Guys,

I am curious. :-)

If you are still following this thread, would you be willing to share some links, code or just a little more insight?

Are you using Perl or even SAS itself to analyse the code and/or logs?

Thanks in advance,

Michael

Valued Guide
Posts: 2,177

Re: analyse SAS programs programmatically

I was using that long-winded SAS data step.

However, with the introduction of proc scaproc, I have not used my log analyser in the last 5 years, so perhaps it has become redundant.

I think the best (free) parser is (obviously I suppose) the SAS System.

It might be worth creating an "idea" (aka SASware ballot item) requesting a new option (like proc scaproc) that makes all these data available in a table or in (more easily parsed) XML.

peterC

Super User
Posts: 3,257

Re: analyse SAS programs programmatically

Posted in reply to deleted_user

Apparently Enterprise Guide 6.1 has some new features for analysing SAS logs, including SAS logs produced outside of EG:

http://bi-notes.com/2013/07/sas-enterprise-guide-6-1-log-summary-feature/

As for analysing the programs themselves I would say that would be a very complex task that would be near impossible to cover all coding possibilities.

I am curious to know why you want to do this. What do you want this analysis to tell you?

Valued Guide
Posts: 2,177

Re: analyse SAS programs programmatically

That's a nice feature in EG6.1.

Getting a response from @GENT999 seems unlikely now, as that was the only posting from that account still in the forum history.

However, other postings have pointed to the value of extracting the statistics in the NOTEs about observations read and written, with long-term monitoring of production processes as one beneficiary. An extract of run and CPU times indexed by time-stamp helped diagnose a server performance problem.

Log analysis supports more than speeding up problem-solving for code under development, although that seems to be the target of this feature in EG6.1.

regards

peterC

Frequent Contributor
Posts: 126

Re: analyse SAS programs programmatically

Thanks for your responses!

I stumbled over this topic and it seemed interesting for several reasons.

We have a lot of SAS-Programs running in batch. While we do have the logs written to our server, it is quite annoying to search for errors and warnings, should they happen or be of any interest.

One time, we purged almost all the DATA MERGE Steps from our programs and replaced them with SQL, due to the 'repeats of BY values' which caused incorrect joins. In this case I used the linux command line to find all logs with this note in the log, output them to a file and then used Excel to track the programs and the progress of rewriting the programs.

Some log analyzer might have been handy there.
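
For what it is worth, the search described above can be sketched in a few lines of Python instead of command line plus Excel. This is only a rough sketch: `logs_containing` is a hypothetical helper name, the `*.log` file pattern is an assumption about how the batch logs are named, and the search text is the fragment quoted above.

```python
from pathlib import Path


def logs_containing(log_dir, needle="repeats of BY values", pattern="*.log"):
    """Map each log file (path relative to log_dir) to the line numbers
    on which the search text occurs. Subdirectories are searched too."""
    root = Path(log_dir)
    hits = {}
    for path in sorted(root.rglob(pattern)):
        # errors="replace" keeps the scan going on odd encodings
        with open(path, errors="replace") as fh:
            line_numbers = [n for n, line in enumerate(fh, start=1)
                            if needle in line]
        if line_numbers:
            hits[path.relative_to(root).as_posix()] = line_numbers
    return hits
```

Calling `logs_containing("/path/to/logs")` (the path is a placeholder) returns a dictionary that can be written to a CSV file for tracking the rewrite progress. Passing `needle="ERROR:"` or `needle="WARNING:"` gives a crude error and warning scan of the same logs.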

Also, the already mentioned tracking of run times, written records, etc. would be quite nice, especially if you don't need to rewrite all the programs to output the information. I would rather just decide that I want to analyze a few logs, then load them into a parser or SAS program of some sort and have my ad hoc analysis. :-)

As for the log analyzer included in EG6.1, I can say that it works well with programs run from EG, but not with existing log files from programs run in batch mode. http://bi-notes.com/2013/07/sas-enterprise-guide-6-1-log-summary-feature/ seems to be incorrect when mentioning external log files. At least my EG installation just treats them as txt-files and performs no highlighting whatsoever.

Thanks again for all your responses!

Cheers,

Michael

Super User
Posts: 3,257

Re: analyse SAS programs programmatically

Posted in reply to deleted_user

Yes, I fully understand why you may want to analyse SAS logs for performance analysis. I was wondering what the objective would be for just analysing the SAS programs themselves: to document the SAS datasets used, for example, or something else?

Frequent Contributor
Posts: 126

Re: analyse SAS programs programmatically

Ah, I see. Well, analysing programs (i.e. program code) would probably provide a nice overview, when opening old batch programs. One might see which tables are used and which tables are the output. Probably even a program flow with joins and transformations.

That should be fairly scriptable with some parsing language.

Proc xxx refers to a transformation. The following "data=" would point to input tables. Following "out=" would point to output tables, etc.

Same with Data xxx, Set xxx and Merge xxx.
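
A rough Python sketch of that keyword-based idea is shown here. It is deliberately naive, and all names in it (`extract_steps`, the record keys) are made up for illustration. It assumes statements end at `;`, ignores macro code, macro variables, `*` comments, quoted semicolons and PROC SQL, and simply drops data set options in parentheses.

```python
import re

NAME = r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)?"      # one- or two-level data set name
DATA_OPT = re.compile(rf"\bdata\s*=\s*({NAME})", re.I)
OUT_OPT = re.compile(rf"\bout\s*=\s*({NAME})", re.I)


def _strip_parens(text):
    """Remove (possibly nested) parenthesised data set options."""
    prev = None
    while prev != text:
        prev, text = text, re.sub(r"\([^()]*\)", " ", text)
    return text


def _names(words):
    """Keep only words that look like data set names, skipping _NULL_."""
    return [w.upper() for w in words
            if re.fullmatch(NAME, w) and w.lower() != "_null_"]


def extract_steps(code):
    code = re.sub(r"/\*.*?\*/", " ", code, flags=re.S)   # drop block comments
    steps, current = [], None
    for raw in code.split(";"):
        stmt = _strip_parens(raw)
        words = stmt.split()
        if not words:
            continue
        keyword = words[0].lower()
        if keyword == "data":
            current = {"step": "DATA", "inputs": [],
                       "outputs": _names(words[1:])}
            steps.append(current)
        elif keyword == "proc" and len(words) > 1:
            current = {"step": "PROC " + words[1].upper(),
                       "inputs": [], "outputs": []}
            steps.append(current)
        elif keyword in ("run", "quit"):
            current = None
        if current is None:
            continue
        if current["step"] == "DATA":
            if keyword in ("set", "merge"):
                current["inputs"] += _names(words[1:])
        else:
            # In a proc, data= marks inputs and out= marks outputs.
            current["inputs"] += [n.upper() for n in DATA_OPT.findall(stmt)]
            current["outputs"] += [n.upper() for n in OUT_OPT.findall(stmt)]
    return steps


# Hypothetical program used only to exercise the sketch.
sample = """
data ab;
  merge a(in=ina) b;
  by id;
run;
proc sort data=ab out=ab_2; by id; run;
data result;
  merge ab_2 c;
  by id;
run;
"""
steps = extract_steps(sample)
for step in steps:
    print(step)
```

Anything beyond simple batch programs (macro-generated names, SQL, views) would need a real parser, which is probably where the "some hours tweaking" start.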

However, you start with a small script, but before you notice it, you spend some hours tweaking and optimizing it. So if someone had already invested some time and written similar scripts, that'd be great. That is why I asked in this thread. ;-)

Now I know, there is the Data Integration Studio, which does that and there are other solutions, like the dataflux studio. But that does not help you, if you are currently not using these tools and have a load of preexisting batch programs that you need to look into every now and then.

I have drawn flowcharts of programs in the past, but it would be way smoother if some little tool did the job. You know, IT people are lazy sometimes - and by lazy I mean 'looking for a simple, smart solution' :-)

Super User
Posts: 19,851

Re: analyse SAS programs programmatically

I can transpose data using proc sql, a single data step, multiple data steps, or proc transpose :-)

Not sure how any "scanner" would pick that up :-)

Frequent Contributor
Posts: 126

Re: analyse SAS programs programmatically

Hi Reeza,

that is true, but I would rather be interested in the program-flow itself than in detailed steps.

I mean, you could add a data step, that adds +5 to every measure you use. I am sure, you could write a script that would pick these things up, but you would end up with pretty much your whole program code as a result ;-)

So I don't want to see the transformations done to the measures in detail, but rather the source and output of the measures. The 'data-flow', if I may call it that.

Something like this would be great:

- Program starts with a join/merge of tables A and B.

- Resulting table AB is then transformed somehow (either via proc xxx or via data step without further details) into table AB_2.

- Lastly AB_2 is joined/merged with table C to the output table RESULT.
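
To make the wished-for output concrete, here is a small Python sketch that turns step records into that kind of textual data flow. It assumes each step has already been reduced to a record with the hypothetical keys `step`, `inputs` and `outputs`. How those records are obtained (from code or from the log) is left open, and `PROC SORT` merely stands in for "proc xxx".

```python
def describe_flow(steps):
    """Return one line of text per step: what goes in and what comes out."""
    lines = []
    for number, step in enumerate(steps, start=1):
        sources = " and ".join(step["inputs"]) or "(no input tables found)"
        targets = " and ".join(step["outputs"]) or "(no output tables found)"
        # More than one input is reported as a join/merge, without details.
        verb = "joins/merges" if len(step["inputs"]) > 1 else "transforms"
        lines.append(f"{number}. {step['step']} {verb} {sources} into {targets}")
    return lines


# Hypothetical records matching the A, B, C example above.
flow = describe_flow([
    {"step": "DATA", "inputs": ["A", "B"], "outputs": ["AB"]},
    {"step": "PROC SORT", "inputs": ["AB"], "outputs": ["AB_2"]},
    {"step": "DATA", "inputs": ["AB_2", "C"], "outputs": ["RESULT"]},
])
print("\n".join(flow))
```

Observation counts and run times taken from the log could be appended to each line in the same way, if the records carry them.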

Now you might imagine a similar analysis just from the program code and a more detailed analysis from the log, where observations of the tables and run times are added.

I would find that rather appealing ;-)

Since there does not seem to exist such a rudimentary solution, I will look further into this, if I can find some spare time ;-)

Super User
Posts: 19,851

Re: analyse SAS programs programmatically

Frequent Contributor
Posts: 126

Re: analyse SAS programs programmatically

Hi Reeza,

Thank you for your help. I remember we looked into that feature and I was just looking into it again.

What can I say, for a small to medium sized program I receive the following output:

I can hardly work with that, since it divides the whole program into its different steps. That might be very handy when it comes to analysing ETL or other steps in detail. However, I would rather see a brief overview of the process/program. I believe a textual description would provide a much cleaner and quicker overview, since I find it quite hard to follow all the connecting lines in the eGuide process view.

So instead of dividing the program into its different steps, I am thinking of boiling it down to just the table-names and probably the kind of steps that are performed.

But then again this might just be a matter of me being picky or a matter of personal taste ;-)

Thanks again!

Michael
