Hello,
I need to write SAS code that can read through another SAS program and produce a list of all the variables it uses. Is that possible, and if so, how?
Please suggest, thanks!
Go to lexjansen.com and search for parsers. You should also take a look at PROC SCAPROC.
If your programs have macros this is infinitely more difficult.
Concur with @Reeza. As soon as you use macro elements, resolving those will become a nightmare.
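To illustrate the macro problem, here is a minimal sketch. The macro name make_flags and the variables flag1-flag3 are made up for the example. The point is that the names flag1, flag2 and flag3 never appear in the program text, so a parser that only reads the source would not find them without actually resolving the macro.

```sas
/* Hypothetical macro: generates one assignment per loop iteration */
%macro make_flags(n);
  %local i;
  %do i = 1 %to &n;
    flag&i = (age > &i + 10);
  %end;
%mend make_flags;

data work.flags;
  set sashelp.class;
  /* Resolves to flag1 = ..., flag2 = ..., flag3 = ... at compile time */
  %make_flags(3)
run;
```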
And variables are often "used" in SAS without ever being named in the code. A simple SET statement puts every variable in a dataset into the PDV without any of them being explicitly named. So you would need to resolve the whole contents of the dataset to determine the variables present in the data step.
Not to mention indirect references such as variable lists, especially ones like var: (name prefix), _numeric_, _character_ and _all_.
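A short sketch of those implicit references, using SASHELP.CLASS only as a convenient example; the output dataset names are arbitrary. None of the steps below names Name, Sex, Age, Height or Weight, yet all of them are read.

```sas
/* No variable is named, yet every column of SASHELP.CLASS enters the PDV */
data work.class_copy;
  set sashelp.class;
run;

/* _NUMERIC_ is resolved against the dataset, not the program text */
proc means data=sashelp.class;
  var _numeric_;
run;

/* Name prefix list: keeps every variable whose name starts with H */
data work.h_only;
  set sashelp.class(keep=h:);
run;
```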
Nothing is impossible, but it will take time. Like "see you in 2050".
Tools like that (think SCAPROC) are developed by teams over years.
A partial solution may be to start from the other end: Get variables from the datasets of interest (dictionary.columns or sashelp.vcolumn)
Then search the code files for references to those specific variables. Any comparison should be made in a case-insensitive manner as your programmers are likely to have ThisVarName, thisvarname, Thisvarname, ThisvarName and such in the code.
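A rough sketch of that approach, under some assumptions: the path in &pgm is a placeholder, SASHELP.CLASS stands in for your datasets of interest, and variable names are assumed to be ordinary SAS names (letters, digits, underscores) so they can be dropped into a regular expression unescaped. It will still report false hits in comments, strings, and places where a name is used as a function or option.

```sas
/* Hypothetical path; point this at the program you want to scan */
%let pgm = /path/to/program.sas;

/* 1. Candidate variable names from the datasets of interest */
proc sql;
  create table work.vars as
  select distinct name as varname
  from dictionary.columns
  where libname = 'SASHELP' and memname = 'CLASS';
quit;

/* 2. Read the program text, one observation per source line */
data work.code;
  infile "&pgm" truncover lrecl=32767;
  input line $char32767.;
  lineno = _n_;
run;

/* 3. Pair each name with every line where it appears as a whole word.
      \b is a word boundary and the i suffix makes the match case-insensitive */
proc sql;
  create table work.hits as
  select v.varname, c.lineno, c.line
  from work.vars as v, work.code as c
  where prxmatch(cats('/\b', v.varname, '\b/i'), c.line) > 0
  order by v.varname, c.lineno;
quit;
```

Expect a Cartesian product note in the log from step 3; for a few hundred variables and a few thousand lines of code that should be tolerable.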
Note: I hope you don't have any variables that match SAS-supplied functions or keywords such as DATE, MAX, MIN, MEAN and so forth, as differentiating between a variable, a function, a proc name, and a procedure option in a plain text search is difficult.
Or take a class on compiler construction in a computer science course to understand more of the issues surrounding parsing computer code.
I use max/min/mean all the time 😉
It's going to be an interesting exercise. It's nice in theory to try to determine which variables are being used, but this is basically where human knowledge of processes is important. Assuming that technology can solve a lack of documentation is a flawed idea.
SAS has a procedure that can analyze SAS programs as you run them -- it's called PROC SCAPROC.
This procedure generates a detailed log file with data set names and variable names and attributes -- any data that the program reads and writes is captured.
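A minimal sketch of how that looks in code. The output path is a placeholder and the data step in the middle is just a stand-in for your own program. The ATTR option on the RECORD statement asks for variable names and attributes of the input and output datasets.

```sas
/* Start recording; the file path is a hypothetical placeholder */
proc scaproc;
  record '/path/to/record.txt' attr;
run;

/* The program under analysis runs as usual */
data work.tall;
  set sashelp.class;
  where height > 60;
run;

/* Stop recording and write the file */
proc scaproc;
  write;
run;
```

Note that the program actually has to run for SCAPROC to record anything, so this is an analysis of an execution, not a static parse of the source.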
If you use SAS Enterprise Guide, you can use Program->Analyze->Analyze for Program Flow to see an example of the detailed output. Try it with a simple program, allow it to create a process flow, and then look at the Note item that the process adds to the flow. Here's an excerpt:
Data Sets (Count=2)
  SASHELP.CARS Type=DATA Field count=15 Access=INPUT Mode=SEQUENTIAL Path='C:\Program Files\SASHome\SASFoundation\9.4\core\sashelp'
    Make Type=CHARACTER Length=13
    Model Type=CHARACTER Length=40
    Type Type=CHARACTER Length=8
    Origin Type=CHARACTER Length=6
    DriveTrain Type=CHARACTER Length=5
    MSRP Type=NUMERIC Length=8 Format=DOLLAR8.
    Invoice Type=NUMERIC Length=8 Format=DOLLAR8.
    EngineSize Type=NUMERIC Length=8 Label="Engine Size (L)"
Also check Sample 58047: Parse output from PROC SCAPROC to create a data set with inputs and outputs -- a sample that provides an example program that can parse the output of SCAPROC.
This approach captures all data variables that are read and written, even if your program does not explicitly name them.