03-24-2014 10:54 AM
Could you share your best practices/process for debugging? I'm fairly new to SAS and this would be helpful for me to hear.
Is there a way to ever print a result within a DATA step? My understanding is, no; I'm just wondering how best to visualize what the PDV looks like after any given statement within a DATA step.
03-24-2014 10:56 AM
I'll choose to not comment on best practices for debugging, although I'm sure you'll get a lot of answers.
Yes, of course you can print results within a DATA step. The PUT statement allows you to do this.
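A minimal sketch of what that could look like, assuming a small made-up data set (the names demo, x, and y are illustrative only). Each PUT writes a line to the SAS log as the step executes:

```sas
/* Hypothetical example: the names demo, x, and y are for illustration only */
data demo;
  do x = 1 to 3;
    y = x * 10;
    put "Inside the loop: " x= y=;   /* written to the SAS log on each pass */
    output;
  end;
run;
```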
03-24-2014 11:01 AM
Like Paige I, too, am not going to comment on best practices for debugging, but just expand on the suggestion that Paige offered.
A number of put _all_; statements, carefully placed in your code, will let you see what is going on in the PDV.
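As a rough sketch, assuming the SASHELP.CLASS sample table is available and using a made-up derived variable called bmi, two put _all_; statements placed around an assignment show every variable (including the automatic _N_ and _ERROR_) before and after it:

```sas
/* Sketch only: bmi is a hypothetical derived variable */
data _null_;                      /* _null_: no output data set is written */
  set sashelp.class (obs=3);      /* read just the first three observations */
  put "Before calculation:";
  put _all_;                      /* bmi is still missing at this point */
  bmi = (weight / height**2) * 703;
  put "After calculation:";
  put _all_;                      /* bmi now holds the computed value */
run;
```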
03-24-2014 11:06 AM
Hi Paige and Arthur,
Sincere thanks, still! In using the put statement within a data step, is there a way to subset the dataset if the dataset is too large and/or I want to use it many times?
03-24-2014 11:10 AM
If you want to just check the behaviour of the data step with a few records, look at the (obs=...) data set option.
03-24-2014 11:11 AM
The data set options OBS= (to limit the number of records), FIRSTOBS= (to select the record number at which to start processing), and WHERE= (to select records with specific content) are good starting points.
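A short sketch of these options, assuming the SASHELP.CLASS sample table. Note that OBS= names the last observation to process, so it equals a record count only when FIRSTOBS= is left at its default of 1:

```sas
/* Sketch using SASHELP.CLASS; _null_ means no output data set is created */
data _null_;
  set sashelp.class (firstobs=5 obs=8);    /* process observations 5 through 8 only */
  put _n_= name= age=;
run;

data _null_;
  set sashelp.class (where=(sex = 'F'));   /* process only rows that meet the condition */
  put _n_= name= sex=;
run;
```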
I also recommend not using code like:
until you are real sure you want to replace your start data set.
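A sketch of the kind of step that replaces its own input, next to a safer alternative. This is an assumed illustration, and the names have, want, and score are hypothetical:

```sas
/* Hypothetical data: have, want, and score are illustrative names */
data have;
  input id score;
  datalines;
1 10
2 -5
3 7
;
run;

/* Safer while testing: write to a new data set and leave have untouched */
data want;
  set have;
  if score < 0 then delete;
run;

/* Risky: reads have and writes have, so the original rows are gone afterwards */
data have;
  set have;
  if score < 0 then delete;
run;
```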
03-24-2014 11:13 AM
The statement below will limit your processing. This allows you to test your code with a subset of your data. How big is your data that you feel the need to subset?
My comments regarding best practice for coding in general:
1. What works for you.
2. Just because there is no error in your code doesn't mean it's correct.
3. Check all your recodings by running a proc freq afterwards.
4. Develop iteratively, step by step, but NOT by using multiple DATA steps; change the same one as much as possible unless you can't follow what's going on.
5. Pseudo code your code before you type. If you don't understand it in your head, your fingers won't.
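For point 3, one possible way to check a recode, assuming the SASHELP.CLASS sample table and a made-up grouping variable agegrp, is to cross-tabulate the original values against the recoded ones:

```sas
/* Sketch: agegrp is a hypothetical recode of age */
data recoded;
  set sashelp.class;
  length agegrp $8;
  if age < 13 then agegrp = 'under13';
  else agegrp = '13plus';
run;

/* Each original age should map to exactly one agegrp value */
proc freq data=recoded;
  tables age*agegrp / list missing;
run;
```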
03-24-2014 12:00 PM
Thank you very, very much for the guidance on debugging!! By subsetting, I don't mean that the dataset is too large for SAS; it's only about 3000-some obs long, but I may want to look at the data for only a portion of the records.
03-24-2014 01:51 PM
Put is a statement that can be executed conditionally.
If var = <value> then put ....;
If you want to really bring attention to a range of values you can provide some additional text and SAS will highlight it for you.
Suppose my variable X shouldn't exceed 9 but my data entry staff is clumsy:
if x > 9 then put "WARNING: The value for variable " x= "for record: " _n_;
Lines that begin with WARNING:, NOTE:, or ERROR: may each appear differently in your log, depending on the settings in your current SAS theme in preferences.
03-24-2014 11:13 AM
Although with very large datasets, put _all_; may write a huge amount of information to the log.
With regards to debugging in general, I think the "best practices" depends heavily on the type of bug you have encountered, which is why I hesitate to recommend global best practices ... unless you mean something like when starting to write code, planning ahead to make debugging easy ... and in my mind I think the best way to do that is to write clean and clear code, which takes practice and experience ...
Testing code to see if it gives the answer you want is not EXACTLY the same thing as debugging, although there is some overlap between the two.
03-24-2014 11:05 AM
One thing to know is RUN CANCEL; which will not execute your data step code but will do a syntax check.
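A minimal sketch, assuming the SASHELP.CLASS sample table and made-up names test and ratio. The step is compiled and syntax-checked, but no data set is created, and the log should note that the step was not executed:

```sas
/* Sketch: test and ratio are hypothetical names */
data test;
  set sashelp.class;
  ratio = weight / height;
run cancel;   /* syntax check only; the step does not execute */
```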
Second: Check the LOG and if there are any errors start at the first one. Many types of errors will cause others further in the program. It is amazing how many you can generate with a single missing semicolon ...
When you start getting lured into SAS macro programming, make sure you have base code that works before trying to use macros.
For debugging you may find the Put var= syntax more helpful than a bare put, especially if you have multiple Put statements.
I have done things like:
Put "Before Location X in code" var1= var2= ;
Put "After Location X in code" var1= var2= ;
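Put into a complete step, that pattern might look like the sketch below. It assumes the SASHELP.CLASS sample table, and the unit conversion stands in for whatever code is under test:

```sas
/* Sketch: the conversion is a hypothetical stand-in for the code being debugged */
data _null_;
  set sashelp.class (obs=2);
  put "Before Location X in code " height= weight=;
  height = height * 2.54;          /* inches to centimeters */
  put "After Location X in code  " height= weight=;
run;
```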