04-04-2017 11:14 AM
The database that I am working with records lab values as character variables because lab techs sometimes write notes/comments instead of entering a numeric lab result.
I have been converting this variable to numeric by adding a 0 to it (i.e. character_var+0). This seems to do the trick, but I am wondering if there are any downsides to doing this. I am only interested in the numeric measurements, so other than ending up with missing values where there were notes/comments, is there anything I should be worried about?
04-04-2017 11:21 AM
This is known as implicit type conversion: SAS automatically converts the character value to numeric, and you'll likely see a note in the log saying that this is happening. It is normally considered bad programming practice and should be avoided. The normal character-to-numeric conversion in SAS is done through the INPUT function. If you know the format of the numeric lab values, you can use INPUT(lab_value, informat.). If the length is somewhat variable (sometimes 4 characters, sometimes 13, etc.) you can use the BEST. informat with a width large enough to cover the longest value (for example, BEST32.). References for the INPUT function and the BEST informat are available in SAS documentation if you want more background.
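To make the difference concrete, here is a minimal sketch that runs both conversions side by side. The dataset name labs, the variable lab_char, and the sample values are made up for illustration; they are not from the original poster's data. The implicit version should add a "character values have been converted to numeric" note to the log, while the INPUT version should not. Both are still expected to complain about the non-numeric row and leave it missing.

```sas
/* Hypothetical sample data: names and values are assumptions */
data labs;
    infile datalines truncover;
    input lab_char $20.;
    datalines;
4.5
13.25
hemolyzed
;
run;

data labs_num;
    set labs;
    /* Implicit conversion: SAS guesses, and notes it in the log */
    implicit_num = lab_char + 0;
    /* Explicit conversion: you choose the informat and its width */
    explicit_num = input(lab_char, best32.);
run;
```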
04-04-2017 11:31 AM
04-04-2017 11:43 AM - edited 04-04-2017 11:47 AM
Unfortunately that is not an option; I have no way of inputting the values (or am just unaware of how it could be done).
The recommendation was not that you input the values. The recommendation was that you use the SAS function INPUT instead of adding zero to the value.
04-04-2017 11:31 AM
Some organizations with strict code management policies require "clean" log results, meaning no errors, no warnings, and sometimes not even notes. Implicit conversion writes a note to the log, so it may violate such a policy.
Personally, if that were my data, I would probably address this sort of issue at the data read step: either create two variables, if the notes are needed later, or read the field as numeric to begin with and suppress the resulting "invalid data" messages that are going to ensue.
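One way the two-variable idea might look at the read step is sketched below. The inline values and the variable names charval, numval, and labnote are assumptions for illustration; a real program would read from the actual source file. The ?? modifier keeps the log quiet when a value does not convert.

```sas
/* Hypothetical raw values, read inline for illustration */
data labdata;
    infile datalines truncover;
    length labnote $20;
    input charval $20.;
    /* ?? suppresses the invalid-data notes and keeps _ERROR_ at 0 */
    numval = input(charval, ?? 20.);
    /* Keep the original text only when it did not convert */
    if missing(numval) then labnote = charval;
    datalines;
7.2
sample clotted
140
;
run;
```

This keeps the numeric result and the tech's comment in separate variables, so neither is lost.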
04-04-2017 11:46 AM
I can only agree with the other posters. This uses implicit conversion (i.e. you're not specifying it, you're letting the system guess it). Always a bad technique. Always make sure you - the person closest to it - are in complete control. Use the input() function.
04-04-2017 12:02 PM - edited 04-04-2017 12:04 PM
While I generally agree, here are some tools to help cope with the situation.
numval = input(charval, ??20.);
This will convert the existing values to their numeric equivalent, if possible. The ?? modifier suppresses the log messages about invalid data when the original set of characters is not numeric.
For cleaning the data, or perhaps being more rigorous about what can be converted and what can't, you could try:
proc freq data=labdata;
where charval > ' ' and input(charval, ??20.) = .;
tables charval;
run;
This will give you a table of all the values that can't be converted, so you can inspect them and see if there is something you might be able to do with them.