Hello, I have been asked to do a task I have never done before and, to be honest, I don't know how to handle it.
Here's the challenge.
A SAS dataset is provided along with an XSD file that defines, for each variable, its type (num, string, date), the number of values we can tolerate (e.g. nb_value = 0 means a missing value is acceptable for this variable, nb_value = 1 means no missing values are acceptable), the min and max values the variable can take, and so on.
The idea is to use those criteria to validate the SAS dataset. If the dataset is validated, then we can go through the transformation process.
Otherwise, an error report should be produced (a SAS dataset containing, for all the variables, the observations that deviate from the criteria).
As I have never done this before, I am open to any suggestions on how we could carry out this task.
Without seeing the XSD file I'm not sure if there is any quick and easy way to do this. If the structure is clean enough you might be able to build a number of custom formats to display valid/invalid messages. Here's a brief example:
proc format library=work;
   value $validsex
      'M','F' = 'Valid'
      '',' '  = 'Unexpected Missing'
      other   = 'Unexpected value'
   ;
run;

data example;
   set sashelp.class;
   if name in ('Alfred' 'Judy') then sex='';
   if name in ('Louise' 'Mary' 'Philip') then sex='A';
run;

proc freq data=example;
   tables sex / missing;
   format sex $validsex.;
run;
However, if the value of one variable relies on another then you will have more coding, as you would have to take that into account. That can turn into a lot of code, depending on how complex the relationships are.
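If the criteria from the XSD can be loaded into a dataset, the numeric checks could be driven from that dataset instead of being hard-coded. What follows is only a rough sketch under several assumptions: the rules table is typed in by hand here rather than read from the XSD, the names rules, minval, maxval and error_report and the limits themselves are made up for illustration, and only numeric variables are handled. The cross-variable rule at the end is also invented, just to show where such logic would have to go.

```sas
/* Hypothetical rules table standing in for what the XSD would define. */
data rules;
   length varname $32;
   input varname $ nb_value minval maxval;
   datalines;
AGE 1 11 15
HEIGHT 1 50 70
WEIGHT 0 60 140
;
run;

/* Test data with one missing value introduced on purpose. */
data example;
   set sashelp.class;
   if name = 'Alfred' then age = .;
run;

/* One output row per observation/variable that deviates from its rule. */
data error_report(keep=name varname value problem);
   length problem $20;
   if 0 then set rules;   /* defines varname, nb_value, minval, maxval */
   if _n_ = 1 then do;
      declare hash r(dataset: 'rules');
      r.defineKey('varname');
      r.defineData('nb_value', 'minval', 'maxval');
      r.defineDone();
   end;
   set example;
   array v{*} age height weight;
   do i = 1 to dim(v);
      varname = upcase(vname(v{i}));
      if r.find() = 0 then do;          /* a rule exists for this variable */
         value = v{i};
         if missing(value) then do;
            if nb_value >= 1 then do;
               problem = 'Unexpected missing';
               output;
            end;
         end;
         else if value < minval or value > maxval then do;
            problem = 'Out of range';
            output;
         end;
      end;
   end;
   /* Made-up cross-variable rule: these still have to be coded by hand. */
   if not missing(age) and age < 12 and weight > 120 then do;
      varname = 'AGE*WEIGHT';
      value = .;
      problem = 'Inconsistent pair';
      output;
   end;
run;

/* Count the deviations; zero would mean the dataset passed these checks. */
proc sql noprint;
   select count(*) into :n_errors trimmed from error_report;
quit;
%put NOTE: &n_errors deviation(s) found.;
```

The transformation step could then be run only when n_errors is 0. Character and date variables would need their own checks, for example with formats like the one above.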