
Data quality indicators report

Contributor
Posts: 34


Hi Guys,

 

I would like to produce a data quality report that includes the following indicators:

var mean std min max N Q1 median Q3 IQ_Range n_low n_low_percent n_high n_high_percent n_far_low n_far_low_percent n_far_high n_far_high_percent null_rate missing missing_percent

 

Could anyone show me how to produce these with PROC PRINT, PROC FREQ, or another suitable SAS procedure?

Many thanks.

Super User
Posts: 7,392

Re: Data quality indicators report

Contributor
Posts: 34

Re: Data quality indicators report

Thank you. I got most of them, but could not get the percentage ones. Could you help me?

Super User
Posts: 7,392

Re: Data quality indicators report

Without some test data (in the form of a data step) I can only give general advice. A percentage is just a count divided by N, so you can calculate these in a data step. You may need to run PROC FREQ on your data to get the counts and then merge them on.
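
As a minimal sketch of the "count divided by N" idea, the data step below counts rows and missing values in a single pass and computes the percentage on the last row. The dataset name `have` and the variable `x` are hypothetical, and using all rows as the denominator is an assumption.

```sas
/* Hypothetical names: dataset HAVE, numeric variable X. */
data pct (keep=total n_missing missing_percent);
  set have end=last;
  total + 1;                 /* sum statement: running row count, retained */
  n_missing + missing(x);    /* MISSING() returns 1 when X is missing, else 0 */
  if last then do;
    /* Assumed denominator: all rows, missing or not. */
    missing_percent = 100 * n_missing / total;
    output;                  /* write one summary row */
  end;
run;
```

The same pattern works for any count: replace `missing(x)` with a condition that evaluates to 1 or 0.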

Contributor
Posts: 34

Re: Data quality indicators report

Hi, my dataset is like this:

ID  OPENTIME  CLOSETIME  GENDER  GRADE  LOANS  FLAG
1   98        121        F       A      1200   Y
2   95        115        M       B      1300   Y
3   96        114        M       C      1500   N
4   99        120        F       D      1600   Y
5   98        107        F       E      1700   N

 

The following is the code I use:

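/* Summary statistics for every numeric variable */
proc means data=table n mean min max std q1 q3 qrange median nmiss;
  var _numeric_;
run;

/* Frequency counts and percentages for every character variable */
proc freq data=table;
  tables _character_;
run;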

 

But the results for the numeric variables do not include the percentages. Could you help me? I want to get all of those results in one report.

Super User
Posts: 7,392

Re: Data quality indicators report

Sorry, I don't have time to write a whole report for you. Use those procedures, then merge the required data together, and use a data step to calculate any further numbers you need.
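
One possible way to follow that advice, sketched under some assumptions: capture the PROC MEANS results as a dataset with one row per variable, then add the percentage in a data step. The dataset `table` is the one from the thread; `num_stats`, `num_report`, `total` and `missing_percent` are illustrative names. The column names `N` and `NMiss` are what the ODS `Summary` table is expected to contain with `STACKODSOUTPUT` (SAS 9.3 or later); it is worth confirming them with PROC CONTENTS before relying on them.

```sas
/* Step 1: one row per numeric variable, captured through ODS. */
proc means data=table stackodsoutput
           n nmiss mean std min max q1 median q3 qrange;
  var _numeric_;
  ods output summary=num_stats;
run;

/* Step 2: derive the percentage column.                      */
/* Assumption: the denominator is non-missing + missing rows. */
data num_report;
  set num_stats;
  total = n + nmiss;
  if total > 0 then missing_percent = 100 * nmiss / total;
run;

/* Step 3: print everything as a single report. */
proc print data=num_report noobs;
run;
```

With the stacked layout, each statistic is already a column, so no merge is needed for these indicators; a merge by variable name would only come in when adding counts computed elsewhere.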

Super User
Posts: 10,483

Re: Data quality indicators report

One thing to consider for percentages is which numerator and denominator to use. I don't believe you have specified that clearly enough. The likely approach is to create the appropriate sums in PROC MEANS/SUMMARY and then calculate the percentages in a data step.
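
To make the numerator and denominator question concrete, here is a sketch for one variable (`LOANS` from the sample data). The thread never defines the "low", "high" and "far" indicators, so the definitions here are assumptions: "low" and "high" are values beyond Q1 - 1.5*IQR and Q3 + 1.5*IQR, "far" uses 3*IQR, the "far" counts are included in the "low"/"high" counts, and the denominator is the number of non-missing values. The sketch also does the counting in a data step rather than with sums in PROC MEANS.

```sas
/* Step 1: quartiles and counts for LOANS. */
proc means data=table noprint;
  var loans;
  output out=q n=n nmiss=missing q1=q1 q3=q3 qrange=iqr;
run;

/* Step 2: count values outside the assumed fences, then divide. */
data outlier_pct (keep=n missing n_low n_high n_far_low n_far_high
                       n_low_percent n_high_percent
                       n_far_low_percent n_far_high_percent);
  if _n_ = 1 then set q;     /* statistics are retained for every row */
  set table end=last;
  if not missing(loans) then do;
    /* Each comparison evaluates to 1 or 0 and is added to a running count. */
    n_low      + (loans < q1 - 1.5 * iqr);
    n_high     + (loans > q3 + 1.5 * iqr);
    n_far_low  + (loans < q1 - 3 * iqr);
    n_far_high + (loans > q3 + 3 * iqr);
  end;
  if last then do;
    if n > 0 then do;        /* assumed denominator: non-missing N */
      n_low_percent      = 100 * n_low      / n;
      n_high_percent     = 100 * n_high     / n;
      n_far_low_percent  = 100 * n_far_low  / n;
      n_far_high_percent = 100 * n_far_high / n;
    end;
    output;
  end;
run;
```

If the percentages should be relative to all rows instead, divide by `n + missing`.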

 

Or perhaps PROC REPORT or PROC TABULATE, run directly on the data, will allow the percentage calculations within the report itself.
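
A minimal PROC TABULATE sketch of that idea, using the `GENDER` variable from the sample data: `N` gives the count per category and `PCTN` the percentage of the table total. The `MISSING` option is included on the assumption that missing categories should be counted in the denominator.

```sas
proc tabulate data=table missing;
  class gender;
  /* Rows: each GENDER value plus a total; columns: count and percent. */
  table gender all, n pctn;
run;
```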
