I had a problem before, and I got a solution to that. The solution was working fine, and now there is some more problem.
Note: I am opening a new thread on the old with a version number, as I marked the last problem as 'solved' .
Original Problem:
I have some 40 variables, on which I have to perform 10 different data quality (Null, zero value, length, validity of date..) checks. And I am generating one table for each 1Var * 1 DQ check, resulting in 100 tables. Each table will hold just a control total, i.e. count of distinct value for a variable. Proc sql; create table DQ_Check_Zero_Var1 as select count(distinct Var1) as Count_Zero from Source_table; quit; Now, I would like to create a master table in this format. Each cell will hold the only value from the corresponding table. Stat Var1 Var2 Var3 Var4 Var5 ... Check_Zero (number) (number) (number) (number) (number)
Check_Null
.
.
.
.
To this, @PGStats has provided an appropriate solution. Below is the entire solution he had provided and the original link to the post.
(Link - https://communities.sas.com/t5/General-SAS-Programming/Building-master-table-with-input-from-multiple-tables/m-p/229597#M33708)
Solution:
I assumed that your datasets have names like DQ_Check_Stat_Var where Stat is the name of the statistic and Var is the name of the variable Proc sql; create table DQ_Check_N_height as select count(height) as Count_Not_Null from SasHelp.class; quit; Proc sql; create table DQ_Check_N_weight as select count(weight) as Count_Not_Null from SasHelp.class; quit; Proc sql; create table DQ_Check_Range_height as select range(height) as Range from SasHelp.class; quit; Proc sql; create table DQ_Check_Range_weight as select range(weight) as Range from SasHelp.class; quit; data all; set DQ_Check_: INDSNAME=ds; length var Stat $32; var = propcase(scan(ds, -1, "_")); stat = propcase(scan(ds, -2, "_")); value = coalesce(of _numeric_); keep var stat value; run; proc sort data=all; by stat var; run; proc transpose data=all out=want( drop=_: ); by stat; id var; var value; run; proc print data=want noobs; run;
New Problem and Error:
My code looks like:
Data DQ_Stat_Var(n)_Pre;
Set Source_Table;
<conditons...>
run;
(Note: Since variable names are quite long, to keep track which var I am working on, I have taken first 6 characters from variable names and attached each with a sequence number generated with _n_ option. There is no underscore between Var and number. I also made sure that no generated dataset names are more 32 characters)
Proc sql; create table DQ_Stat_Var(n) as select count(distinct Var) as Cnt from DQ_Stat_Var(n)_Pre; quit;
After all the datasets are generated, I use this proposed code:
data all; set DQ_Check_: INDSNAME=ds; length var Stat $32; var = propcase(scan(ds, -1, "_")); stat = propcase(scan(ds, -2, "_")); value = coalesce(of _numeric_); keep var stat value; run;
And now I get this error:
Data All;
Set libname.DQ_:INDSNAME=ds;
ERROR: Variable BRANCH_NUMBER has been defined as both character and numeric.
ERROR: Variable BRANCH_NUMBER has been defined as both character and numeric.
ERROR: Variable BRANCH_NUMBER has been defined as both character and numeric.
length var Stat $32; var = propcase(scan(ds, -1, "_")); stat = propcase(scan(ds, -2, "_")); value = coalesce(of _numeric_); keep var stat value; run;
Note: I do not have any variable called Branch_Number.
Thank you all in advance.
... View more