I had a problem before, and I got a solution to that. The solution was working fine, and now there is some more problem.
Note: I am opening a new thread on the old with a version number, as I marked the last problem as 'solved' .
Original Problem:
I have some 40 variables, on which I have to perform 10 different data quality (Null, zero value, length, validity of date..) checks. And I am generating one table for each  1Var * 1 DQ check, resulting in 100 tables. Each table will hold just a control total, i.e. count of distinct value for a variable.
Proc sql;
create table DQ_Check_Zero_Var1 as
select count(distinct Var1) as Count_Zero from Source_table;
quit;
Now, I would like to create a master table in this format. Each cell will hold the only value from the corresponding table.
    Stat                     Var1               Var2               Var3             Var4               Var5              ...
 Check_Zero         (number)        (number)       (number)     (number)      (number)
Check_Null
.
.
.
.
To this, @PGStats has provided an appropriate solution. Below is the entire solution he had provided and the original link to the post.
Solution:
I assumed that your datasets have names like DQ_Check_Stat_Var where Stat is the name of the statistic and Var is the name of the variable
 
Proc sql;
create table DQ_Check_N_height as 
select count(height) as Count_Not_Null from SasHelp.class;
quit;
Proc sql;
create table DQ_Check_N_weight as 
select count(weight) as Count_Not_Null from SasHelp.class;
quit;
Proc sql;
create table DQ_Check_Range_height as 
select range(height) as Range from SasHelp.class;
quit;
Proc sql;
create table DQ_Check_Range_weight as 
select range(weight) as Range from SasHelp.class;
quit;
data all;
set DQ_Check_: INDSNAME=ds;
length var Stat $32;
var = propcase(scan(ds, -1, "_"));
stat = propcase(scan(ds, -2, "_"));
value = coalesce(of _numeric_);
keep var stat value;
run;
proc sort data=all; by stat var; run;
proc transpose data=all out=want( drop=_: );
by stat;
id var;
var value;
run;
proc print data=want noobs; run;
New Problem and Error:
My code looks like:
Data DQ_Stat_Var(n)_Pre;
Set Source_Table;
<conditons...>
run;
(Note: Since variable names are quite long, to keep track which var I am working on, I have taken first 6 characters from variable names and attached each with a sequence number generated with _n_ option. There is no underscore between Var and number. I also made sure that no generated dataset names are more 32 characters)
Proc sql;
create table DQ_Stat_Var(n) as
select count(distinct Var) as Cnt from DQ_Stat_Var(n)_Pre;
quit;
After all the datasets are generated, I use this proposed code:
    data all;
       set DQ_Check_: INDSNAME=ds;
       length var Stat $32;
       var = propcase(scan(ds, -1, "_"));
       stat = propcase(scan(ds, -2, "_"));
       value = coalesce(of _numeric_);
       keep var stat value;
   run;
And now I get this error:
Data All;
Set libname.DQ_:INDSNAME=ds;
ERROR: Variable BRANCH_NUMBER has been defined as both character and numeric.
ERROR: Variable BRANCH_NUMBER has been defined as both character and numeric.
ERROR: Variable BRANCH_NUMBER has been defined as both character and numeric.
  length var Stat $32;
       var = propcase(scan(ds, -1, "_"));
       stat = propcase(scan(ds, -2, "_"));
       value = coalesce(of _numeric_);
       keep var stat value;
   run;
Note: I do not have any variable called Branch_Number.
Thank you all in advance.
Agree with @LinusH
Suggest you execute your code up to the data step where you get the Error and then run the following:
proc sql;
  create table find_it as
  select *
  from dictionary.columns
  where 
    libname='<libref>' 
    and memname like 'DQ^_%' escape '^'
    and upcase(name)='BRANCH_NUMBER '
  ;
quit;
  Agree with @LinusH
Suggest you execute your code up to the data step where you get the Error and then run the following:
proc sql;
  create table find_it as
  select *
  from dictionary.columns
  where 
    libname='<libref>' 
    and memname like 'DQ^_%' escape '^'
    and upcase(name)='BRANCH_NUMBER '
  ;
quit;
  Thank you @Patrick and @LinusH
Using your code, I have found the variable BRANCH_NUMBER. It is in two source tables in two different data types, but it is not in the list of variables I am pulling in for data quality check.
Now, I have just two types of output dataset name, i.e. DQ_Stat_Var_Pre and DQ_Stat_Var.
Data DQ_Stat_Var_Pre;
Set Source_Table;
<conditions>;
run;
Proc Sql Noprint;
Create table DQ_Stat_Var as
Select Count(Distinct Var) As Cnt from DQ_Stat_Var_Pre;
Quit;
The variable BRANH_NUMBER only appears in some of the output _Pre tables. Previously when the code was working without error with a smaller subset of variables, I was dropping records when Stat='Pre'. Now, I cannot even run this commented piece of code, as I get the same error.
Data All;
Set libname.DQ_:INDSNAME=ds;
/*
Length var stat $32;
var=propcase(scan(ds, -1, "_"));
stat=propcase(scan(ds, -1, "_"));
value=coalesce(of _numeric_);
keep var stat value;
*/
run;
I can certainly drop the variable BRANCH_NUMBER while pulling data from the source tables. The entire code runs for quite some time, and I have to start from the begining. Is there another way to counter the error with a similar type of INDSNAME operation?
Thank you for your valuable input.
If BRANCH_NUMBER is not required, I would try
Set libname.DQ_: (drop=branch_number) INDSNAME=ds;
Thank you @PGStats.
This is what happens now:
Data All;
Set libname.DQ_: (drop=branch_number) INDSNAME=ds;
ERROR: The variable BRANCH_NUMBER in the DROP, KEEP or RENAME list has never been referenced.
<rest of the code>
And the original error remains when I remove the Drop option.
If I get it right, you don't want to read in the _pre datasets when creating dataset all. So, create your summary datasets with a different prefix and read only those when creating all.
Citing @LinusH
"I think that you now are paying the price for working with wide data - it drives syntax more than narrow/long table designs."
It can happen that we take a wrong turn and it's sometimes worth to just re-start instead of trying to continue on the path we've taken. Normally when you re-think the problem with an open mind and all the experience already gathered easier solution arise.
I feel you're at the point where you should re-think and re-start.
I haven't understood in full yet what you're trying to achieve. I believe it would help if you could describe as clear as possible what you have and what you need - and why.
Please provide some sample data (a data step creating such data) for the have and then describe what you need (eventually also providing some data for illustration purposes). Let us also know the source data volumes you're dealing with.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
