Hi,
Data-warehouse contains small to very large datasets with over 225 different variables in each data sets. Business requirement says there is only one type of error is expected and that is there would be a missing or blank variable value of any variable in a row, or in many rows, and if found the entire row/rows must be deleted. The catch is the blank could occur in any of the 225 variables and in any row,
is there a function or logic in datstep to do it or what do you guys think about Proc Datasets?
It's an appending table monthly, quarterly and yearly, therefore the preference is to use automated simple solutions.
And if you have a mixture of numeric and string values combine tge cmiss with Nmiss something like
Append base= collection data= new_rows(
Where=( NOT sum( 0, cmiss( of _character_), nmiss( of _numeric_) ) ) ) ;
Run ;
In a data step, you could use the CMISS function, to find a row with any missing values.
Then, delete the row and write the table back to the data warehouse
So the way I read this was that if ANY of the rows in your dataset have 1 blank value at all, then drop the entire row. Is that correct, or did you mean the entire row has to be blank in order for it to be dropped?
I guess I wil just give you examples of both ways to be safe...
First: If any columns have a blank variable, then drop that entire row.
data new;
set old;
array varsyay(*) firstcolumn--lastcolumn;
if cmiss(of cvars(*))=0;
run;
Now drop the row ONLY if the entire row is blank.
data new;
set old;
array varsyay(*) firstcolumn--lastcolumn;
if not(cmiss(of varsyay(*)) eq dim(varsyay(*));
run;
And if you have a mixture of numeric and string values combine tge cmiss with Nmiss something like
Append base= collection data= new_rows(
Where=( NOT sum( 0, cmiss( of _character_), nmiss( of _numeric_) ) ) ) ;
Run ;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.