Hello, I have a table with about 15M observations and I want to make sure that one variable doesn't have too many missing values. I would like the code to stop once it reaches a threshold of about 5k missing values. How do I do that? Right now the process takes about 5.5 minutes if it reads all the observations. Thank you!
proc sql noprint stimer;
SELECT
count(*)
FROM
have(keep= variable)
WHERE
variable is null
;
quit;
SQL doesn't operate that way. You need a data step and a counter with a STOP statement to end the data step early.
In your case you would count missing values; the example here shows the same pattern with a running sum instead.
It may be easier to just run PROC FREQ with a missing format and see the total missing in the output. 15 million rows won't take that long to process.
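data demo;
    set sashelp.class;
    retain sum_age;                 /* keep the running total across rows */
    sum_age = sum(age, sum_age);    /* SUM() ignores the initial missing value */
    if sum_age > 100 then stop;     /* end the data step once the threshold is passed */
run;

proc print data=demo;
run;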
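Applied to the original question, the same counter-plus-STOP pattern might look like the sketch below. It assumes the data set is called have and the column is called variable, as in the question; the macro variable names threshold, n_missing and hit_threshold are made up for illustration.

```sas
%let threshold = 5000;      /* assumed cutoff, taken from the question */
%let n_missing = 0;         /* defaults in case HAVE is empty */
%let hit_threshold = 0;

data _null_;                /* no output data set is needed */
    set have(keep=variable) end=last;

    /* Sum statement: n_missing is retained automatically and starts at 0. */
    /* MISSING() returns 1 for a missing numeric or character value.      */
    n_missing + missing(variable);

    if n_missing >= &threshold or last then do;
        call symputx('n_missing', n_missing);
        call symputx('hit_threshold', n_missing >= &threshold);
        stop;               /* stop reading as soon as the cutoff is reached */
    end;
run;

%put NOTE: n_missing=&n_missing hit_threshold=&hit_threshold;
```

If hit_threshold is 1, the step stopped early and n_missing is only a lower bound; if it is 0, the whole table was read and n_missing is the exact count. How much time this saves depends on how early in the table the missing values occur.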
Here's an example of how to do the proc freq on the full data set at once and see the numbers in a formatted table:
https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111
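For a single variable, the PROC FREQ approach could be sketched roughly as follows. This is not the code from the linked gist, just a minimal illustration; the format names nmiss and $cmiss are hypothetical, and have and variable are the names used in the question.

```sas
proc format;
    /* Numeric: the plain missing value (.) is labelled 'Missing'.      */
    /* Special missing values such as .A would fall under OTHER here.   */
    value nmiss  .   = 'Missing'
                 other = 'Not missing';
    /* Character: a blank value is labelled 'Missing'. */
    value $cmiss ' ' = 'Missing'
                 other = 'Not missing';
run;

proc freq data=have;
    /* MISSING makes the missing level count as a regular category. */
    tables variable / missing nocum;
    format variable nmiss.;     /* use $cmiss. if variable is character */
run;
```

The output is a two-row table with the count and percentage of missing and non-missing values, which can then be compared against the 5k threshold.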