Hello, I have a table with about 15M observations and I want to make sure that one variable doesn't have too many missing values. I would like the code to stop once it reaches a threshold of about 5k missing values. How do I do that? Right now the process takes about 5.5 minutes if it reads all the observations. Thank you!
proc sql noprint stimer;
SELECT
count(*)
FROM
have(keep= variable)
WHERE
variable is null
;
quit;
SQL doesn't operate that way. You need a data step and a counter with a STOP statement to end the data step.
It may be easier to just run PROC FREQ with a missing format and see the total missing in the output. 15 million rows won't take that long to process.
For your case you would count missing values; the example below shows the same pattern with a running sum of AGE instead.
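data demo;
set sashelp.class;
/* keep the running total across observations */
retain sum_age;
sum_age = sum(age, sum_age);
/* end the data step as soon as the total passes 100; this row is not written */
if sum_age > 100 then stop;
run;
proc print data=demo;
run;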
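Adapted to the original question, the same counter-plus-STOP pattern might look like the sketch below. HAVE and VARIABLE are the names from the question. The macro variables threshold, n_missing and exceeded are made up for illustration, and the 5000 limit is just the figure mentioned in the post.

```sas
/* Sketch only: stop reading HAVE once the missing count passes the limit. */
%let threshold = 5000;   /* assumed limit, taken from the question */
%let n_missing = 0;      /* defaults in case HAVE has no rows */
%let exceeded  = 0;

data _null_;
    set have(keep=variable) end=last;
    /* sum statement: n_missing is retained and starts at 0 */
    n_missing + missing(variable);
    if n_missing > &threshold or last then do;
        /* save the count and a 0/1 flag, then end the step early */
        call symputx('n_missing', n_missing);
        call symputx('exceeded', n_missing > &threshold);
        stop;
    end;
run;

%put NOTE: n_missing=&n_missing exceeded=&exceeded;
```

If the threshold is never reached, the step still reads every row, so the saving only shows up when the missing values appear early enough in the table.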
Here's an example of how to do the proc freq on the full data set at once and see the numbers in a formatted table:
https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111
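For reference, a minimal sketch of the PROC FREQ idea with a missing format could look like this. It is not copied from the linked gist, the format names nmissfmt and cmissfmt are made up, and it assumes VARIABLE in HAVE is numeric.

```sas
/* Sketch only: collapse values into Missing / Not missing and count them. */
proc format;
    /* numeric version: special missing values (.A-.Z, ._) would fall into OTHER */
    value nmissfmt  .   = 'Missing'
                  other = 'Not missing';
    /* character version, in case VARIABLE is a character column */
    value $cmissfmt ' ' = 'Missing'
                  other = 'Not missing';
run;

proc freq data=have;
    /* MISSING keeps the missing level in the table and in the percentages */
    tables variable / missing nocum;
    format variable nmissfmt.;   /* use $cmissfmt. for a character variable */
run;
```

This reads the whole table once, so it gives the exact missing count rather than stopping at a threshold.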