Finding empty values and stopping after finding more than the threshold

Contributor
Posts: 26
Hello, I have a table with about 15M observations and I want to make sure that one variable doesn't have too many missing values. I would like the code to stop once it has found a threshold of about 5,000 missing values. How do I do that? Right now the process takes about 5.5 minutes if it reads all the observations. Thank you!

 

proc sql noprint stimer;
	SELECT
		count(*)
	FROM
		have(keep= variable)
	WHERE
		variable is null
	;
quit;

Accepted Solutions
Solution
‎11-28-2017 02:07 PM
Super User
Posts: 23,306

Re: Finding empty values and stopping after finding more than the threshold

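SQL doesn't operate that way. You need a DATA step with a counter and a STOP statement to end the step once the threshold is reached.

It may be easier to just run PROC FREQ with a format that groups values into missing and non-missing, and read the total missing from the output. 15 million rows won't take that long to process.

If you do want to stop early: in your case you would count missing values, but the example here keeps a running sum instead. The pattern is the same.
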
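data demo;
    set sashelp.class;
    retain sum_age;                 /* keep the running total across rows */
    sum_age = sum(age, sum_age);    /* SUM() ignores the initial missing value */

    /* STOP ends the step before the current row is written to DEMO. */
    if sum_age > 100 then stop;
run;

proc print data=demo;
run;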
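
Adapted to the original question, the same pattern might look like the sketch below. It assumes the data set is called have and the variable is called variable, as in the question, with a threshold of 5000. The counter name n_missing is made up for illustration.

```sas
data _null_;
    set have(keep=variable) end=last;

    /* Sum statement: n_missing is retained automatically and starts at 0. */
    /* MISSING() returns 1 for a missing value and 0 otherwise.            */
    n_missing + missing(variable);

    /* Stop reading as soon as the (assumed) threshold is exceeded. */
    if n_missing > 5000 then do;
        put 'NOTE: threshold exceeded after ' _n_ 'observations.';
        stop;
    end;

    /* Only reached when the whole table was read without exceeding it. */
    if last then put 'NOTE: ' n_missing 'missing values found in total.';
run;
```

Because the step writes to _NULL_, it only reads the data and reports to the log. How much time this saves depends on how early the missing values appear in the table.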

Here's an example of how to run PROC FREQ on the full data set in one pass and see the counts in a formatted table:

https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111
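
For readers who cannot open the link, the general idea can be sketched as follows. This is not the code from the gist, only a minimal illustration; the format names nmiss and $cmiss are hypothetical, and have and variable are taken from the question.

```sas
proc format;
    /* Hypothetical formats that collapse every value into two groups.   */
    /* Special numeric missing values (.A to .Z, ._) are not covered and */
    /* would need their own range.                                       */
    value nmiss  .   = 'Missing' other = 'Not missing';
    value $cmiss ' ' = 'Missing' other = 'Not missing';
run;

proc freq data=have;
    /* MISSING keeps the missing group in the table and its percentages. */
    tables variable / missing;
    format variable nmiss.;    /* use $cmiss. if VARIABLE is character */
run;
```

PROC FREQ groups by formatted value, so the output has at most two rows: the count of missing values and the count of everything else.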


All Replies

Regular Contributor
Posts: 202

Re: Finding empty values and stopping after finding more than the threshold

You have to provide more details on the code that should be stopped after too many missing values have been found. As far as I know, you can't do this in PROC SQL or any other procedure. In a DATA step, just retain a counter variable.
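
If the aim is to skip later code once the threshold is exceeded, one possibility is to pass a flag out of the DATA step in a macro variable and test it. The sketch below is only an assumption about what that could look like: the macro name check_missing, its threshold parameter, and the flag too_many are all hypothetical, and have and variable come from the question.

```sas
%macro check_missing(threshold=5000);
    %local too_many;
    %let too_many = 0;

    data _null_;
        set have(keep=variable);
        n_missing + missing(variable);    /* retained counter */
        if n_missing > &threshold then do;
            /* 'L' writes to the macro's local symbol table. */
            call symputx('too_many', 1, 'L');
            stop;
        end;
    run;

    %if &too_many %then %do;
        %put ERROR: More than &threshold missing values in VARIABLE.;
        %return;
    %end;

    /* Downstream processing would go here. */
%mend check_missing;

%check_missing(threshold=5000)
```

The %RETURN statement leaves the macro early, so anything placed after the check runs only when the count stays at or below the threshold.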