x2PSx
Calcite | Level 5

Hello, I have a table with about 15M observations and I want to make sure that one variable doesn't have too many missing values. I would like the code to stop once it reaches a threshold of about 5k missing values. How do I do that? Right now the process takes about 5.5 minutes when it reads all the observations. Thank you!

 

proc sql noprint stimer;
	SELECT
		count(*)
	FROM
		have(keep= variable)
	WHERE
		variable is null
	;
quit;

Accepted Solutions
Reeza
Super User

SQL doesn't operate that way. You need a data step with a counter and a STOP statement to end the data step early.

For your case you would count missing values; the example further down keeps a running sum instead, but the pattern is the same.

 

It may be easier to just run PROC FREQ with a missing format and see the total missing in the output. 15 million rows won't take that long to process.

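data demo;
    set sashelp.class;
    retain sum_age;                  /* keep the running total across rows */
    sum_age = sum(age, sum_age);     /* SUM() ignores the initial missing value */

    /* STOP ends the step before the implicit OUTPUT, so this row is not written */
    if sum_age > 100 then stop;
run;

proc print data=demo;
run;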
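
Adapted to the original question, the same pattern could look like the sketch below. It assumes the table is called have and the column is called variable, as in the question. The 5000 threshold and the macro variable names threshold and n_missing are illustrative choices, not something from the thread.

```sas
%let threshold = 5000;   /* assumed cut-off, taken from the question */
%let n_missing = 0;      /* default in case HAVE has no rows at all */

data _null_;
    /* KEEP= limits what is read; END= flags the last row */
    set have(keep=variable) end=last;

    /* sum statement: n_miss is retained automatically and starts at 0 */
    if missing(variable) then n_miss + 1;

    /* leave as soon as the threshold is reached or the table is exhausted */
    if n_miss >= &threshold or last then do;
        call symputx('n_missing', n_miss);
        stop;
    end;
run;

%put NOTE: Missing values counted before stopping: &n_missing;
```

If &n_missing equals the threshold, the step stopped early and the true count may be higher. A smaller value is the exact number of missing values in the table. How much time this saves depends on how early in the table the missing values occur.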

Here's an example of how to do the proc freq on the full data set at once and see the numbers in a formatted table:

https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111
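
The idea behind the PROC FREQ approach can be sketched roughly as follows. This is a minimal illustration and not a copy of the linked gist. The format names missfmt and $missfmt are made up for the example, and have and variable are the names used in the question.

```sas
proc format;
    /* collapse every value into two buckets */
    value  missfmt  .  = 'Missing' other = 'Not missing';   /* numeric */
    value $missfmt ' ' = 'Missing' other = 'Not missing';   /* character */
run;

proc freq data=have;
    /* MISSING includes the missing level in the table and percentages */
    tables variable / missing;
    format variable missfmt.;   /* use $missfmt. if variable is character */
run;
```

This reads the whole table once and reports the exact count of missing values. It does not stop early.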

error_prone
Barite | Level 11
You have to provide more details on the code that should be stopped after too many missing values have been found. As far as I know, you can't do this in PROC SQL or any other procedure. In a data step, just retain a counter variable.
