Barkamih
Pyrite | Level 9

Hi all 

 

I want to know what is wrong with this code. I'm fairly sure the code itself is correct, so the problem may be the large sample size: I have 1.6 million observations. What is the best way to solve this problem?

When I submit this code, after a few minutes the system shows PROC MEANS as not responding.

regards 

 

Ibrahim

proc sort data = have;
	by year month week;
run;

proc means data = have;
	by year month week;
	output out = want sum(FATKG)= FAT_KG sum(PROTKG)= PROT_KG sum(MILKKG)= MILK_KG;
run;

  

1 ACCEPTED SOLUTION

Reeza
Super User

I think @Astounding hit the nail on the head there. Without a VAR statement, PROC MEANS analyzes every numeric variable, and without NOPRINT it writes all of those statistics to the Results window, which takes time. So adding NOPRINT and a VAR statement should speed things up a lot.

 

proc sort data = have;
by year month week;
run; 

proc means data = have NOPRINT;
by year month week;
var FATKG PROTKG MILKKG;
output out = want sum(FATKG)= FAT_KG SUM(PROTKG)= PROT_KG SUM(MILKKG)= MILK_KG;
RUN; 

 


@Astounding wrote:

It's not clear how many numeric variables exist in your data set.  Without a VAR statement, the default is to compute statistics for every numeric variable.  PROC MEANS might be "intelligent" enough to figure out that you only need statistics computed for 3 variables ... but it might not figure that out.  You might fix the problem just by adding to PROC MEANS:

 

var fatkg protkg milkkg;


 


5 REPLIES
RW9
Diamond | Level 26

1.6 million observations will take a long time to run; there is no workaround unless you can reduce the number of observations. Just leave it until it's done.

Reeza
Super User
You may want to suppress the printed output, though: use NOPRINT. Generating that type of output can take a lot of time. If the data sorted fine, I'd expect PROC MEANS to run fine too. In my experience, 1.6 million rows should complete in under 5 minutes (even SAS UE takes less than a minute for me).

If you have a lot of variables, you can try dropping the ones you don't need with a data set option (DROP= or KEEP=) to make it faster. Not sure if that helps, but it's worth a shot.
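
One way to act on that suggestion is the KEEP= data set option, which limits what PROC SORT and PROC MEANS read to the BY and analysis variables. This is only a minimal sketch, assuming the data set and variable names from the original post; the intermediate data set name have_sorted is hypothetical.

```sas
/* Sort only the columns that are needed; have_sorted is a hypothetical name */
proc sort data=have(keep=year month week fatkg protkg milkkg) out=have_sorted;
    by year month week;
run;

/* NOPRINT suppresses the printed report; VAR limits the analysis variables */
proc means data=have_sorted noprint;
    by year month week;
    var fatkg protkg milkkg;
    output out=want sum(fatkg)=fat_kg sum(protkg)=prot_kg sum(milkkg)=milk_kg;
run;
```

How much this helps depends on how wide the data set is; with only a handful of variables the gain is likely to be small.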
Astounding
Opal | Level 21

It's not clear how many numeric variables exist in your data set.  Without a VAR statement, the default is to compute statistics for every numeric variable.  PROC MEANS might be "intelligent" enough to figure out that you only need statistics computed for 3 variables ... but it might not figure that out.  You might fix the problem just by adding to PROC MEANS:

 

var fatkg protkg milkkg;
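
To see how many numeric variables PROC MEANS would pick up by default, one option is to query the DICTIONARY.COLUMNS table. This is a sketch that assumes the data set is WORK.HAVE; the thread does not say which library the data lives in, so adjust the LIBNAME value as needed.

```sas
/* Count the numeric variables in WORK.HAVE (library name is an assumption). */
/* LIBNAME and MEMNAME values are stored in uppercase in DICTIONARY.COLUMNS. */
proc sql;
    select count(*) as n_numeric
    from dictionary.columns
    where libname = 'WORK'
      and memname = 'HAVE'
      and type = 'num';
quit;
```

A large count here would suggest that the missing VAR statement is the main cost.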

Barkamih
Pyrite | Level 9

I appreciate your help, my SAS friends, hahaha. Now everything is fine and I got what I was looking for.

 

Thank you very much again.

 

Ibrahim 

 

regards 
