10-30-2012 09:18 PM
I have some very large datasets. When I run PROC SUMMARY and PROC MEANS on them,
it takes a long time to generate the output. Is there another way to speed up the SAS processing
and reduce execution time? Any sample code would help.
10-31-2012 11:04 AM
You could try doing that directly in the PROC MEANS step. You don't need to create a second dataset, and yes, you can use the WHERE= and KEEP= dataset options together if required.
I also like to use NOPRINT so that output doesn't go to the output window, though I doubt that adds significant time savings.
/* CODE is kept as well so that the WHERE= filter can reference it */
proc means data=abc(keep=counter code where=(code=1)) nmiss noprint;
output out=def sum(counter) = sum_x;
run;
10-30-2012 09:24 PM
How many variables are you calculating means on and have you used the keep option on your data statement to only keep the relevant variables? It would also help if you posted the code you have run.
10-30-2012 09:30 PM
I need to sum only one variable using PROC MEANS.
The code is below:
proc means data=abc nmiss;
output out=def sum(counter) = sum_x;
run;
abc is a huge dataset.
10-30-2012 10:38 PM
If your data set already exists, there is very little you can do at that point. Here are a couple of things to think about.
The DATA step that constructs ABC could easily compute the sum of COUNTER at the same time.
If ABC will be processed multiple times, experiment with creating a narrow subset. Create a separate data set holding just the variables needed for analysis, and use that narrow subset in PROC MEANS and the other analysis procedures. The expense of creating the extra data set might be offset by the analysis procedures each running somewhat faster.
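Both ideas can be sketched roughly as follows. This is only an illustration under assumptions: RAW is a hypothetical source dataset from which ABC is built, and ABC_NARROW and the macro variable SUM_X are made-up names.

```sas
/* Idea 1: accumulate the sum while ABC is being built.
   RAW is a hypothetical source dataset (an assumption). */
data abc;
  set raw end=last;
  total + counter;                       /* sum statement: retained, ignores missing values */
  if last then call symputx('sum_x', total);  /* store the total in macro variable SUM_X */
  drop total;
run;

/* Idea 2: create a narrow subset once and reuse it in the analysis procedures. */
data abc_narrow;
  set abc(keep=counter);                 /* read only the variable needed for analysis */
run;

proc means data=abc_narrow noprint;
  var counter;
  output out=def sum=sum_x;
run;
```

Whether the narrow subset pays off depends on how many times it is reused, since creating it still requires one full pass over ABC.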
10-31-2012 10:36 AM
1. How many records are you trying to process?
2. How many variables are in the dataset?
3. Is your input dataset a SAS dataset, or on a different file management / DBMS system? If so, which one?
4. If your input dataset is a SAS dataset, is it on a local disk to your SAS processing machine, or are you pulling the data over a network?
The first question is whether the slowdown is in PROC MEANS, or somewhere else. The answers to these questions should help.
10-31-2012 10:57 AM
The number of records is around 2 million.
I have only one variable that needs to be summed.
The input dataset is a SAS dataset.
The slowdown is in PROC MEANS.
10-31-2012 12:07 PM
When I quickly generate 2 million random records with 6 variables and use your precise code, PROC MEANS runs in less than a second, which is very much in line with my experience. So there's no point worrying about doing something different about the summarization.
However, a dataset with 2 million rows and 890 variables would take up 14 gigabytes, assuming all numeric. This is definitely going to take some time to run through your computer's disk I/O system. Just copying a dataset that size on my machine took 28 minutes. Here's the code I used:
proc sql; create table def as select * from abc; quit;
You can try it out on your processor to see what happens.
I think that your best option is to try to reduce the number of columns for processing in general. If all you need is the one PROC MEANS, just let it run, it'll finish someday.
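For anyone who wants to repeat the quick test described above, here is a rough sketch. The variable names X1-X5, the seed, and the value ranges are assumptions, not the data actually used; only the row count, the six variables, and the PROC MEANS call come from this thread.

```sas
options fullstimer;                      /* write detailed timing information to the log */

/* Generate 2 million random records with 6 numeric variables. */
data abc;
  call streaminit(12345);                /* arbitrary seed, for repeatability */
  array x[5] x1-x5;
  do i = 1 to 2000000;
    counter = ceil(100 * rand('uniform'));
    do j = 1 to 5;
      x[j] = rand('normal');
    end;
    output;
  end;
  drop i j;
run;

/* The summarization from the original question. */
proc means data=abc nmiss;
  output out=def sum(counter) = sum_x;
run;

/* Size estimate for the wide case: rows * variables * 8 bytes per numeric. */
data _null_;
  gb = 2000000 * 890 * 8 / 1e9;
  put gb=;
run;
```

Comparing the real time and CPU time reported in the log gives a hint as to whether a step is limited by disk I/O or by computation.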
10-31-2012 01:02 PM
11-01-2012 02:21 PM
Hi Tom Kari, Arthur, Larry Reeza, Richard,
Thanks for your valuable comments. I will implement them in my code.
I have one more scenario. Please see below.
How do I perform a many-to-many merge using hashing?
We need to merge two tables, A and B.
Table B needs to be loaded into memory. I need to implement three scenarios:
If A=1 and B=1 then
We can implement this using RC=0
If A=1 and B=0 then
We can implement this using RC ne 0
If A=0 and B=1 then
What about this scenario? How can this one (basically a right join) be implemented using hashing?
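One possible way to cover all three cases is sketched below, purely as an illustration. It assumes both tables share a key variable named ID and carry hypothetical value variables A_VAL and B_VAL; the output dataset names are also made up. B is loaded with MULTIDATA so that duplicate keys are allowed, and a second hash records which keys of B were matched, so the unmatched rows of B can be written out after the last row of A.

```sas
/* Assumed layout: A(id, a_val) and B(id, b_val). All names are hypothetical. */
data both a_only b_only;
  if 0 then set b;                       /* define B's variables in the PDV */
  if _n_ = 1 then do;
    declare hash hb(dataset: 'b', multidata: 'y');  /* B in memory, duplicate keys allowed */
    hb.defineKey('id');
    hb.defineData('id', 'b_val');
    hb.defineDone();
    declare hash seen();                 /* keys of B matched by at least one row of A */
    seen.defineKey('id');
    seen.defineData('id');
    seen.defineDone();
    declare hiter ib('hb');              /* iterator used for the final pass over B */
  end;

  set a end=last;
  rc = hb.find();
  if rc = 0 then do;                     /* A=1 and B=1 */
    if seen.check() ne 0 then seen.add();
    do while (rc = 0);                   /* one output row per matching row of B */
      output both;
      rc = hb.find_next();
    end;
  end;
  else do;                               /* A=1 and B=0 */
    call missing(b_val);
    output a_only;
  end;

  if last then do;                       /* A=0 and B=1: rows of B never matched */
    call missing(a_val);
    rc = ib.first();
    do while (rc = 0);
      if seen.check() ne 0 then output b_only;
      rc = ib.next();
    end;
  end;
  drop rc;
run;
```

Note that the final pass only runs when A has at least one row, because the step stops as soon as the SET statement finds nothing to read.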
10-31-2012 07:24 PM
It was an interesting idea, but no joy. Without a view, 0:04:11, with, 0:04:01.
Bhoopesh, seeing as I'm able to do this tabulation in a few minutes with a not particularly fast PC:
1. How long is your PROC MEANS run taking?
2. What kind of computer are you running it on?
P.S. to Art: How do you do that "business card" type reference to another user? It's really cool!
10-31-2012 07:42 PM