Contributor
Posts: 34

# Alternate of Proc Summary and Proc Means..

I have some very large datasets. When I run PROC SUMMARY and PROC MEANS on them, it takes a long time to generate the results.

Is there another way to speed up SAS processing and reduce the execution time? Any sample code would help.

Accepted Solutions
Solution
‎10-31-2012 11:04 AM
Super User
Posts: 23,663

## Re: Alternate of Proc Summary and Proc Means..

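You could try to do that right in the PROC MEANS step; you don't need to create a second dataset. And yes, you can use the WHERE= and KEEP= data set options together if required, as long as the variable used in WHERE= is also in the KEEP= list.

I also like to use NOPRINT so that the results don't go to the output window, though I doubt that adds significant time savings.

```sas
/* CODE is listed in KEEP= because KEEP= is applied before WHERE=; */
/* a WHERE= variable that is not kept triggers a "not on file" error. */
proc means data=abc(keep=counter code where=(code=1)) nmiss noprint;
  var counter;
  output out=def sum(counter)=sum_x;
run;
```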

All Replies
PROC Star
Posts: 8,163

## Re: Alternate of Proc Summary and Proc Means..

How many variables are you calculating means on and have you used the keep option on your data statement to only keep the relevant variables?  It would also help if you posted the code you have run.

Contributor
Posts: 34

## Re: Alternate of Proc Summary and Proc Means..

I need to sum only one variable using PROC MEANS.

The code is as follows:

```sas
proc means data=abc nmiss;
  var counter;
  output out=def sum(counter)=sum_x;
run;
```

ABC is a huge dataset.

Super User
Posts: 6,751

## Re: Alternate of Proc Summary and Proc Means..

If your data set already exists, there is very little you can do at that point.  Here are a couple of things to think about.

The DATA step that constructs ABC could easily compute the sum of COUNTER at the same time.

If ABC will be processed multiple times, experiment with creating a narrow subset.  Create a separate data set holding just the variables needed for analysis, and use that narrow subset in PROC MEANS and the other analysis procedures.  The expense of creating the extra data set might be offset by the analysis procedures each running somewhat faster.
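The two ideas above can be sketched roughly as follows. How ABC is built isn't shown in this thread, so `work.raw_input` and `work.abc_narrow` are hypothetical names, and the two parts are alternatives rather than steps to run together.

```sas
/* Idea 1: accumulate the sum of COUNTER while ABC is being built.      */
/* work.raw_input is a hypothetical stand-in for the real source data.  */
data abc(drop=total)
     def(keep=total rename=(total=sum_x));
  set work.raw_input end=last;
  total + counter;            /* sum statement: retained, skips missing values */
  output abc;
  if last then output def;    /* one-row data set holding the grand total */
run;

/* Idea 2: build a narrow subset once, then reuse it in later procedures. */
data work.abc_narrow;
  set abc(keep=counter);      /* read only the analysis variable */
run;

proc means data=work.abc_narrow noprint;
  var counter;
  output out=def sum(counter)=sum_x;
run;
```

The narrow subset only pays off if several procedures read it afterwards; for a single PROC MEANS, creating it costs about one full pass through ABC.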

Good luck.

PROC Star
Posts: 1,307

## Re: Alternate of Proc Summary and Proc Means..

1. How many records are you trying to process?

2. How many variables are in the dataset?

3. Is your input dataset a SAS dataset, or is it stored in a different file management system or DBMS? If the latter, which one?

4. If your input dataset is a SAS dataset, is it on a disk local to your SAS processing machine, or are you pulling the data over a network?

The first question is whether the slowdown is in PROC MEANS, or somewhere else. The answers to these questions should help.

Tom

Contributor
Posts: 34

## Re: Alternate of Proc Summary and Proc Means..

Hi Tom,

The number of records is around 2 million.

I have only one variable that needs to be summed.

The input dataset is a SAS dataset.

The slowdown is in PROC MEANS.

PROC Star
Posts: 1,307

## Re: Alternate of Proc Summary and Proc Means..

Hi, Bhoopesh

When I quickly generate 2 million random records with 6 variables and use your precise code, PROC MEANS runs in less than a second, which is very much in line with my experience. So there's no point worrying about doing something different about the summarization.
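A test along those lines might look like the sketch below. The code actually used isn't shown in this thread, so the data set name `work.test_abc`, the filler variables `v1`-`v5`, and the seed are assumptions.

```sas
/* Hypothetical benchmark data: 2 million rows, 6 numeric variables */
data work.test_abc;
  call streaminit(12345);            /* arbitrary seed, for repeatability */
  array v{5} v1-v5;                  /* five filler variables plus COUNTER */
  do i = 1 to 2000000;
    counter = rand('uniform');
    do j = 1 to 5;
      v{j} = rand('normal');
    end;
    output;
  end;
  drop i j;
run;

options fullstimer;                  /* write real and CPU time to the log */

proc means data=work.test_abc nmiss;
  var counter;
  output out=work.def sum(counter)=sum_x;
run;
```

Comparing the real time and CPU time reported in the log gives a rough idea of whether the step is limited by I/O or by computation.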

However, a dataset with 2 million rows and 890 variables would take up 14 gigabytes, assuming all numeric. This is definitely going to take some time to run through your computer's disk I/O system. Just copying a dataset that size on my machine took 28 minutes. Here's the code I used:

```sas
proc sql;
  create table def as select * from abc;
quit;
```

You can try it out on your processor to see what happens.

I think that your best option is to try to reduce the number of columns for processing in general. If all you need is the one PROC MEANS, just let it run, it'll finish someday.

PROC Star
Posts: 8,163

## Re: Alternate of Proc Summary and Proc Means..

As I am not a programmer, per se, there is one type of SAS file I haven't had much experience with, namely a view.  Is there a chance that creating a view of the desired file, but only including the relevant field, might speed up subsequent processing of an otherwise large file?

PROC Star
Posts: 1,307

## Re: Alternate of Proc Summary and Proc Means..

My gut says no, but I'm not positive. I'll test it, and report back.

Tom

Contributor
Posts: 34

## Re: Alternate of Proc Summary and Proc Means..

Hi Tom Kari, Arthur, Larry Reeza, Richard,

Thanks for your valuable comments. I will implement them in my code.

I have one more scenario; please see below.

How can I perform a many-to-many merge using a hash object?

We need to merge two tables, A and B.

Table B needs to be loaded into memory. I need to implement three scenarios:

If A=1 and B=1 then

...........

.........

We can implement this using RC=0

If A=1 and B=0 then

.......

We can implement this using RC<>0

If A=0 and B=1 then

..............
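A minimal sketch of the first two scenarios, using the return code as described above, might look like this. It assumes a key variable `id` present in both tables and a single B-only variable `b_value`; both names are hypothetical. The third scenario (rows in B with no match in A) is not covered here, since it needs extra bookkeeping of which B keys were matched.

```sas
data matched a_only;
  if _n_ = 1 then do;
    if 0 then set b;                       /* define B's variables in the PDV */
    declare hash hb(dataset: 'b', multidata: 'y');  /* allow duplicate keys in B */
    hb.defineKey('id');                    /* hypothetical key variable */
    hb.defineData(all: 'y');
    hb.defineDone();
  end;

  set a;
  rc = hb.find();                          /* rc = 0 means the key exists in B */

  if rc ne 0 then do;                      /* A=1 and B=0 */
    call missing(b_value);                 /* clear values retained from an earlier match */
    output a_only;
  end;
  else do while (rc = 0);                  /* A=1 and B=1: one row per matching B record */
    output matched;
    rc = hb.find_next();
  end;

  drop rc;
run;
```

Because every row of A is paired with every B record sharing its key, duplicate keys on both sides produce the many-to-many combinations.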

PROC Star
Posts: 1,307

## Re: Alternate of Proc Summary and Proc Means..

It was an interesting idea, but no joy. Without a view, the run took 0:04:11; with a view, 0:04:01.
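For reference, a view-based version of the step could be set up roughly as below. The code used for the timing above isn't shown in this thread, so the view name `work.abc_v` and the output name `def_v` are assumptions.

```sas
/* Hypothetical view exposing only the analysis variable */
proc sql;
  create view work.abc_v as
    select counter from abc;
quit;

/* Same summary as before, reading through the view */
proc means data=work.abc_v nmiss noprint;
  var counter;
  output out=def_v sum(counter)=sum_x;
run;
```

A view stores only the query, not the data, so every row of ABC still has to be read from disk each time the view is used.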

Bhoopesh, seeing as I'm able to do this tabulation in a few minutes with a not particularly fast PC:

1. How long is your PROC MEANS run taking?

2. What kind of computer are you running it on?

Tom

P.S. to Art: How do you do that "business card" type reference to another user? It's really cool!

PROC Star
Posts: 8,163

## Re: Alternate of Proc Summary and Proc Means..

I'm not really sure what you mean by a "business card" reference. Most of the time, when I respond to someone, I first type an @, then type their screen name, and when it appears, click on it. Does that produce the "business card" reference you are talking about?

PROC Star
Posts: 1,307

## Re: Alternate of Proc Summary and Proc Means..

Testing...I think this is exactly what I meant!

Thanks,

Tom

Posts: 3,167

## Re: Alternate of Proc Summary and Proc Means..

Apparently I am not the only one who is amazed by your trick.

Haikuo

PROC Star
Posts: 1,307

## Re: Alternate of Proc Summary and Proc Means..

Did I mention how happy I am to know how to do this?!?
