BookmarkSubscribeRSS Feed
VVDR
Obsidian | Level 7

 

Hi,

I have a table with huge volume of data (approx 500 million records) and I am using SQL Join transformation ( and doing aggregate functions, group by and having in clause). I have also created Index(on the columns that are used in the having clause) on the output table( by right clicking on the temporary output table). 

But it is taking ages to exeute. Can someone help me here. 

 

Thanks 

5 REPLIES 5
jimbarbour
Meteorite | Level 14

First, how many columns are we talking about here?  If you've got columns that are not necessary to calculations or the final results, eliminate them from the SELECT.

 

Second, can you post the code?  It's pretty hard to guess at what you're doing here.

 

Jim

VVDR
Obsidian | Level 7

Hi @jimbarbour 

I have only 13 columns after deleting the unwanted columns. 

Sample Code:

Block1:

Proc sql;

   Create table T1 as select * from Original

    where date>2010 and itemcategory not in ('A','B','C');

   quit;

Block2:

Proc sql;

  Create table T2 as select name, area, designation, sum(salary) as Totalsum 

from T1;

Quit;

Block3:

Proc Sql;

    Create table T3 as select * from T2

   where Totalsum>1000;

Quit;

The above one is the sample code. 

Please suggest.

Thanks

andreas_lds
Jade | Level 19

@VVDR wrote:

Hi @jimbarbour 

I have only 13 columns after deleting the unwanted columns. 

Sample Code:

Block1:

Proc sql;

   Create table T1 as select * from Original

    where date>2010 and itemcategory not in ('A','B','C');

   quit;

Block2:

Proc sql;

  Create table T2 as select name, area, designation, sum(salary) as Totalsum 

from T1;

Quit;

Block3:

Proc Sql;

    Create table T3 as select * from T2

   where Totalsum>1000;

Quit;

The above one is the sample code. 

Please suggest.

Thanks


In the code you have posted only datasets in work are used. Please don't post a simplified version of the code you want to be optimized.

Kurt_Bremser
Super User

This code does not make sense. In block 2, you create the sum of salary over the whole dataset and attach it to every single observation of the dataset. So if that sum is larger than 1000, you will still get all observations in block 3; if it's smaller, you get none.

Please describe what you actually want to achieve.

andreas_lds
Jade | Level 19

Without seeing the code and the log it is hardly possible to suggest something useful.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 873 views
  • 2 likes
  • 4 in conversation