BookmarkSubscribeRSS Feed
VVDR
Obsidian | Level 7

 

Hi,

I have a table with huge volume of data (approx 500 million records) and I am using SQL Join transformation ( and doing aggregate functions, group by and having in clause). I have also created Index(on the columns that are used in the having clause) on the output table( by right clicking on the temporary output table). 

But it is taking ages to exeute. Can someone help me here. 

 

Thanks 

5 REPLIES 5
jimbarbour
Meteorite | Level 14

First, how many columns are we talking about here?  If you've got columns that are not necessary to calculations or the final results, eliminate them from the SELECT.

 

Second, can you post the code?  It's pretty hard to guess at what you're doing here.

 

Jim

VVDR
Obsidian | Level 7

Hi @jimbarbour 

I have only 13 columns after deleting the unwanted columns. 

Sample Code:

Block1:

Proc sql;

   Create table T1 as select * from Original

    where date>2010 and itemcategory not in ('A','B','C');

   quit;

Block2:

Proc sql;

  Create table T2 as select name, area, designation, sum(salary) as Totalsum 

from T1;

Quit;

Block3:

Proc Sql;

    Create table T3 as select * from T2

   where Totalsum>1000;

Quit;

The above one is the sample code. 

Please suggest.

Thanks

andreas_lds
Jade | Level 19

@VVDR wrote:

Hi @jimbarbour 

I have only 13 columns after deleting the unwanted columns. 

Sample Code:

Block1:

Proc sql;

   Create table T1 as select * from Original

    where date>2010 and itemcategory not in ('A','B','C');

   quit;

Block2:

Proc sql;

  Create table T2 as select name, area, designation, sum(salary) as Totalsum 

from T1;

Quit;

Block3:

Proc Sql;

    Create table T3 as select * from T2

   where Totalsum>1000;

Quit;

The above one is the sample code. 

Please suggest.

Thanks


In the code you have posted only datasets in work are used. Please don't post a simplified version of the code you want to be optimized.

Kurt_Bremser
Super User

This code does not make sense. In block 2, you create the sum of salary over the whole dataset and attach it to every single observation of the dataset. So if that sum is larger than 1000, you will still get all observations in block 3; if it's smaller, you get none.

Please describe what you actually want to achieve.

andreas_lds
Jade | Level 19

Without seeing the code and the log it is hardly possible to suggest something useful.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1527 views
  • 2 likes
  • 4 in conversation