Nasser_DRMCP

Hello,

I have a table t1 that holds 3 million rows (with one datetime column) and a table t2 that holds 5 million rows (with two datetime columns).

In a PROC SQL step, I have a join like this:

select ...
FROM t1 AS a LEFT JOIN t2 AS b
ON a.IDT_RSS=b.IDT_RSS
AND a.TSP_CVL BETWEEN b.TSP_TEST_DEB AND b.TSP_TEST_FIN
;

The CPU time is 23 minutes. How could I optimize this?

Thanks a lot in advance,

nasser

 

Kurt_Bremser

Tell us a little more about your data.

  • Are the IDT_RSS values unique in t1, or do you have multiple entries per IDT_RSS?
  • Same question for t2.
  • If IDT_RSS is non-unique in t2, do the time ranges overlap, or is each TSP_TEST_DEB larger than the preceding TSP_TEST_FIN (when the dataset is sorted by IDT_RSS and TSP_TEST_DEB)?
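
One possible way to answer these three questions is sketched below. It is only an illustration: the output names n_dup_keys_t1, n_dup_keys_t2, t2_sorted, t2_overlaps and prev_fin are made up for the example, and t1 and t2 are assumed to be in the WORK library.

```sas
proc sql;
  /* Number of IDT_RSS values that occur more than once in t1 */
  select count(*) as n_dup_keys_t1
    from (select IDT_RSS
            from t1
            group by IDT_RSS
            having count(*) > 1) as d;

  /* Same check for t2 */
  select count(*) as n_dup_keys_t2
    from (select IDT_RSS
            from t2
            group by IDT_RSS
            having count(*) > 1) as d;
quit;

/* Sort t2 so that consecutive ranges of one IDT_RSS are adjacent */
proc sort data=t2 out=t2_sorted;
  by IDT_RSS TSP_TEST_DEB;
run;

/* Keep every row whose range starts at or before the end of the
   preceding range of the same IDT_RSS */
data t2_overlaps;
  set t2_sorted;
  by IDT_RSS;
  /* lag() is called on every row so that its queue stays in step */
  prev_fin = lag(TSP_TEST_FIN);
  format prev_fin datetime20.;
  if (not first.IDT_RSS) and (TSP_TEST_DEB <= prev_fin) then output;
run;
```

If t2_overlaps is empty, each range starts after the preceding one ends. The comparison uses <= because BETWEEN is inclusive, so a shared boundary value could otherwise match two ranges.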

 

Nasser_DRMCP

Hello Kurt,

Thanks a lot for your response, and sorry for the late reply.

We managed to solve the problem by reducing the scope.

Instead of collecting and processing all rows on every run, we now process only one month of data each month.

That is, we process only the data that is missing in the target (compared with the source), instead of deleting and replacing everything.
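
A minimal sketch of what such a monthly run might look like is given here. The names are assumptions made for illustration and not the actual job: the macro variables month_start and month_end, the intermediate table work.month_result, and the target table work.target are all hypothetical, and the month is assumed to be selected on TSP_CVL.

```sas
/* Hypothetical period boundaries; the end boundary is exclusive */
%let month_start = '01JAN2025:00:00:00'dt;
%let month_end   = '01FEB2025:00:00:00'dt;

proc sql;
  /* Restrict t1 to one month before joining, so the range join
     only has to handle that month's rows */
  create table work.month_result as
  select a.IDT_RSS,
         a.TSP_CVL,
         b.TSP_TEST_DEB,
         b.TSP_TEST_FIN
    from (select IDT_RSS, TSP_CVL
            from t1
            where TSP_CVL >= &month_start
              and TSP_CVL <  &month_end) as a
    left join t2 as b
      on a.IDT_RSS = b.IDT_RSS
     and a.TSP_CVL between b.TSP_TEST_DEB and b.TSP_TEST_FIN;
quit;

/* Add the new month to the target instead of rebuilding it */
proc append base=work.target data=work.month_result;
run;
```

Appending keeps the rows already in the target untouched, so each run only pays for the join on the new month.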

Thanks a lot

regards

Nasser

