BookmarkSubscribeRSS Feed
tomcmacdonald
Quartz | Level 8

 It has worked in the past and now suddenly my code is broken. I'm performing a join between ~20 tables with around 1-50k rows each.  All tables have an index and are sorted.  The join is performed using PROC SQL.  It's requiring hundreds of GB's before taking up all available resources and throwing an error once everything is exhausted.  I'm trying to see if there's a cartesian product being mistakenly created and there are none.  What are my options?  Must I refactor this to be done with PROC SORT's and DATA steps?

6 REPLIES 6
SuryaKiran
Meteorite | Level 14

Did you check some of the basic performance considerations for your query like WHERE clause, KEEP or DROP, indexed datasets. 

Thanks,
Suryakiran
tomcmacdonald
Quartz | Level 8

How would I go about doing that?  For this query there is no WHERE clause.  And isn't KEEP and DROP specific to the DATA step and not PROC SQL? 

SuryaKiran
Meteorite | Level 14

You can use KEEP or DROP or SELECT Clause in your query for keeping only the columns that you need.

Keep drop can also be used in proc sql

proc sql;
select *
from sashelp.class(keep= Name Sex);
quit;

 

Also another consideration is avoid heterogeneous joins if possible. Where are your tables located? and share your sample query.   

Thanks,
Suryakiran
tomcmacdonald
Quartz | Level 8
proc sql;
  create table report as 
  select
    a.id,
    a.col1,
    a.col2,
    b.col4
    case when
        c.col5 is not null then 1
    end as col5
    ...
    z.col30
  from
    a
    left join b on a.id = b.id
    ...
    left join z on a.id = z.id;
quit; 

The query looks something like that.  Some tables are in work and some are in other libraries. 

tomcmacdonald
Quartz | Level 8

OK, I figured it out, this was my mistake.  It was forming a cartesian product because I forgot to throw out rows with a NULL variable that it was being joined on.  Thanks for the help everyone.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 911 views
  • 0 likes
  • 3 in conversation