Hi all! This is my first post, so I hope it is a success.
This is what I'm trying to do:
I am working with very large data tables that need to be filtered as much as possible to reduce run time and avoid spool space limitations. I have two tables, CUSTOMER_COMPLAINTS and TRANSACTION_HISTORY. I first query CUSTOMER_COMPLAINTS for a specific complaint date to get the subset of complaints I would like to research. The unique identifier for each complaint is CUSTOMER_COMPLAINT_ID. I then want to query all transactions from TRANSACTION_HISTORY based on the CUSTOMER_COMPLAINT_ID field (which is also in TRANSACTION_HISTORY) in order to find all related transactions.
Here is where I need help! I am joining CUSTOMER_COMPLAINTS with TRANSACTION_HISTORY via CUSTOMER_COMPLAINT_ID, but TRANSACTION_HISTORY does not include any sort of date field to filter on first, so the query has to go through the entire TRANSACTION_HISTORY table (I believe) to find where the two tables match up.
My question is: Is there a way to create a list (or something similar) of CUSTOMER_COMPLAINT_IDs from the initial query that I can use to filter TRANSACTION_HISTORY before joining the two tables? The reason I want to do this is to bring TRANSACTION_HISTORY down to a reasonable size before querying it.
Any help in doing this, or a better solution to speed up this query, is greatly appreciated!
Thanks!
Accepted Solutions
If I understand the problem correctly, you should be able to use an inner join on CUSTOMER_COMPLAINT_ID and apply the filter where the date field in CUSTOMER_COMPLAINTS matches the date you want. And if the tables are indexed on the customer ID, that might help things perform even faster.
But, if you feel it needs a two-step solution (query on a query), then you can formulate this as a subquery. See this blog post about subqueries in SAS Enterprise Guide for more details.
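In PROC SQL terms, both options might look something like the sketch below. The libref MYLIB, the COMPLAINT_DATE column, and the example date are placeholders I've assumed, since the actual names weren't given; substitute your own libref, date column, and date value.

```sas
/* Untested sketch -- MYLIB, COMPLAINT_DATE, and the date literal are
   assumed placeholders; adjust to your actual library and column names. */

/* Option 1: single-step inner join, filtering on the complaint date */
proc sql;
   create table work.complaint_txns as
   select t.*
   from mylib.customer_complaints as c
        inner join
        mylib.transaction_history as t
        on c.customer_complaint_id = t.customer_complaint_id
   where c.complaint_date = '01JAN2013'd;
quit;

/* Option 2: subquery that restricts TRANSACTION_HISTORY to only the
   complaint IDs returned by the date filter, before any join work */
proc sql;
   create table work.complaint_txns as
   select t.*
   from mylib.transaction_history as t
   where t.customer_complaint_id in
         (select customer_complaint_id
          from mylib.customer_complaints
          where complaint_date = '01JAN2013'd);
quit;
```

The subquery version is essentially the "list of IDs" approach you described: TRANSACTION_HISTORY is reduced to only the matching complaint IDs before anything else happens.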
Chris
Chris,
Thanks for the swift reply. I'm a bit of a noob, but is there a way to see if the customer ID is indexed?
Thanks
Try Tasks->Data->Data Attributes and see if anything is revealed in the reports. If your tables are in a database that someone else manages for you, chances are good that this field is indexed, as it would be a common key to filter/sort by.
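If you'd rather check in code, something like this should work (MYLIB is a placeholder libref I've assumed; point it at the library where your tables actually live):

```sas
/* Untested sketch -- MYLIB is an assumed libref. */

/* PROC CONTENTS lists any indexes defined on the table */
proc contents data=mylib.transaction_history;
run;

/* Or query the DICTIONARY.INDEXES metadata view from PROC SQL */
proc sql;
   select memname, indxname, name
   from dictionary.indexes
   where libname = 'MYLIB'
     and memname in ('CUSTOMER_COMPLAINTS', 'TRANSACTION_HISTORY');
quit;
```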
Chris
Looks like none of the fields are indexed, which is disappointing. Oh well, at least now I know why it is so slow!
Depending on the nature of your system, your system administrators could consider one of the newer data management products that are optimized for analytics; they tend to handle this kind of query very quickly, whereas traditional DBMS products are not well suited to it.
Tom