Hi all! This is my first post, so I hope it is a success.
This is what I'm trying to do:
I am working with very large data tables that need to be filtered as much as possible to reduce run-time and avoid spool space limitations. I have two tables, CUSTOMER_COMPLAINTS and TRANSACTION_HISTORY. I first query CUSTOMER_COMPLAINTS for a specific complaint date to get the subset of complaints I would like to research. The unique identifier for each complaint is CUSTOMER_COMPLAINT_ID. I then want to query all transactions from TRANSACTION_HISTORY based on the CUSTOMER_COMPLAINTS_ID field (which is also in the transaction_history) in order to find all related transactions.
Here is where I need help! I am joining CUSTOMER_COMPLAINTS with TRANSACTION_HISTORY via the CUSTOMER_COMPLAINT_ID but the TRANSACTION_HISTORY table does not include any sort of date field to filter on first, so the query has use go through the entire TRANSACTION_HISTORY table (I believe) to find where the two tables match up.
My question is: Is there a way to create a list (or something similar) of CUSTOMER_COMPLAINT_ID's from the initial query that I can use to filter the TRANSACTION_HISTORY table on before joining the two tables? The reason I want to do this is to bring the TRANSACTION_HISTORY table down to a reasonable size before querying it.
Any help in doing this or a better solution to speed this query is greatly appreciated!!!
Thanks!
If I understand the problem correctly, you should be able to use an inner join on CUSTOMER_COMPLAINT_ID and apply the filter where the date field in CUSTOMER_COMPLAINTS matches the date you want. And if the tables are indexed on the customer ID, that might help things perform even faster.
But, if you feel it needs a two-step solution (query on a query), then you can formulate this as a subquery. See this blog post about subqueries in SAS Enterprise Guide for more details.
Chris
If I understand the problem correctly, you should be able to use an inner join on CUSTOMER_COMPLAINT_ID and apply the filter where the date field in CUSTOMER_COMPLAINTS matches the date you want. And if the tables are indexed on the customer ID, that might help things perform even faster.
But, if you feel it needs a two-step solution (query on a query), then you can formulate this as a subquery. See this blog post about subqueries in SAS Enterprise Guide for more details.
Chris
Chris,
Thanks for the swift reply. I'm a bit of a noob, but is there a way to see if the customer ID is indexed?
Thanks
Try Tasks->Data->Data Attributes and see if anything is revealed in the reports. If your tables are in a database that someone else manages for you, chances are good that this field is indexed, as it would be a common key to filter/sort by.
Chris
Looks like none of the fields are indexed, which is disappointing. Oh well, at least now I know why it is so slow!
Depending on the nature of your system, your system administrators could consider one of the new data management products that are optimized for analytics; they tend to perform very quickly on this kind of query, which is not optimal for traditional DBMS products.
Tom
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.