Most efficient way to extract few months data from a large dataset

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

Most efficient way to extract few months data from a large dataset

Hello,

I have a table that records customer Online activities / calls or any other  contacts with my company. The table has around a billion records. I need to run some code everyday that extracts data for last one month. What can be the most efficient / quickest way to extract those records everyday? My approach is to create a sub-set of dataset and then query it by INTCK function:

Data New;

Set Old (where = (Contact_Date is GT "20JUN2012:00:00:00"dt)); /**Here I am reducing the dataset to last two months records so that query is faster**/

Start_Date = INTCK(Month,sysdate(),-2);

Where Start_Date GT Contact_Date;

run;

/*****Ignore if there are any syntax error. I just wanted to show you the logic. ****/

Will it run faster if I create a Start_Date macro in the starting and then call it in Where statement? 


Accepted Solutions
Solution
‎07-15-2012 07:23 PM
Super User
Posts: 3,115

Re: Most efficient way to extract few months data from a large dataset

It would definitely be more efficient to define your data selection perhaps like this. It would also be useful to know if contact_date is indexed and where the most recent rows are in the table as there may be other stratgies for improving performance.

%let start_date = %sysfunc(INTNX(MONTH, %sysfunc(today()) , - 1, SAME), date9.):00:00:00;

%put &start_date ;

Data New;

Set Old (where = (Contact_Date GT "&start_date"dt));

run;

View solution in original post


All Replies
Solution
‎07-15-2012 07:23 PM
Super User
Posts: 3,115

Re: Most efficient way to extract few months data from a large dataset

It would definitely be more efficient to define your data selection perhaps like this. It would also be useful to know if contact_date is indexed and where the most recent rows are in the table as there may be other stratgies for improving performance.

%let start_date = %sysfunc(INTNX(MONTH, %sysfunc(today()) , - 1, SAME), date9.):00:00:00;

%put &start_date ;

Data New;

Set Old (where = (Contact_Date GT "&start_date"dt));

run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 1 reply
  • 201 views
  • 1 like
  • 2 in conversation