Those two should be equivalent. Unless you have a lot of duplicate observations that the DISTINCT keyword is eliminating since then there will be fewer observations to transfer.
Personally I wouldn't bother to type the extra characters needed to use the WHERE= dataset option and would instead just use a WHERE statement. The result is the same.
Plus when using SAS code instead of Teradata code you don't have to type all of those commas for the IN operator to work. Spaces will do just fine. That can make it much easier to use a macro variable to contain the list of values.
data V0500_option1;
set teradata.V0500_1_FINANCIAL_EVENT(keep=Agreement_Id Event_Date Start_Date Event_Time Type_Code);
where Event_Date>='01SEP2020'd
and Type_Code IN (11002 11004 11050 11024 31510 11005)
and Agreement_Id ne -1
;
run;
But the real savings will come from doing even more processing in the Teradata server. So what is it that you want to do with this subset of the data? Perhaps you can summarize on the Teradata side and transfer even less information between the two servers.
Note that some procs like PROC SUMMARY will automatically push some of the processing into the Teradata server.
Also note that there is no need to make a NEW connection to Teradata. Instead you re reuse the connection you made for the libref by using the CONNECT USING syntax in your SQL code.
libname mytd teradata ..... ;
proc sql;
connect using mytd;
create table V0500_option2 as
select * from connection to mytd
(select distinct Agreement_Id,Event_Date,Start_Date,Event_Time,Type_Code
from V0500_1_FINANCIAL_EVENT
where Event_Date>= date '2020-09-01'
and Type_Code IN (11002,11004,11050,11024,31510,11005)
and Agreement_Id ne -1
);
quit ;
... View more