Hi All,
Recently I took a SQL query that I was running in Teradata SQL Assistant and wrote it in SAS using Proc SQL. My intend was to get the same results in SAS as I was getting the script in Teradata SQL Assistant. The Proc SQL code works, however I'm getting slightly different results and I'm not sure why. Does anyone know what I might be missing? Thanks!
Teradata SQL Assistant code:
Select
PERIOD_DT as "Date"
,APPL
,CUST_TYPE_IND
,SUBPROD_CODE
,left(NAICS_Code,2) as "NAIC 2 Digit"
,CUOV_BUSN_TYPE
,sum(CUR_BAL) as "Balance"
From eiw_d_bicc_hgnids_v1.depsme_jv IDS
left join CIS.CUSTOMER_TB Cust
on IDS.CIS_CUST_NO = Cust.Cust_NBR
and cust.Close_date is null
Left Join CIS.Cust_OV_TB CUOV
On CUST.Cust_CO_NBR = CUOV.Cust_CO_NBR
and CUST.CUST_ID = CUOV.CUST_ID
AND CUST.CUST_TIE_BRKR = CUOV.CUST_TIE_BRKR
left join PCD.NAICS_TB NAIC
on Cust.NAIC_CD = NAIC.NAICS_Code
where IDS.PERIOD_DT = '2021-05-31'
and IDS.CUR_BAL >0
and CUST_TYPE_IND <> 0
and IDS.CUR_BAL <1000000
and NAIC.NAICS_Code not in (
'521110', '522110', '522120', '522130', '522190', '522210')
group by
1,2,3,4,5,6
SAS Proc SQL code:
data _null_;
call Execute('%include "/users/apps/rbr_rpa/&sysuserid/pw/teradata_credentials.sas " / source2;');
run;
libname teralib1 teradata server='XXXXX' user=&tduser password=&pw database='eiw_d_bicc_hgnids_v1';
libname teralib2 teradata server='XXXXX' user=&tduser password=&pw database='CIS';
libname teralib3 teradata server='XXXXX' user=&tduser password=&pw database='PCD' mode=teradata;
proc sql;
create table small_business as
select
depsme_jv.PERIOD_DT as Date,
depsme_jv.APPL,
CUSTOMER_TB.CUST_TYPE_IND,
depsme_jv.SUBPROD_CODE,
substr(naics_tb.NAICS_Code,1,2) as NAIC_2_Digit,
cust_ov_tb.CUOV_BUSN_TYPE,
sum(depsme_jv.CUR_BAL) as Balance
from teralib1.depsme_jv
left join teralib2.customer_tb
on depsme_jv.cis_cust_no = customer_tb.Cust_NBR and customer_tb.close_date is null
left join teralib2.cust_ov_tb
on customer_tb.Cust_CO_NBR = cust_ov_tb.Cust_CO_NBR
and customer_tb.CUST_ID = cust_ov_tb.CUST_ID
and customer_tb.CUST_TIE_BRKR = cust_ov_tb.CUST_TIE_BRKR
left join teralib3.naics_tb
on customer_tb.naic_cd = naics_tb.naics_code
where depsme_jv.PERIOD_DT='31May2021'd
and depsme_jv.cur_bal > 0
and CUSTOMER_TB.cust_type_ind ~="0"
and CUSTOMER_TB.cust_type_ind is not null
and depsme_jv.cur_bal < 1000000
and naics_tb.naics_code not in('521110', '522110', '522120', '522130', '522190', '522210')
and naics_tb.naics_code is not null
group by 1,2,3,4,5,6;
quit;
libname teralib1 clear;
libname teralib2 clear;
libname teralib3 clear;
and CUSTOMER_TB.cust_type_ind ~="0"
and CUSTOMER_TB.cust_type_ind is not null
and CUSTOMER_TB.cust_type_ind ne "0"
and not missing(CUSTOMER_TB.cust_type_ind)
Try replacing those two lines with the ones below that's my first guess after a quick scan of the code. And how far off are you?
If you generate the query without the summarization does it return the same number of rows?
@aamoen wrote:
Hi All,
Recently I took a SQL query that I was running in Teradata SQL Assistant and wrote it in SAS using Proc SQL. My intend was to get the same results in SAS as I was getting the script in Teradata SQL Assistant. The Proc SQL code works, however I'm getting slightly different results and I'm not sure why. Does anyone know what I might be missing? Thanks!
Teradata SQL Assistant code:
Select
PERIOD_DT as "Date"
,APPL
,CUST_TYPE_IND
,SUBPROD_CODE
,left(NAICS_Code,2) as "NAIC 2 Digit"
,CUOV_BUSN_TYPE
,sum(CUR_BAL) as "Balance"
From eiw_d_bicc_hgnids_v1.depsme_jv IDS
left join CIS.CUSTOMER_TB Cust
on IDS.CIS_CUST_NO = Cust.Cust_NBR
and cust.Close_date is nullLeft Join CIS.Cust_OV_TB CUOV
On CUST.Cust_CO_NBR = CUOV.Cust_CO_NBR
and CUST.CUST_ID = CUOV.CUST_ID
AND CUST.CUST_TIE_BRKR = CUOV.CUST_TIE_BRKRleft join PCD.NAICS_TB NAIC
on Cust.NAIC_CD = NAIC.NAICS_Codewhere IDS.PERIOD_DT = '2021-05-31'
and IDS.CUR_BAL >0
and CUST_TYPE_IND <> 0
and IDS.CUR_BAL <1000000
and NAIC.NAICS_Code not in (
'521110', '522110', '522120', '522130', '522190', '522210')
group by
1,2,3,4,5,6
SAS Proc SQL code:
data _null_;
call Execute('%include "/users/apps/rbr_rpa/&sysuserid/pw/teradata_credentials.sas " / source2;');
run;libname teralib1 teradata server='XXXXX' user=&tduser password=&pw database='eiw_d_bicc_hgnids_v1';
libname teralib2 teradata server='XXXXX' user=&tduser password=&pw database='CIS';
libname teralib3 teradata server='XXXXX' user=&tduser password=&pw database='PCD' mode=teradata;proc sql;
create table small_business as
select
depsme_jv.PERIOD_DT as Date,
depsme_jv.APPL,
CUSTOMER_TB.CUST_TYPE_IND,
depsme_jv.SUBPROD_CODE,
substr(naics_tb.NAICS_Code,1,2) as NAIC_2_Digit,
cust_ov_tb.CUOV_BUSN_TYPE,
sum(depsme_jv.CUR_BAL) as Balancefrom teralib1.depsme_jv
left join teralib2.customer_tb
on depsme_jv.cis_cust_no = customer_tb.Cust_NBR and customer_tb.close_date is nullleft join teralib2.cust_ov_tb
on customer_tb.Cust_CO_NBR = cust_ov_tb.Cust_CO_NBR
and customer_tb.CUST_ID = cust_ov_tb.CUST_ID
and customer_tb.CUST_TIE_BRKR = cust_ov_tb.CUST_TIE_BRKRleft join teralib3.naics_tb
on customer_tb.naic_cd = naics_tb.naics_codewhere depsme_jv.PERIOD_DT='31May2021'd
and depsme_jv.cur_bal > 0
and CUSTOMER_TB.cust_type_ind ~="0"
and CUSTOMER_TB.cust_type_ind is not null
and depsme_jv.cur_bal < 1000000
and naics_tb.naics_code not in('521110', '522110', '522120', '522130', '522190', '522210')
and naics_tb.naics_code is not null
group by 1,2,3,4,5,6;quit;
libname teralib1 clear;
libname teralib2 clear;
libname teralib3 clear;
Interestingly enough, the total balance of the output is the same. But Proc SQL is bringing back fewer rows. 6,224 rows from the Teradata SQL Assistant output versus 6,203 rows from Proc SQL. So it seems that the grouping is working differently between the two.
By default Teradata ignores case when comparing strings.
Do you know if any of the character strings have mixed case?
I suggest use Proc compare to compare the two outputs.
That will help understand what is going on.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.