Hello together,
I implemented the following sql statement (once with "union all" & once with "(full) outer join") with the conditions that turnover of 3 years shall be displayed in three parallel coulumns from the same table in SAS dwh. Es works with "union all" correctly (respectively that data are not on the same level and moreover moved down).
With outer join: it seems as follows:
//OUTPUT
t1.PTN_NR t1.Cust_ID t1.Name_of_Cust Value_Year_1 Value_Year_2 Value_Year_3
100000001 891450001 Schmid 1,50 2,00 10,50
...
SELECT t1.PTN_NR,
t1.Cust_ID,
t1.Name_of_Cust,
sum(t1.Comp_Val) AS Value_Year_1,
sum(t2.Comp_Val) AS Value_Year_2,
sum(t3.Comp_Val) AS Value_Year_3
From T_Cust_Val_Cas t1
FULL OUTER JOIN T_Cust_Val_Cas t2 ON t2.Cust_ID = t1.Cust_ID
FULL OUTER JOIN T_Cust_Val_Cas t3 ON t3.Cust_ID = t2.Cust_ID
where /* date-format is dd.mm.yyyy*/
(t1.V_Date_Gen >= intnx('year',today(),-1,'beginning') AND t1.V_Date_Gen < intnx('year',today(),0,'beginning'))
OR
(t2.V_Date_Gen >= intnx('year',today(),-2,'beginning') AND t1.V_Date_Gen < intnx('year',today(),-1,'beginning'))
OR
(t3.V_Date_Gen >= intnx('year',today(),-3,'beginning') AND t1.V_Date_Gen < intnx('year',today(),-2,'beginning'))
GROUP BY
t1.PTN_NR,
t1.Cust_ID,
t1.Name_of_Cust
;
My quesion: how should I tune this query for better performance directly in SQL?
Thank you in advande.
Hi,
After some testing on a simplified example, I realize that a full join is not going to give you the result you expect. You will be much better with left joins :
proc sql constdatetime;
select t0.Ptn_Nr, t0.Cust_ID, t0.Name_of_cust,
Value_Year_1, Value_Year_2, Value_Year_3
from
(select unique Ptn_Nr, Cust_ID, Name_of_cust
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) between 1 and 3) t0
left join
(select Ptn_Nr, Cust_ID, sum(Comp_Val) as Value_Year_1
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 1
group by Ptn_Nr, Cust_ID, Name_of_cust) t1
ON t0.Cust_ID=t1.Cust_ID and t0.Ptn_Nr=t1.Ptn_Nr
left join
(select Ptn_Nr, Cust_ID, sum(Comp_Val) as Value_Year_2
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 2
group by Ptn_Nr, Cust_ID) t2
ON t0.Cust_ID=t2.Cust_ID and t0.Ptn_Nr=t2.Ptn_Nr
left join
(select Ptn_Nr, Cust_ID, sum(Comp_Val) as Value_Year_3
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 3
group by Ptn_Nr, Cust_ID) t3
ON t0.Cust_ID=t3.Cust_ID and t0.Ptn_Nr=t3.Ptn_Nr;
quit;
PG
Try this:
proc sql constdatetime;
select * from
(select Ptn_Nr, Cust_ID, Name_of_cust, sum(Comp_Val) as Value_Year_1
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 1
group by Ptn_Nr, Cust_ID, Name_of_cust)
natural full join
(select Ptn_Nr, Cust_ID, Name_of_cust, sum(Comp_Val) as Value_Year_2
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 2
group by Ptn_Nr, Cust_ID, Name_of_cust)
natural full join
(select Ptn_Nr, Cust_ID, Name_of_cust, sum(Comp_Val) as Value_Year_3
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 3
group by Ptn_Nr, Cust_ID, Name_of_cust);
quit;
(untested)
PG
Hello PG,
Thank you for your help. I tried your syntax and got an error that expecting one of the following: a name, AS, ON.
Any idea?
Thank you.
BR
I can't guess where my syntax is wrong... Please post more detail. - PG
Hello PG,
due to the error as described above, I changed your syntax based on Oracle sql that contained some modifications as follows:
proc sql constdatetime;
select t1.Ptn_Nr, t1.Cust_ID, t1.Name_of_cust, t1.VY1 AS Value_Year_1, t2.VY2 AS Value_Year_2, t3.VY3 AS Value_Year_3 from
(select Ptn_Nr, Cust_ID, Name_of_cust, sum(Comp_Val) as VY1
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 1
group by Ptn_Nr, Cust_ID, Name_of_cust) t1
full join
(select Ptn_Nr, Cust_ID, Name_of_cust, sum(Comp_Val) as VY2
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 2
group by Ptn_Nr, Cust_ID, Name_of_cust) t2
full join
(select Ptn_Nr, Cust_ID, Name_of_cust, sum(Comp_Val) as VY3
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 3
group by Ptn_Nr, Cust_ID, Name_of_cust) t3
ON t1.Cust_ID=t2.Cust_ID and t1.Cust_ID=t3.Cust_ID;
quit;
It works primary. But I'm not really sure that this (modified) syntax does not cause any side effects later. What is your intention?
Best regards
Hi,
After some testing on a simplified example, I realize that a full join is not going to give you the result you expect. You will be much better with left joins :
proc sql constdatetime;
select t0.Ptn_Nr, t0.Cust_ID, t0.Name_of_cust,
Value_Year_1, Value_Year_2, Value_Year_3
from
(select unique Ptn_Nr, Cust_ID, Name_of_cust
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) between 1 and 3) t0
left join
(select Ptn_Nr, Cust_ID, sum(Comp_Val) as Value_Year_1
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 1
group by Ptn_Nr, Cust_ID, Name_of_cust) t1
ON t0.Cust_ID=t1.Cust_ID and t0.Ptn_Nr=t1.Ptn_Nr
left join
(select Ptn_Nr, Cust_ID, sum(Comp_Val) as Value_Year_2
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 2
group by Ptn_Nr, Cust_ID) t2
ON t0.Cust_ID=t2.Cust_ID and t0.Ptn_Nr=t2.Ptn_Nr
left join
(select Ptn_Nr, Cust_ID, sum(Comp_Val) as Value_Year_3
from T_Cust_Val_Cas
where intck("YEAR", V_Date_Gen, Today()) = 3
group by Ptn_Nr, Cust_ID) t3
ON t0.Cust_ID=t3.Cust_ID and t0.Ptn_Nr=t3.Ptn_Nr;
quit;
PG
Hi PG,
Thank you very much for help. It works fantastic.
Kindes regards
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.