Suppose a customer can have multiple customer accounts and purchase_summary table summarizes the purchase totals at customer account level. Not every customer account would have purchase transactions. I use the following query to create a table which includes only customers which had purchase transactions. This query needs to use purchase_summary table twice. I wonder if it is possible to further simplify this query to use purchase_summary once only.
proc sql;
create table a1 as
select a.*, b.purchase_total
from customer a left join purchase_summary b
on a.customer_id = b.customer_id
and a.customer_account = b.customer_account
where a.customer_id in (select customer_id from purchase_summary)
; quit;
If I correctly understand what you want, I don't think your code is doing it. I tried it with the following test files.
Conversely, I think a simple full join does do what you want. Take a look at the following:
/* Some test data */
data customer;
input customer_id customer_account ina;
cards;
1 1 1
1 2 1
2 1 1
3 1 1
4 1 1
;
data purchase_summary;
input customer_id customer_account purchase_total;
cards;
1 1 10
3 1 20
4 1 30
5 1 40
;
/*what you did */
proc sql;
create table a1 as
select a.*, b.purchase_total
from customer a left join purchase_summary b
on a.customer_id = b.customer_id
and a.customer_account = b.customer_account
where a.customer_id in (select customer_id from purchase_summary)
; quit;
/*what I think you want to do*/
proc sql;
create table a2 as
select a.*, b.purchase_total
from customer a, purchase_summary b
having a.customer_id = b.customer_id
and a.customer_account = b.customer_account
;
quit;
Pretty sure I am not following the concept of your data but if you provided a better outline of the intended input and output I and others could probably be more helpful.
data purchase_summary;
input customer_account purchase_summary;
cards;
123 100.00
124 100.50
125 99.99
126 1.00
127 999.00
;
run;
data customer;
input customer_id customer_account;
cards;
1 123
1 124
2 125
2 126
3 127
;
run;
proc sql;
create table total as
select a.customer_id, sum(purchase_summary) as purchase_total
from customer a, purchase_summary b
where a.customer_account=b.customer_account
group by customer_id;
quit;
If you only want the ones that are in both datasets then do not use a LEFT JOIN.
proc sql;
create table a1 as
select a.*, b.purchase_total
from customer a
, purchase_summary b
where a.customer_id = b.customer_id
and a.customer_account = b.customer_account
;
quit;
Or translated into normal SAS programming:
data a1;
merge customer (in=in1) purchase_total(in=in2 keep=customer_id customer_account purchase_total) ;
by customer_id customer_account ;
if in1 and in2;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.