Suppose a customer can have multiple customer accounts and purchase_summary table summarizes the purchase totals at customer account level. Not every customer account would have purchase transactions. I use the following query to create a table which includes only customers which had purchase transactions. This query needs to use purchase_summary table twice. I wonder if it is possible to further simplify this query to use purchase_summary once only.
proc sql;
create table a1 as
select a.*, b.purchase_total
from customer a left join purchase_summary b
on a.customer_id = b.customer_id
and a.customer_account = b.customer_account
where a.customer_id in (select customer_id from purchase_summary)
; quit;
If I correctly understand what you want, I don't think your code is doing it. I tried it with the following test files.
Conversely, I think a simple full join does do what you want. Take a look at the following:
/* Some test data */
data customer;
input customer_id customer_account ina;
cards;
1 1 1
1 2 1
2 1 1
3 1 1
4 1 1
;
data purchase_summary;
input customer_id customer_account purchase_total;
cards;
1 1 10
3 1 20
4 1 30
5 1 40
;
/*what you did */
proc sql;
create table a1 as
select a.*, b.purchase_total
from customer a left join purchase_summary b
on a.customer_id = b.customer_id
and a.customer_account = b.customer_account
where a.customer_id in (select customer_id from purchase_summary)
; quit;
/*what I think you want to do*/
proc sql;
create table a2 as
select a.*, b.purchase_total
from customer a, purchase_summary b
having a.customer_id = b.customer_id
and a.customer_account = b.customer_account
;
quit;
Pretty sure I am not following the concept of your data but if you provided a better outline of the intended input and output I and others could probably be more helpful.
data purchase_summary;
input customer_account purchase_summary;
cards;
123 100.00
124 100.50
125 99.99
126 1.00
127 999.00
;
run;
data customer;
input customer_id customer_account;
cards;
1 123
1 124
2 125
2 126
3 127
;
run;
proc sql;
create table total as
select a.customer_id, sum(purchase_summary) as purchase_total
from customer a, purchase_summary b
where a.customer_account=b.customer_account
group by customer_id;
quit;
If you only want the ones that are in both datasets then do not use a LEFT JOIN.
proc sql;
create table a1 as
select a.*, b.purchase_total
from customer a
, purchase_summary b
where a.customer_id = b.customer_id
and a.customer_account = b.customer_account
;
quit;
Or translated into normal SAS programming:
data a1;
merge customer (in=in1) purchase_total(in=in2 keep=customer_id customer_account purchase_total) ;
by customer_id customer_account ;
if in1 and in2;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.