I have 4 different tables with a certain metric in each table, and 3 columns in each table, that I am attempting to join on.
the data is as follows:
table 1
Region | Center | Assoc | Metric 1 | Descriptor 1
a 1 1 z
c 3 3 z
d 4 4 z
table 2
Region | Center | Assoc | Metric 2 | Descriptor 2
a 1 1 t
b 2 2 t
c 3 3 t
table 3
Region | Center | Assoc | Metric 3 | Descriptor 3
a 1 1 y
b 2 2 y
c 3 3 y
d 4 4 y
table 4
Region | Center | Assoc | Metric 4 | Descriptor 4
a 1 1 x
d 4 4 x
the result that I am looking for would have all regions, centers, and associates listed whether or not they are in 1 or all of the tables. and all the metrics are filled in, even if its missing. The important part is that each region, center, and associate that is in any of the 4 tables be shown in the resulting table.
here's what I'm looking for:
Region | Center | Assoc | Metric 1 | Descriptor 1 | Metric 2 | Descriptor 2 | Metric 3 | Descriptor 3 | Metric 4 | Descriptor 4
a 1 1 z t y x
b 2 2 . t y .
c 3 3 z t y .
d 4 4 z . y x
I have tried a full join as seen below, and
FROM WORK.BC_ACTIVATION_OFFICERS t1
FULL OUTER JOIN WORK.BC_CLOSEDIN90_OFFICERS t2 ON (t1.'Region No'n = t2.'Region No'n) AND (t1.'Center No'n = t2.'Center No'n) AND (t1.Officer = t2.Officer)
FULL OUTER JOIN WORK.CLOSED_W_NEW_WITHIN_30D_OFFICERS t3 ON (t1.'Region No'n = t3.'Region No'n) AND (t1.'Center No'n = t3.'Center No'n) AND (t1.Officer = t3.'New Acct Officer'n)
FULL OUTER JOIN WORK.BC_APPS_BY_OFFICER t4 ON (t1.'Region No'n = t4.'Region No'n) AND (t1.'Center No'n = t4.'Center No'n) AND (t1.Officer = t4.Officer)
and I get missing regions and centers and officers. I have checked the tables individually, and there are no missing regions, centers or associates in any of them
Ideas?
Thanks,
El
Can you show the expected output you would want from that sample input?
If all the files have the same variables, usually the data is stacked, not merged, so to understand your desired outcome would help.
@EIrvin wrote:
I have 4 different tables with a certain metric in each table, and 3 columns in each table, that I am attempting to join on.
the data is as follows:
table 1
Region | Center | Assoc | Metric 1 | Descriptor 1
a 1 1 z
c 3 3 z
d 4 4 z
table 2
Region | Center | Assoc | Metric 2 | Descriptor 2
a 1 1 t
b 2 2 t
c 3 3 t
table 3
Region | Center | Assoc | Metric 3 | Descriptor 3
a 1 1 y
b 2 2 y
c 3 3 y
d 4 4 y
table 4
Region | Center | Assoc | Metric 4 | Descriptor 4
a 1 1 x
d 4 4 x
the result that I am looking for would have all regions, centers, and associates listed whether or not they are in 1 or all of the tables. and all the metrics are filled in, even if its missing. The important part is that each region, center, and associate that is in any of the 4 tables be shown in the resulting table.
here's what I'm looking for:
Region | Center | Assoc | Metric 1 | Descriptor 1 | Metric 2 | Descriptor 2 | Metric 3 | Descriptor 3 | Metric 4 | Descriptor 4
a 1 1 z t y x
b 2 2 . t y .
c 3 3 z t y .
d 4 4 z . y x
I have tried a full join as seen below, and
FROM WORK.BC_ACTIVATION_OFFICERS t1 FULL OUTER JOIN WORK.BC_CLOSEDIN90_OFFICERS t2 ON (t1.'Region No'n = t2.'Region No'n) AND (t1.'Center No'n = t2.'Center No'n) AND (t1.Officer = t2.Officer) FULL OUTER JOIN WORK.CLOSED_W_NEW_WITHIN_30D_OFFICERS t3 ON (t1.'Region No'n = t3.'Region No'n) AND (t1.'Center No'n = t3.'Center No'n) AND (t1.Officer = t3.'New Acct Officer'n) FULL OUTER JOIN WORK.BC_APPS_BY_OFFICER t4 ON (t1.'Region No'n = t4.'Region No'n) AND (t1.'Center No'n = t4.'Center No'n) AND (t1.Officer = t4.Officer)
and I get missing regions and centers and officers. I have checked the tables individually, and there are no missing regions, centers or associates in any of them
Ideas?
Thanks,
El
I was in the process of editing, just that, when you responded. It should be out there now
and all the metrics are filled in, even if its missing
How does that work when there's no data in the tables?
It looks like you're trying to recreate a data from a star schema structure/fact+dim tables. Is that a correct assumption?
I'm not sure I know what a star schema is. but because the metrics were so acutely focused, I had to create tables just for those metrics, however, my audience wants to see them all rolled up together. But I can't go to a mast list of all associates and centers and regions, because that master list would be too large. so I was attempting to capture those regions, centers, and associates that appeared in any one of those tables, and show all metrics in a single line for each of the region/center/associates ... if that makes any sense
You don't show which fields you are selecting, but presumably the key fields (Region, Center, Assoc) are selected from just one of the tables. However ...
Remember, a full outer join takes records from both tables. There can be mismatches. If you are selecting Region, Center, and Assoc from the first table, it is possible that some Regions appear in the second table only. Those records get selected by a full outer join, but there is no Region value found in the first table.
Are you able to use MERGE instead of PROC SQL?
SQL is the only language I know. how does merge work in these situations?
Assuming you have four SAS data sets, sort each of them:
by region center assoc;
Then combine them:
data want;
merge table1 table2 table3 table4;
by region center assoc;
run;
That will take care of missing values for the three key variables. Of course there can be missing values for some of the metrics.
*********** EDITED:
If you don't have four SAS data sets (or if MERGE makes you uncomfortable), you can probably revise your SELECT statement along these lines:
select coalesce(t1.region, t2.region, t3.region, t4.region) as region,
... same idea for the other two joining variables.
if i understand the request correct, here is what i would do:
i would create a list of distinct values for Region | Center | Association by something like this
proc sql noprint;
create table distinctvalues as
SELECT DISTINCT Region, Center, Association
FROM
(
SELECT Region, Center, Association FROM Table1 UNION ALL
SELECT Region, Center, Association FROM Table2 UNION ALL
SELECT Region, Center, Association FROM Table3 UNION ALL
SELECT Region, Center, Association FROM Table4
) a
quit;
i would then left join the 4 tables to the DistinctValues tables
Or use the distinct list to filter your master list.
proc sql;
create table want as
select m.*
from master as m
left join distinctValues as d
on m.region =d.region
and m.center=d.center
and m.association = d.association;
quit;
probably @Astounding is the way to go. but in sql you can do this as
proc sql;
select region,
center ,
assoc,
max(metric1) as metric1,
max(Descriptor1) as Descriptor1,
max(metric2) as metric2,
max(Descriptor2) as Descriptor2,
max(metric3) as metric3,
max(Descriptor3) as Descriptor3,
max(metric4) as metric4,
max(Descriptor1) as Descriptor4
from
(select coalesce(a.Region,b.region,c.region,d.region) as region ,
coalesce(a.Center,b.center,c.center,d.center) as center,
coalesce(a.Assoc, b.assoc, c.assoc,d.assoc) as assoc ,
Metric1 ,
Descriptor1,
Metric2 ,
Descriptor2,
Metric3 ,
Descriptor3,
Metric4 ,
Descriptor4
from table1 a
full join
table2 b
on a.Region=b.region
and a.Center=b.center
and a.Assoc= b.assoc
full join
table3 c
on a.Region=c.region
and a.Center=c.center
and a.Assoc= c.assoc
full join
table4 d
on a.Region=d.region
and a.Center=d.center
and a.Assoc= d.assoc)a
group by region, center, assoc
;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.