BookmarkSubscribeRSS Feed
EIrvin
Fluorite | Level 6

I have 4 different tables with a certain metric in each table, and 3 columns in each table, that I am attempting to join on. 

 

the data is as follows:

 

table 1

Region | Center | Assoc | Metric 1 | Descriptor 1

a            1            1           z            

c            3            3           z

d            4            4           z

 

table 2

Region | Center | Assoc | Metric 2 | Descriptor 2

a            1            1            t

b            2            2            t

c            3            3            t

 

table 3

Region | Center | Assoc | Metric 3 | Descriptor 3

a            1            1            y

b            2            2            y

c            3            3            y

d            4            4            y

 

table 4

Region | Center | Assoc | Metric 4 | Descriptor 4

a             1            1           x

d             4            4           x

 

 

the result that I am looking for would have all regions, centers, and associates listed whether or not they are in 1 or all of the tables.  and all the metrics are filled in, even if its missing.  The important part is that each region, center, and associate that is in any of the 4 tables be shown in the resulting table.

 

here's what I'm looking for:

Region | Center | Assoc | Metric 1 | Descriptor 1 | Metric 2 | Descriptor 2 | Metric 3 | Descriptor 3 | Metric 4 | Descriptor 4

a             1           1           z                                     t                                     y                                    x

b             2           2           .                                      t                                     y                                    .

c             3           3           z                                     t                                     y                                    .

d             4           4           z                                     .                                     y                                    x

 

I have tried a full join as seen below, and

FROM WORK.BC_ACTIVATION_OFFICERS t1
           FULL OUTER JOIN WORK.BC_CLOSEDIN90_OFFICERS t2 ON (t1.'Region No'n = t2.'Region No'n) AND (t1.'Center No'n = t2.'Center No'n) AND (t1.Officer = t2.Officer)
           FULL OUTER JOIN WORK.CLOSED_W_NEW_WITHIN_30D_OFFICERS t3 ON (t1.'Region No'n = t3.'Region No'n) AND (t1.'Center No'n = t3.'Center No'n) AND (t1.Officer = t3.'New Acct Officer'n)
		  FULL OUTER JOIN WORK.BC_APPS_BY_OFFICER t4 ON (t1.'Region No'n = t4.'Region No'n) AND (t1.'Center No'n = t4.'Center No'n) AND (t1.Officer = t4.Officer)

and I get missing regions and centers and officers.  I have checked the tables individually, and there are no missing regions, centers or associates in any of them

 

 

Ideas?

 

Thanks,

El

11 REPLIES 11
Reeza
Super User

Can you show the expected output you would want from that sample input?

 

If all the files have the same variables, usually the data is stacked, not merged, so to understand your desired outcome would help.

 

Spoiler

@EIrvin wrote:

I have 4 different tables with a certain metric in each table, and 3 columns in each table, that I am attempting to join on. 

 

the data is as follows:

 

table 1

Region | Center | Assoc | Metric 1 | Descriptor 1

a            1            1           z            

c            3            3           z

d            4            4           z

 

table 2

Region | Center | Assoc | Metric 2 | Descriptor 2

a            1            1            t

b            2            2            t

c            3            3            t

 

table 3

Region | Center | Assoc | Metric 3 | Descriptor 3

a            1            1            y

b            2            2            y

c            3            3            y

d            4            4            y

 

table 4

Region | Center | Assoc | Metric 4 | Descriptor 4

a             1            1           x

d             4            4           x

 

 

the result that I am looking for would have all regions, centers, and associates listed whether or not they are in 1 or all of the tables.  and all the metrics are filled in, even if its missing.  The important part is that each region, center, and associate that is in any of the 4 tables be shown in the resulting table.

 

here's what I'm looking for:

Region | Center | Assoc | Metric 1 | Descriptor 1 | Metric 2 | Descriptor 2 | Metric 3 | Descriptor 3 | Metric 4 | Descriptor 4

a             1           1           z                                     t                                     y                                    x

b             2           2           .                                      t                                     y                                    .

c             3           3           z                                     t                                     y                                    .

d             4           4           z                                     .                                     y                                    x

 

I have tried a full join as seen below, and

FROM WORK.BC_ACTIVATION_OFFICERS t1
           FULL OUTER JOIN WORK.BC_CLOSEDIN90_OFFICERS t2 ON (t1.'Region No'n = t2.'Region No'n) AND (t1.'Center No'n = t2.'Center No'n) AND (t1.Officer = t2.Officer)
           FULL OUTER JOIN WORK.CLOSED_W_NEW_WITHIN_30D_OFFICERS t3 ON (t1.'Region No'n = t3.'Region No'n) AND (t1.'Center No'n = t3.'Center No'n) AND (t1.Officer = t3.'New Acct Officer'n)
		  FULL OUTER JOIN WORK.BC_APPS_BY_OFFICER t4 ON (t1.'Region No'n = t4.'Region No'n) AND (t1.'Center No'n = t4.'Center No'n) AND (t1.Officer = t4.Officer)

and I get missing regions and centers and officers.  I have checked the tables individually, and there are no missing regions, centers or associates in any of them

 

 

Ideas?

 

Thanks,

El


EIrvin
Fluorite | Level 6

I was in the process of editing, just that, when you responded.  It should be out there now

 

Reeza
Super User

 and all the metrics are filled in, even if its missing

How does that work when there's no data in the tables?

 

It looks like you're trying to recreate a data from a star schema structure/fact+dim tables. Is that a correct assumption?

EIrvin
Fluorite | Level 6

I'm not sure I know what a star schema is.  but because the metrics were so acutely focused, I had to create tables just for those metrics, however, my audience wants to see them all rolled up together.  But I can't go to a mast list of all associates and centers and regions, because that master list would be too large. so I was attempting to capture those regions, centers, and associates that appeared in any one of those tables, and show all metrics in a single line for each of the region/center/associates ...  if that makes any sense

Reeza
Super User
You do have a master table though, with all the information you need? If so, it may be easier to subset from that table than do the re-merges, based on the keys in these tables.
Astounding
PROC Star

You don't show which fields you are selecting, but presumably the key fields (Region, Center, Assoc) are selected from just one of the tables.  However ...

 

Remember, a full outer join takes records from both tables.  There can be mismatches.  If you are selecting Region, Center, and Assoc from the first table, it is possible that some Regions appear in the second table only.  Those records get selected by a full outer join, but there is no Region value found in the first table.

 

Are you able to use MERGE instead of PROC SQL?

EIrvin
Fluorite | Level 6

SQL is the only language I know.  how does merge work in these situations?

Astounding
PROC Star

Assuming you have four SAS data sets, sort each of them:

 

by region center assoc;

 

Then combine them:

 

data want;

merge table1 table2 table3 table4;

by region center assoc;

run;

 

That will take care of missing values for the three key variables.  Of course there can be missing values for some of the metrics.

 

*********** EDITED:

 

If you don't have four SAS data sets (or if MERGE makes you uncomfortable), you can probably revise your SELECT statement along these lines:

 

select coalesce(t1.region, t2.region, t3.region, t4.region) as region,

 

... same idea for the other two joining variables.

utrocketeng
Quartz | Level 8

if i understand the request correct, here is what i would do:

i would create a list of distinct values for Region | Center | Association by something like this 

proc sql noprint;

create table distinctvalues as

SELECT DISTINCT Region, Center, Association

FROM

SELECT Region, Center, Association FROM Table1 UNION ALL

SELECT Region, Center, Association FROM Table2 UNION ALL

SELECT Region, Center, Association FROM Table3 UNION ALL

SELECT Region, Center, Association FROM Table4

) a

quit;

 

i would then left join the 4 tables to the DistinctValues tables

Reeza
Super User

Or use the distinct list to filter your master list. 

 

proc sql;
create table want as
select m.* 
from master as m
left join distinctValues as d
on m.region =d.region
and m.center=d.center
and m.association = d.association;
quit;
kiranv_
Rhodochrosite | Level 12

probably @Astounding is the way to go. but in sql you can do this as 

proc sql;
select region, 
     center , 
     assoc, 
     max(metric1) as metric1, 
     max(Descriptor1) as Descriptor1,
     max(metric2) as metric2,
     max(Descriptor2) as Descriptor2,
     max(metric3) as  metric3,
     max(Descriptor3) as Descriptor3,
      max(metric4) as  metric4,
     max(Descriptor1) as Descriptor4
     from 
          (select coalesce(a.Region,b.region,c.region,d.region) as region ,
                   coalesce(a.Center,b.center,c.center,d.center) as center, 
                   coalesce(a.Assoc, b.assoc, c.assoc,d.assoc) as assoc ,
                   Metric1  ,
                   Descriptor1,
                   Metric2  ,
                 Descriptor2,
                 Metric3  ,
                 Descriptor3,
                 Metric4  ,
                Descriptor4
         from table1 a
         full join
        table2 b
       on a.Region=b.region 
       and a.Center=b.center
       and  a.Assoc= b.assoc
      full join 
     table3 c
     on a.Region=c.region 
     and      a.Center=c.center
     and      a.Assoc= c.assoc
   full join 
    table4 d
    on a.Region=d.region 
   and      a.Center=d.center
    and      a.Assoc= d.assoc)a
   group by region, center, assoc

 ;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1364 views
  • 4 likes
  • 5 in conversation