BookmarkSubscribeRSS Feed
Peter_Y
Calcite | Level 5

Hello:

  I am trying to merge 2 tables in sql by two variables subject and centre. Here is what I did

proc sql;

create table master_table as

select d1.*, d2.* from d1 and d2 where d1.subject = d2.subject and d1.centre = d2.centre;

quit;

SAS prints a warning message in the log saying that "subject" and "centre" already exist in master_table. I understand the message is due to the common variable names in both datasets. One way to avoid the warning msg is to explicitly spell out all variable names in d1. However, this is really tedious especially when I have hundreds of variables. Is there anyway to say "select all columns other than subject and centre from d1" so the warning is suppressed?

Thanks,

Peter

4 REPLIES 4
Amir
PROC Star

Hi,

Before running the code create a macro variable with the column names you want, e.g. (untested):

proc sql noprint;

  select name into :d1columns separated by ','

  from dictionary.columns

  where libname='WORK'

    and memname='D1'

    and upcase(name) not in ('SUBJECT','CENTRE')

  ;

quit;

Then you can use the macro variable in your code:

proc sql;

create table master_table as

select &d1columns, d2.* from d1 and d2 where d1.subject = d2.subject and d1.centre = d2.centre;

quit;

Regards,

Amir.

Message was edited by: Amir Malik - fixed memname

Reeza
Super User

You can try the feedback option to get all your columns listed, rather than having to type them out.

The select statement will be expanded in the log.

496  proc sql feedback;

497      create table want as

498      select *

499      from sashelp.class as A;

NOTE: Statement transforms to:

        select A.Name, A.Sex, A.Age, A.Height, A.Weight

          from SASHELP.CLASS A;

NOTE: Table WORK.WANT created, with 19 rows and 5 columns.

500  quit;

NOTE: PROCEDURE SQL used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

I also think you should modify your code to read as follows, as this prevents a full Cartesian join. I'd also explicitly specify my join type (inner, left, right, outer)

proc sql;

create table master_table as

select d1.*, d2.*

  from d1

   join d2

on d1.subject = d2.subject and d1.centre = d2.centre;

quit;

SASKiwi
PROC Star

You can also do a rename (and optionally drop the duplicate columns) to avoid the message:

proc sql;

create table master_table (drop = subject2 centre2)  as

select *

from d1, d2 (rename = (subject = subject2 centre = centre2))

where d1.subject = d2.subject2 and d1.centre = d2.centre2;

quit;


Tom
Super User Tom
Super User

You could try using NATURAL JOIN.   It will have issues if there are other common variables, but then you would have received warnings about those with your current code.

proc sql;

create table master_table as

  select d1.*, d2.*

  from d1 natural join d2

;

quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 2959 views
  • 1 like
  • 5 in conversation