BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Agent1592
Pyrite | Level 9

I have two datasets have1 and have2 and I would like to merge them by company name. I walso would like an additional restriction that the year of the first table has to be withing the start and end date of  have2. The leading dataset is have1. I tried the spedis function but I get a warning about a Cartesian product and the resulting dataset is humongous. I can't stop the code. What is the best way to match? I know there is compged, soundex and so on but what is best in this case? Is there a way to match using prxchange (the Perl syntax). Could you please help.

 

 

proc sql;
create want
as select a.*,b.*
from have1 as a, have2 as b
group by a.CNAME (a.year>=b.year(start)) and (a.year<=b.year(end))
having spedis(a.CNAME,b.COMNAM)=min(spedis(a.CNAME,b.COMNAM));
quit;

 

 

data WORK.HAVE1;
  infile datalines dsd truncover;
  input year:BEST12. CNAME:$20.;
  format year BEST12.;
datalines4;
1992,AGILENT TECHNOLOGIES
1993,AGILENT TECHNOLOGIES
1994,AGILENT TECHNOLOGIES
1995,AGILENT TECHNOLOGIES
1996,AGILENT TECHNOLOGIES
1997,AGILENT TECHNOLOGIES
1998,AGILENT TECHNOLOGIES
1999,AGILENT TECHNOLOGIES
2001,AGILENT TECHNOLOGIES
2003,AGILENT TECHNOLOGIES
2004,AGILENT TECHNOLOGIES
2015,AGILENT TECHNOLOGIES
1999,AGILENT TECHS
2001,AGILENT TECHS
1998,ALCOA
2003,ALCOA
2004,ALCOA
2005,ALCOA
2006,ALCOA
2007,ALCOA
2009,ALCOA
2010,ALCOA
2011,ALCOA
2012,ALCOA
2014,ALCOA
2015,ALCOA
1996,AAON
1987,APPLE COMPUTER
1988,APPLE COMPUTER
1989,APPLE COMPUTER
1990,APPLE COMPUTER
1991,APPLE COMPUTER
1992,APPLE COMPUTER
1996,THOMAS &  BETTS
1997,THOMAS &  BETTS
1998,THOMAS &  BETTS
1999,THOMAS &  BETTS
2001,THOMAS &  BETTS
2003,THOMAS &  BETTS
1987,THOMAS & BETTS
1988,THOMAS & BETTS
1989,THOMAS & BETTS
1990,THOMAS & BETTS
1991,THOMAS & BETTS
1992,THOMAS & BETTS
1993,THOMAS & BETTS
1994,THOMAS & BETTS
1995,THOMAS & BETTS
1996,THOMAS & BETTS
1997,THOMAS & BETTS
1998,THOMAS & BETTS
1999,THOMAS & BETTS
2001,THOMAS & BETTS
2003,THOMAS & BETTS
2004,THOMAS & BETTS
2005,THOMAS & BETTS
2006,THOMAS & BETTS
2007,THOMAS & BETTS
2009,THOMAS & BETTS
2010,THOMAS & BETTS
2011,THOMAS & BETTS
2012,THOMAS & BETTS
2014,THOMAS & BETTS
2015,THOMAS & BETTS
1996,TYCO ELECTRONICS
1997,TYCO ELECTRONICS
1998,TYCO ELECTRONICS
1999,TYCO ELECTRONICS
2014,ABBVIE
2015,ABBVIE
1987,AMERICAN BUILDINGS
1988,AMERICAN BUILDINGS
1989,AMERICAN BUILDINGS
1990,AMERICAN BUILDINGS
1987,ABENGOA BIOENERGY
1988,ABENGOA BIOENERGY
1989,ABENGOA BIOENERGY
1990,ABENGOA BIOENERGY
1991,ABENGOA BIOENERGY
1992,ABENGOA BIOENERGY
1993,ABENGOA BIOENERGY
1994,ABENGOA BIOENERGY
1995,ABENGOA BIOENERGY
1996,ABENGOA BIOENERGY
1997,ABENGOA BIOENERGY
1998,ABENGOA BIOENERGY
1999,ABENGOA BIOENERGY
2001,ABENGOA BIOENERGY
2003,ABENGOA BIOENERGY
2004,ABENGOA BIOENERGY
2005,ABENGOA BIOENERGY
2006,ABENGOA BIOENERGY
2007,ABENGOA BIOENERGY
2009,ABENGOA BIOENERGY
2010,ABENGOA BIOENERGY
2011,ABENGOA BIOENERGY
2014,ABENGOA BIOENERGY
1992,APPLERA
1993,APPLERA
1994,APPLERA
1995,APPLERA
1996,APPLERA
1997,APPLERA
1999,APPLERA
2001,APPLERA
2003,APPLERA
1987,AMERICAN BILTRITE
1988,AMERICAN BILTRITE
1989,AMERICAN BILTRITE
1990,AMERICAN BILTRITE
1991,AMERICAN BILTRITE
1992,AMERICAN BILTRITE
1993,AMERICAN BILTRITE
1994,AMERICAN BILTRITE
1995,AMERICAN BILTRITE
1996,AMERICAN BILTRITE
1997,AMERICAN BILTRITE
1998,AMERICAN BILTRITE
1999,AMERICAN BILTRITE
2001,AMERICAN BILTRITE
2003,AMERICAN BILTRITE
2004,AMERICAN BILTRITE
2005,AMERICAN BILTRITE
2006,AMERICAN BILTRITE
2007,AMERICAN BILTRITE
2009,AMERICAN BILTRITE
2010,AMERICAN BILTRITE
2011,AMERICAN BILTRITE
2012,AMERICAN BILTRITE
2014,AMERICAN BILTRITE
2015,AMERICAN BILTRITE
1989,ABBOTT LABORATORIES
1990,ABBOTT LABORATORIES
1991,ABBOTT LABORATORIES
1992,ABBOTT LABORATORIES
1993,ABBOTT LABORATORIES
1994,ABBOTT LABORATORIES
1995,ABBOTT LABORATORIES
1996,ABBOTT LABORATORIES
1997,ABBOTT LABORATORIES
1998,ABBOTT LABORATORIES
1999,ABBOTT LABORATORIES
2004,ABBOTT LABORATORIES
2005,ABBOTT LABORATORIES
2006,ABBOTT LABORATORIES
2007,ABBOTT LABORATORIES
2009,ABBOTT LABORATORIES
2010,ABBOTT LABORATORIES
2011,ABBOTT LABORATORIES
2012,ABBOTT LABORATORIES
2014,ABBOTT LABORATORIES
2015,ABBOTT LABORATORIES
1987,ABBOTT LABS
1988,ABBOTT LABS
1989,ABBOTT LABS
1990,ABBOTT LABS
1991,ABBOTT LABS
1992,ABBOTT LABS
1993,ABBOTT LABS
1994,ABBOTT LABS
1995,ABBOTT LABS
1996,ABBOTT LABS
1997,ABBOTT LABS
1998,ABBOTT LABS
1999,ABBOTT LABS
2001,ABBOTT LABS
1987,ACETO
1988,ACETO
1989,ACETO
1990,ACETO
1987,AMCOL INTERNATIONAL
1990,AMCOL INTERNATIONAL
1991,AMCOL INTERNATIONAL
1992,AMCOL INTERNATIONAL
1993,AMCOL INTERNATIONAL
1994,AMCOL INTERNATIONAL
1995,AMCOL INTERNATIONAL
1996,AMCOL INTERNATIONAL
1997,AMCOL INTERNATIONAL
1998,AMCOL INTERNATIONAL
1999,AMCOL INTERNATIONAL
2001,AMCOL INTERNATIONAL
2003,AMCOL INTERNATIONAL
2004,AMCOL INTERNATIONAL
2005,AMCOL INTERNATIONAL
2006,AMCOL INTERNATIONAL
2007,AMCOL INTERNATIONAL
2009,AMCOL INTERNATIONAL
2010,AMCOL INTERNATIONAL
2011,AMCOL INTERNATIONAL
2012,AMCOL INTERNATIONAL
1987,ACME UNITED
1988,ACME UNITED
1989,ACME UNITED
1990,ACME UNITED
1991,ACME UNITED
1992,ACME UNITED
1993,ACME UNITED
1994,ACME UNITED
1991,HALSEY DRUG
1992,HALSEY DRUG
1993,HALSEY DRUG
1994,HALSEY DRUG
1995,HALSEY DRUG
1996,HALSEY DRUG
1997,HALSEY DRUG
1998,HALSEY DRUG
1999,HALSEY DRUG
2001,HALSEY DRUG
2003,HALSEY DRUG
;;;;
data WORK.HAVE2;
  infile datalines dsd truncover;
  input company_id:BEST12. ticker:$4. COMNAM:$25. start:$9. end:$9.;
  format company_id BEST12.;
datalines4;
87432,A,AGILENT TECHNOLOGIES,18-Nov-99,28-Sep-18
16347,AA,ALCOA,1-Nov-16,28-Sep-18
14593,AAPL,APPLE COMPUTER,12-Dec-80,28-Sep-18
38578,TNB,THOMAS & BETTS,27-Sep-62,16-May-12
13721,ABBV,ABBVIE,2-Jan-13,28-Sep-18
80431,ABCO,AMERICAN BUILDINGS CO NEW,29-Apr-94,12-May-99
14236,ABGB,ABENGOA S A,17-Oct-13,27-Apr-16
27713,ABI,APPLERA,30-Nov-00,21-Nov-08
39482,ABL,AMERICAN BILTRITE RUBR,1-Aug-63,26-Jan-12
20482,ABT,ABBOTT LABORATORIES,2-Jul-62,28-Sep-18
10656,ACET,ACETO CHEMICAL,14-Dec-72,28-Sep-18
11599,ACO,AMCOL INTERNATIONAL,22-Sep-98,9-May-14
60038,ACU,ACME UNITED,3-Oct-77,28-Sep-18
10116,HADR,HALSEY DRUG,14-Mar-86,29-Nov-88
;;;;

 

1 ACCEPTED SOLUTION

Accepted Solutions
rudfaden
Pyrite | Level 9

 

 

 

 

 

First of all, you should import your date as a date. Else the date function year will not work.

 infile datalines dsd truncover;
  input company_id:BEST12. ticker:$4. COMNAM:$25. start:date10. end:date10.;
  format company_id BEST12. start datetime9. end datetime9.;

Second, there is a lot of syntax errors in your sql statement. 

 

This works

 

proc sql;
create table want
as select a.*,b.*
from have1 as a, have2 as b
where (a.year>=year(b.start)) and (a.year<=year(b.end))
group by a.CNAME 
having spedis(a.CNAME,b.COMNAM)=min(spedis(a.CNAME,b.COMNAM));
quit;

View solution in original post

3 REPLIES 3
rudfaden
Pyrite | Level 9

 

 

 

 

 

First of all, you should import your date as a date. Else the date function year will not work.

 infile datalines dsd truncover;
  input company_id:BEST12. ticker:$4. COMNAM:$25. start:date10. end:date10.;
  format company_id BEST12. start datetime9. end datetime9.;

Second, there is a lot of syntax errors in your sql statement. 

 

This works

 

proc sql;
create table want
as select a.*,b.*
from have1 as a, have2 as b
where (a.year>=year(b.start)) and (a.year<=year(b.end))
group by a.CNAME 
having spedis(a.CNAME,b.COMNAM)=min(spedis(a.CNAME,b.COMNAM));
quit;
Agent1592
Pyrite | Level 9

Thanks. This works but the code is running very slow. I have around 15,000 entries in have1 and 40,000 entries in have2. What is a good solution to optimize the process?

rudfaden
Pyrite | Level 9

Not really sure what you are trying to archive. But you might look into using a proper sql join or limit your number of variables. 

 

I also think the spedis function is a little heavy run. 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 2540 views
  • 1 like
  • 2 in conversation