Proc SQL & Cost Function in one step

Contributor
Posts: 55

Hi Everyone,

I have two datasets that I want to merge. The only common variable between them is a string, which sometimes differs slightly. For example, dataset #1 may have ABC Technology Inc, but dataset #2 may have ABC Tech. Inc. I know which proc sql commands to use, but I was wondering whether there is a way to embed step #2 into step #1 below. The problem is that the first step generates such a large dataset that my computer runs out of hard disk space before the step completes (the first dataset has 600,000 observations and the second has 40,000).

step #1:

proc sql;
     create table want as
          select *
          from set1, set2;
quit;

step #2:

data want2;
     set want;
     cost = compged(name1, name2, 'L');
     if cost le 1000;
run;

Thank you for your help.


Accepted Solutions
Solution
‎05-17-2015 11:06 PM
Grand Advisor
Posts: 16,933

Re: Proc SQL & Cost Function in one step

Assuming you have no variable names that overlap between the two datasets and would actually like the variable COST in your output data set:

proc sql;
     create table want as
          select *, compged(a.name1, b.name2, 'L') as cost
          from set1 as a
          cross join set2 as b
          where calculated cost <= 1000;
quit;
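
A note on why this fits on disk when the two-step version does not: as far as I understand PROC SQL, the WHERE condition is evaluated while the rows are being paired, so the 600,000 x 40,000 = 24 billion candidate pairs are never written out, only the matches. All of those comparisons still have to run, though. One possible refinement, sketched here and not tested on data of that size, is COMPGED's optional numeric cutoff argument, which lets the function stop early on clearly different strings. The tables set1_demo, set2_demo and want_demo and their rows are made up for illustration; name1 and name2 are taken from the question.

```sas
/* Toy data, invented for illustration only */
data set1_demo;
   infile datalines truncover;
   input name1 $40.;
   datalines;
ABC Technology Inc
XYZ Holdings LLC
;
run;

data set2_demo;
   infile datalines truncover;
   input name2 $40.;
   datalines;
ABC Tech. Inc
Unrelated Company
;
run;

proc sql;
   create table want_demo as
   select a.name1, b.name2,
          /* 1001 is the cutoff: any pair whose true cost is above it */
          /* is reported as exactly 1001, so it fails the filter below */
          compged(a.name1, b.name2, 1001, 'L') as cost
   from set1_demo as a
        cross join set2_demo as b
   where calculated cost <= 1000;
quit;
```

The cutoff is set one above the threshold on purpose. COMPGED returns the cutoff value itself when the real cost exceeds it, so a cutoff of 1000 combined with cost <= 1000 would let every pair through.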


All Replies
Respected Advisor
Posts: 4,606

Re: Proc SQL & Cost Function in one step

Something like this?

proc sql;
create table want as
select *
from
     set1 inner join
     set2 on compged(set1.name1, set2.name2, 'L') <= 1000;
quit;
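
Before settling on 1000 as the threshold, it may be worth checking what COMPGED actually returns for a few pairs you already know should match. A minimal sketch using the two example names from the question; the variable names cost and cost_i are only for illustration, and the values are written to the log:

```sas
data _null_;
   length name1 name2 $40;
   name1 = 'ABC Technology Inc';
   name2 = 'ABC Tech. Inc';
   /* 'L' removes leading blanks before comparing */
   cost   = compged(name1, name2, 'L');
   /* 'IL' additionally ignores case */
   cost_i = compged(name1, name2, 'IL');
   put name1= name2= cost= cost_i=;
run;
```

Running this on a handful of true matches and a handful of obvious non-matches gives a rough idea of where to put the cutoff for your own data.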

PG

Super Contributor
Posts: 272

Re: Proc SQL & Cost Function in one step

You could try this:

proc sql;
   create table want as
   select a.*, b.*
   from set1 a, set2 b
   where compged(a.name1, b.name2, 'L') <= 1000;
quit;

Grand Advisor
Posts: 9,466

Re: Proc SQL & Cost Function in one step

Another useful operator is "sounds like" (=*). Or you could build a lookup table that maps the different spellings to one standard value.

proc sql;
     create table want as
          select *
          from set1, set2
          where name1 =* name2;
quit;
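
The lookup-table idea could look roughly like the sketch below. The table name_map, its columns raw_name and std_name, the output table want_std, and the two sample rows are all hypothetical; set1, set2, name1 and name2 are as in the question, and the two datasets are assumed to have no overlapping variable names.

```sas
/* Hypothetical lookup table: one row per known spelling variant */
data name_map;
   infile datalines dsd truncover;
   length raw_name std_name $40;
   input raw_name $ std_name $;
   datalines;
ABC Technology Inc,ABC TECHNOLOGY
ABC Tech. Inc,ABC TECHNOLOGY
;
run;

proc sql;
   /* Map each side to its standard name, then match on exact equality */
   create table want_std as
   select a.*, b.*, m1.std_name
   from set1 as a
        inner join name_map as m1 on a.name1 = m1.raw_name
        inner join name_map as m2 on m1.std_name = m2.std_name
        inner join set2 as b on b.name2 = m2.raw_name;
quit;
```

Because every join condition here is an equality, no Cartesian product is involved. The cost is that names missing from name_map drop out of the result, so the table has to be maintained.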

Xia Keshan

Contributor
Posts: 55

Re: Proc SQL & Cost Function in one step

Thank you all for your help.

☑ This topic is SOLVED.
