finans_sas
Quartz | Level 8

Hi Everyone,

I have two datasets that I want to merge. The only common variable between them is a string, which sometimes differs slightly. For example, dataset#1 may have ABC Technology Inc, but dataset#2 may have ABC Tech. Inc. I know which proc sql commands to use, but I was wondering if there is a way to embed step#2 into step#1 below. The issue I am facing is that the first step generates a dataset so large that my computer runs out of hard disk space before the step completes (the first dataset has 600,000 observations and the second has 40,000).

step #1:

proc sql;
     create table want as
              select *
     from set1, set2;
quit;

step #2:

data want2;
     set want;
     cost=compged(name1,name2, 'L');
     if cost le 1000;
run;

Thank you for your help.

1 ACCEPTED SOLUTION

Reeza
Super User

Assuming you have no variable names that overlap between the two datasets and would actually like the variable COST in your output data set:

proc sql;
     create table want as
              select *, compged(a.name1,b.name2,'L')  as cost
              from set1 as a
              cross join set2 as b
              where calculated cost<=1000;
quit;
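
A possible refinement, offered as a sketch rather than as part of the original answer: COMPGED accepts an optional numeric cutoff argument ahead of the modifiers. When the true distance exceeds the cutoff, the function returns the cutoff value itself, and the SAS documentation notes that a small cutoff can make the function cheaper on long strings. The cutoff value 1001 and the aliases a and b are assumptions chosen for illustration; the table and variable names follow the question.

```sas
proc sql;
    create table want as
    select a.*, b.*,
           /* cutoff = 1001: any pair costing more than 1000 is reported as 1001 */
           compged(a.name1, b.name2, 1001, 'L') as cost
    from set1 as a
         cross join set2 as b
    /* keep only pairs within the threshold used in the question */
    where calculated cost <= 1000;
quit;
```

The query still has to evaluate all 600,000 x 40,000 = 24 billion pairs, so it can run for a long time. What changes is that only the pairs passing the filter are written to disk.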


5 REPLIES 5
PGStats
Opal | Level 21

Something like this?

proc sql;
create table want as
select *
from
     set1 inner join
     set2 on compged(set1.name1, set2.name2, 'L') <= 1000;
quit;

PG

slchen
Lapis Lazuli | Level 10

You could try this:

proc sql;
   create table want as
   select a.*,b.* from dataset1 a,dataset2 b
   where compged(a.name1,b.name2,'L')<=1000;
quit;

Ksharp
Super User

Another useful operator is the "sounds like" operator (=*). Or you could build a lookup table that maps the name variants to one standardized value.

proc sql;
     create table want as
              select *
     from set1, set2
     where name1 =* name2;
quit;

Xia Keshan
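
The standardization idea above could be sketched as follows. This is a minimal illustration, not a tested cleaning routine: the output names set1_clean, set2_clean, key1, key2 and want_exact, the length of 200, and the list of suffixes are all hypothetical and would need tuning against the real data.

```sas
/* Build a normalized key for each dataset (hypothetical rules) */
data set1_clean;
    set set1;
    length key1 $200;
    key1 = upcase(name1);                  /* ignore case */
    key1 = compress(key1, , 'p');          /* drop punctuation such as periods */
    key1 = prxchange('s/\b(INCORPORATED|INC|CORPORATION|CORP|LTD|LLC)\b//', -1, key1);
    key1 = prxchange('s/\bTECHNOLOGY\b/TECH/', -1, key1);
    key1 = compbl(strip(key1));            /* collapse repeated blanks */
run;

data set2_clean;
    set set2;
    length key2 $200;
    key2 = upcase(name2);
    key2 = compress(key2, , 'p');
    key2 = prxchange('s/\b(INCORPORATED|INC|CORPORATION|CORP|LTD|LLC)\b//', -1, key2);
    key2 = prxchange('s/\bTECHNOLOGY\b/TECH/', -1, key2);
    key2 = compbl(strip(key2));
run;

/* Exact match on the normalized keys */
proc sql;
    create table want_exact as
    select a.*, b.*
    from set1_clean as a
         inner join set2_clean as b
         on a.key1 = b.key2;
quit;
```

Under these rules, "ABC Technology Inc" and "ABC Tech. Inc." should both reduce to the same key. Because the join condition is a plain equality, PROC SQL does not have to compare every pair of rows, and the fuzzy comparison can then be reserved for the names left unmatched.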

finans_sas
Quartz | Level 8

Thank you all for your help.
