Hi Everyone,
I have two datasets that I want to merge. The only common variable between them is a string, which sometimes has slight differences. For example, dataset #1 may have ABC Technology Inc, but dataset #2 may have ABC Tech. Inc. I know which proc sql commands to use, but I was wondering if there is a way to embed step #2 into step #1 below. The issue I am facing is that the first step generates such a large dataset that my computer runs out of hard disk space before the step completes (the first dataset has 600,000 observations and the second has 40,000).
step #1:
proc sql;
create table want as
select *
from set1, set2;
quit;
step #2:
data want2; set want;
cost=compged(name1,name2, 'L');
if cost le 1000;
run;
Thank you for your help.
Assuming you have no variable names that overlap between the two datasets and would actually like the variable COST in your output data set:
proc sql;
create table want as
select *, compged(a.name1,b.name2,'L') as cost
from set1 a
cross join set2 b
where calculated cost<=1000;
quit;
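Before running the full query, it may help to see what cost COMPGED assigns to a pair you consider a match, so that the 1000 cutoff can be judged against it. A minimal sketch using the two example spellings from the question (the data step and the variable lengths are only for illustration):

```sas
/* Hypothetical one-off check: print the COMPGED cost for the two example
   spellings from the question. */
data _null_;
    length name1 name2 $40;
    name1 = 'ABC Technology Inc';
    name2 = 'ABC Tech. Inc';
    cost = compged(name1, name2, 'L');   /* same call as in the query above */
    put name1= name2= cost=;             /* writes the cost to the SAS log */
run;
```

If the cost printed for known matches is well below 1000, a tighter cutoff should keep the output table smaller.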
Something like this?
proc sql;
create table want as
select *
from
set1 inner join
set2 on compged(set1.name1, set2.name2, 'L') <= 1000;
quit;
PG
You could try this:
proc sql;
create table want as
select a.*,b.* from set1 a,set2 b
where compged(a.name1,b.name2,'L')<=1000;
quit;
Another useful operator is the "sounds-like" operator (=*). Or you could build a lookup table that maps the different spellings to one standard value.
proc sql;
create table want as
select *
from set1, set2
where name1 =* name2 ;
quit;
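The idea of mapping the spellings to one standard value could be sketched as a clean-up step run before the join. This is only an illustration: the output data set set1_std, the variable name1_std, its length, and the list of company suffixes are all assumptions, not something taken from the original data.

```sas
/* Sketch only: the suffix list and the name1_std variable are assumptions. */
data set1_std;
    set set1;
    length name1_std $200;
    name1_std = upcase(name1);                 /* ignore case differences */
    name1_std = compress(name1_std, , 'p');    /* drop punctuation such as "." and "," */
    /* remove a few common company suffixes (illustrative list) */
    name1_std = prxchange('s/\b(INCORPORATED|INC|CORPORATION|CORP|LTD|LLC)\b//', -1, name1_std);
    name1_std = strip(compbl(name1_std));      /* collapse repeated blanks and trim */
run;
```

The same step would be applied to name2 in set2. Abbreviations such as Tech versus Technology would still differ after this clean-up, so the standardized names would probably still need compged or =* rather than an exact match.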
Xia Keshan
Thank you all for your help.