Please help automate this process w/o hard coding.

Reply
N/A
Posts: 0

Please help automate this process w/o hard coding.

data diab.medemerge_miss2;
set diab.medemerge_miss;

Last = substr(Last_Name,1,3);
First = substr(First_Name,1,1);
run;

proc sql ;
select Last into :last1 - :last131 from diab.medemerge_miss2;
quit;

proc sql ;
select First into :First1 - :First131 from diab.medemerge_miss2;
quit;

proc sql ;
select DOB into Smiley Very HappyOB1 - Smiley Very HappyOB131 from diab.medemerge_miss2;
quit;

data diab.medemerge3;
set diab.member_extract;
where
(index (FIRST_Name,"&First1") > 0 and index (LAST_Name,"&Last1") > 0 and DOB = "&DOB1"d)
or (index (FIRST_Name,"&First2") > 0 and index (LAST_Name,"&Last2") > 0 and DOB = "&DOB2"d)
or (index (FIRST_Name,"&First3") > 0 and index (LAST_Name,"&Last3") > 0 and DOB = "&DOB3"d)
or (index (FIRST_Name,"&First4") > 0 and index (LAST_Name,"&Last4") > 0 and DOB = "&DOB4"d)
or (index (FIRST_Name,"&First5") > 0 and index (LAST_Name,"&Last5") > 0 and DOB = "&DOB5"d)
or (index (FIRST_Name,"&First6") > 0 and index (LAST_Name,"&Last6") > 0 and DOB = "&DOB6"d)
or (index (FIRST_Name,"&First7") > 0 and index (LAST_Name,"&Last7") > 0 and DOB = "&DOB7"d)
or (index (FIRST_Name,"&First8") > 0 and index (LAST_Name,"&Last8") > 0 and DOB = "&DOB8"d)
or (index (FIRST_Name,"&First9") > 0 and index (LAST_Name,"&Last9") > 0 and DOB = "&DOB9"d)
or (index (FIRST_Name,"&First10") > 0 and index (LAST_Name,"&Last10") > 0 and DOB = "&DOB10"d)
or (index (FIRST_Name,"&First11") > 0 and index (LAST_Name,"&Last11") > 0 and DOB = "&DOB11"d)
or (index (FIRST_Name,"&First12") > 0 and index (LAST_Name,"&Last12") > 0 and DOB = "&DOB12"d)
or (index (FIRST_Name,"&First13") > 0 and index (LAST_Name,"&Last13") > 0 and DOB = "&DOB13"d)
or (index (FIRST_Name,"&First14") > 0 and index (LAST_Name,"&Last14") > 0 and DOB = "&DOB14"d)
or (index (FIRST_Name,"&First15") > 0 and index (LAST_Name,"&Last15") > 0 and DOB = "&DOB15"d)
or (index (FIRST_Name,"&First16") > 0 and index (LAST_Name,"&Last16") > 0 and DOB = "&DOB16"d)
or (index (FIRST_Name,"&First17") > 0 and index (LAST_Name,"&Last17") > 0 and DOB = "&DOB17"d)
or (index (FIRST_Name,"&First18") > 0 and index (LAST_Name,"&Last18") > 0 and DOB = "&DOB18"d)
or (index (FIRST_Name,"&First19") > 0 and index (LAST_Name,"&Last19") > 0 and DOB = "&DOB19"d)
or (index (FIRST_Name,"&First20") > 0 and index (LAST_Name,"&Last20") > 0 and DOB = "&DOB20"d)
or (index (FIRST_Name,"&First21") > 0 and index (LAST_Name,"&Last21") > 0 and DOB = "&DOB21"d)
or (index (FIRST_Name,"&First22") > 0 and index (LAST_Name,"&Last22") > 0 and DOB = "&DOB22"d)
or (index (FIRST_Name,"&First23") > 0 and index (LAST_Name,"&Last23") > 0 and DOB = "&DOB23"d)
or (index (FIRST_Name,"&First24") > 0 and index (LAST_Name,"&Last24") > 0 and DOB = "&DOB24"d)
or (index (FIRST_Name,"&First25") > 0 and index (LAST_Name,"&Last25") > 0 and DOB = "&DOB25"d)
or (index (FIRST_Name,"&First26") > 0 and index (LAST_Name,"&Last26") > 0 and DOB = "&DOB26"d)
or (index (FIRST_Name,"&First27") > 0 and index (LAST_Name,"&Last27") > 0 and DOB = "&DOB27"d)
or (index (FIRST_Name,"&First28") > 0 and index (LAST_Name,"&Last28") > 0 and DOB = "&DOB28"d)
or (index (FIRST_Name,"&First29") > 0 and index (LAST_Name,"&Last29") > 0 and DOB = "&DOB29"d)
or (index (FIRST_Name,"&First30") > 0 and index (LAST_Name,"&Last30") > 0 and DOB = "&DOB30"d)
or (index (FIRST_Name,"&First31") > 0 and index (LAST_Name,"&Last31") > 0 and DOB = "&DOB31"d)
or (index (FIRST_Name,"&First32") > 0 and index (LAST_Name,"&Last32") > 0 and DOB = "&DOB32"d)
or (index (FIRST_Name,"&First33") > 0 and index (LAST_Name,"&Last33") > 0 and DOB = "&DOB33"d)
or (index (FIRST_Name,"&First34") > 0 and index (LAST_Name,"&Last34") > 0 and DOB = "&DOB34"d)
or (index (FIRST_Name,"&First35") > 0 and index (LAST_Name,"&Last35") > 0 and DOB = "&DOB35"d)
or (index (FIRST_Name,"&First36") > 0 and index (LAST_Name,"&Last36") > 0 and DOB = "&DOB36"d)
or (index (FIRST_Name,"&First37") > 0 and index (LAST_Name,"&Last37") > 0 and DOB = "&DOB37"d)
or (index (FIRST_Name,"&First38") > 0 and index (LAST_Name,"&Last38") > 0 and DOB = "&DOB38"d)
or (index (FIRST_Name,"&First39") > 0 and index (LAST_Name,"&Last39") > 0 and DOB = "&DOB39"d)
or (index (FIRST_Name,"&First40") > 0 and index (LAST_Name,"&Last40") > 0 and DOB = "&DOB40"d)
or (index (FIRST_Name,"&First41") > 0 and index (LAST_Name,"&Last41") > 0 and DOB = "&DOB41"d)
or (index (FIRST_Name,"&First42") > 0 and index (LAST_Name,"&Last42") > 0 and DOB = "&DOB42"d)
or (index (FIRST_Name,"&First43") > 0 and index (LAST_Name,"&Last43") > 0 and DOB = "&DOB43"d)
or (index (FIRST_Name,"&First44") > 0 and index (LAST_Name,"&Last44") > 0 and DOB = "&DOB44"d)
or (index (FIRST_Name,"&First45") > 0 and index (LAST_Name,"&Last45") > 0 and DOB = "&DOB45"d)
or (index (FIRST_Name,"&First46") > 0 and index (LAST_Name,"&Last46") > 0 and DOB = "&DOB46"d)
or (index (FIRST_Name,"&First47") > 0 and index (LAST_Name,"&Last47") > 0 and DOB = "&DOB47"d)
or (index (FIRST_Name,"&First48") > 0 and index (LAST_Name,"&Last48") > 0 and DOB = "&DOB48"d)
or (index (FIRST_Name,"&First49") > 0 and index (LAST_Name,"&Last49") > 0 and DOB = "&DOB49"d)
or (index (FIRST_Name,"&First50") > 0 and index (LAST_Name,"&Last50") > 0 and DOB = "&DOB50"d)
or (index (FIRST_Name,"&First51") > 0 and index (LAST_Name,"&Last51") > 0 and DOB = "&DOB51"d)
or (index (FIRST_Name,"&First52") > 0 and index (LAST_Name,"&Last52") > 0 and DOB = "&DOB52"d)
or (index (FIRST_Name,"&First53") > 0 and index (LAST_Name,"&Last53") > 0 and DOB = "&DOB53"d)
or (index (FIRST_Name,"&First54") > 0 and index (LAST_Name,"&Last54") > 0 and DOB = "&DOB54"d)
or (index (FIRST_Name,"&First55") > 0 and index (LAST_Name,"&Last55") > 0 and DOB = "&DOB55"d)
or (index (FIRST_Name,"&First56") > 0 and index (LAST_Name,"&Last56") > 0 and DOB = "&DOB56"d)
or (index (FIRST_Name,"&First57") > 0 and index (LAST_Name,"&Last57") > 0 and DOB = "&DOB57"d)
or (index (FIRST_Name,"&First58") > 0 and index (LAST_Name,"&Last58") > 0 and DOB = "&DOB58"d)
or (index (FIRST_Name,"&First59") > 0 and index (LAST_Name,"&Last59") > 0 and DOB = "&DOB59"d)
or (index (FIRST_Name,"&First60") > 0 and index (LAST_Name,"&Last60") > 0 and DOB = "&DOB60"d)
or (index (FIRST_Name,"&First61") > 0 and index (LAST_Name,"&Last61") > 0 and DOB = "&DOB61"d)
or (index (FIRST_Name,"&First62") > 0 and index (LAST_Name,"&Last62") > 0 and DOB = "&DOB62"d)
or (index (FIRST_Name,"&First63") > 0 and index (LAST_Name,"&Last63") > 0 and DOB = "&DOB63"d)
or (index (FIRST_Name,"&First64") > 0 and index (LAST_Name,"&Last64") > 0 and DOB = "&DOB64"d)
or (index (FIRST_Name,"&First65") > 0 and index (LAST_Name,"&Last65") > 0 and DOB = "&DOB65"d)
or (index (FIRST_Name,"&First66") > 0 and index (LAST_Name,"&Last66") > 0 and DOB = "&DOB66"d)
or (index (FIRST_Name,"&First67") > 0 and index (LAST_Name,"&Last67") > 0 and DOB = "&DOB67"d)
or (index (FIRST_Name,"&First68") > 0 and index (LAST_Name,"&Last68") > 0 and DOB = "&DOB68"d)
or (index (FIRST_Name,"&First69") > 0 and index (LAST_Name,"&Last69") > 0 and DOB = "&DOB69"d)
or (index (FIRST_Name,"&First70") > 0 and index (LAST_Name,"&Last70") > 0 and DOB = "&DOB70"d)
or (index (FIRST_Name,"&First71") > 0 and index (LAST_Name,"&Last71") > 0 and DOB = "&DOB71"d)
or (index (FIRST_Name,"&First72") > 0 and index (LAST_Name,"&Last72") > 0 and DOB = "&DOB72"d)
or (index (FIRST_Name,"&First73") > 0 and index (LAST_Name,"&Last73") > 0 and DOB = "&DOB73"d)
or (index (FIRST_Name,"&First74") > 0 and index (LAST_Name,"&Last74") > 0 and DOB = "&DOB74"d)
or (index (FIRST_Name,"&First75") > 0 and index (LAST_Name,"&Last75") > 0 and DOB = "&DOB75"d)
or (index (FIRST_Name,"&First76") > 0 and index (LAST_Name,"&Last76") > 0 and DOB = "&DOB76"d)
or (index (FIRST_Name,"&First77") > 0 and index (LAST_Name,"&Last77") > 0 and DOB = "&DOB77"d)
or (index (FIRST_Name,"&First78") > 0 and index (LAST_Name,"&Last78") > 0 and DOB = "&DOB78"d)
or (index (FIRST_Name,"&First79") > 0 and index (LAST_Name,"&Last79") > 0 and DOB = "&DOB79"d)
or (index (FIRST_Name,"&First80") > 0 and index (LAST_Name,"&Last80") > 0 and DOB = "&DOB80"d)
or (index (FIRST_Name,"&First81") > 0 and index (LAST_Name,"&Last81") > 0 and DOB = "&DOB81"d)
or (index (FIRST_Name,"&First82") > 0 and index (LAST_Name,"&Last82") > 0 and DOB = "&DOB82"d)
or (index (FIRST_Name,"&First83") > 0 and index (LAST_Name,"&Last83") > 0 and DOB = "&DOB83"d)
or (index (FIRST_Name,"&First84") > 0 and index (LAST_Name,"&Last84") > 0 and DOB = "&DOB84"d)
or (index (FIRST_Name,"&First85") > 0 and index (LAST_Name,"&Last85") > 0 and DOB = "&DOB85"d)
or (index (FIRST_Name,"&First86") > 0 and index (LAST_Name,"&Last86") > 0 and DOB = "&DOB86"d)
or (index (FIRST_Name,"&First87") > 0 and index (LAST_Name,"&Last87") > 0 and DOB = "&DOB87"d)
or (index (FIRST_Name,"&First88") > 0 and index (LAST_Name,"&Last88") > 0 and DOB = "&DOB88"d)
or (index (FIRST_Name,"&First89") > 0 and index (LAST_Name,"&Last89") > 0 and DOB = "&DOB89"d)
or (index (FIRST_Name,"&First90") > 0 and index (LAST_Name,"&Last90") > 0 and DOB = "&DOB90"d)
or (index (FIRST_Name,"&First91") > 0 and index (LAST_Name,"&Last91") > 0 and DOB = "&DOB91"d)
or (index (FIRST_Name,"&First92") > 0 and index (LAST_Name,"&Last92") > 0 and DOB = "&DOB92"d)
or (index (FIRST_Name,"&First93") > 0 and index (LAST_Name,"&Last93") > 0 and DOB = "&DOB93"d)
or (index (FIRST_Name,"&First94") > 0 and index (LAST_Name,"&Last94") > 0 and DOB = "&DOB94"d)
or (index (FIRST_Name,"&First95") > 0 and index (LAST_Name,"&Last95") > 0 and DOB = "&DOB95"d)
or (index (FIRST_Name,"&First96") > 0 and index (LAST_Name,"&Last96") > 0 and DOB = "&DOB96"d)
or (index (FIRST_Name,"&First97") > 0 and index (LAST_Name,"&Last97") > 0 and DOB = "&DOB97"d)
or (index (FIRST_Name,"&First98") > 0 and index (LAST_Name,"&Last98") > 0 and DOB = "&DOB98"d)
or (index (FIRST_Name,"&First99") > 0 and index (LAST_Name,"&Last99") > 0 and DOB = "&DOB99"d)
or (index (FIRST_Name,"&First100") > 0 and index (LAST_Name,"&Last100") > 0 and DOB = "&DOB100"d)
or (index (FIRST_Name,"&First101") > 0 and index (LAST_Name,"&Last101") > 0 and DOB = "&DOB101"d)
or (index (FIRST_Name,"&First102") > 0 and index (LAST_Name,"&Last102") > 0 and DOB = "&DOB102"d)
or (index (FIRST_Name,"&First103") > 0 and index (LAST_Name,"&Last103") > 0 and DOB = "&DOB103"d)
or (index (FIRST_Name,"&First104") > 0 and index (LAST_Name,"&Last104") > 0 and DOB = "&DOB104"d)
or (index (FIRST_Name,"&First105") > 0 and index (LAST_Name,"&Last105") > 0 and DOB = "&DOB105"d)
or (index (FIRST_Name,"&First106") > 0 and index (LAST_Name,"&Last106") > 0 and DOB = "&DOB106"d)
or (index (FIRST_Name,"&First107") > 0 and index (LAST_Name,"&Last107") > 0 and DOB = "&DOB107"d)
or (index (FIRST_Name,"&First108") > 0 and index (LAST_Name,"&Last108") > 0 and DOB = "&DOB108"d)
or (index (FIRST_Name,"&First109") > 0 and index (LAST_Name,"&Last109") > 0 and DOB = "&DOB109"d)
or (index (FIRST_Name,"&First110") > 0 and index (LAST_Name,"&Last110") > 0 and DOB = "&DOB110"d)
or (index (FIRST_Name,"&First111") > 0 and index (LAST_Name,"&Last111") > 0 and DOB = "&DOB111"d)
or (index (FIRST_Name,"&First112") > 0 and index (LAST_Name,"&Last112") > 0 and DOB = "&DOB112"d)
or (index (FIRST_Name,"&First113") > 0 and index (LAST_Name,"&Last113") > 0 and DOB = "&DOB113"d)
or (index (FIRST_Name,"&First114") > 0 and index (LAST_Name,"&Last114") > 0 and DOB = "&DOB114"d)
or (index (FIRST_Name,"&First115") > 0 and index (LAST_Name,"&Last115") > 0 and DOB = "&DOB115"d)
or (index (FIRST_Name,"&First116") > 0 and index (LAST_Name,"&Last116") > 0 and DOB = "&DOB116"d)
or (index (FIRST_Name,"&First117") > 0 and index (LAST_Name,"&Last117") > 0 and DOB = "&DOB117"d)
or (index (FIRST_Name,"&First118") > 0 and index (LAST_Name,"&Last118") > 0 and DOB = "&DOB118"d)
or (index (FIRST_Name,"&First119") > 0 and index (LAST_Name,"&Last119") > 0 and DOB = "&DOB119"d)
or (index (FIRST_Name,"&First120") > 0 and index (LAST_Name,"&Last120") > 0 and DOB = "&DOB120"d)
or (index (FIRST_Name,"&First121") > 0 and index (LAST_Name,"&Last121") > 0 and DOB = "&DOB121"d)
or (index (FIRST_Name,"&First122") > 0 and index (LAST_Name,"&Last122") > 0 and DOB = "&DOB122"d)
or (index (FIRST_Name,"&First123") > 0 and index (LAST_Name,"&Last123") > 0 and DOB = "&DOB123"d)
or (index (FIRST_Name,"&First124") > 0 and index (LAST_Name,"&Last124") > 0 and DOB = "&DOB124"d)
or (index (FIRST_Name,"&First125") > 0 and index (LAST_Name,"&Last125") > 0 and DOB = "&DOB125"d)
or (index (FIRST_Name,"&First126") > 0 and index (LAST_Name,"&Last126") > 0 and DOB = "&DOB126"d)
or (index (FIRST_Name,"&First127") > 0 and index (LAST_Name,"&Last127") > 0 and DOB = "&DOB127"d)
or (index (FIRST_Name,"&First128") > 0 and index (LAST_Name,"&Last128") > 0 and DOB = "&DOB128"d)
or (index (FIRST_Name,'&First129') > 0 and index (LAST_Name,'&Last129') > 0 and DOB = "&DOB129"d)
or (index (FIRST_Name,'&First130') > 0 and index (LAST_Name,'&Last130') > 0 and DOB = "&DOB130"d)
or (index (FIRST_Name,'&First131') > 0 and index (LAST_Name,'&Last131') > 0 and DOB = "&DOB131"d)
;
run;
Super Contributor
Super Contributor
Posts: 3,174

Re: Please help automate this process w/o hard coding.

A better suggestion is to explain what you need to accomplish aside from the code snippet you have provided.

Honestly, from the code perspective, it appears that you are comparing the first 3 characters of each observation's Last_Name (not each DISTINCT prefix substring) and the first 1 character of each observations First_Name (again not DISTINCT) against (scanning) the entire Last_Name and First_Name but only for an particular date. For me, this doesn't make reasonable sense, so maybe you might want to explain what you are trying to accomplish for most effective feedback.

Scott Barry
SBBWorks, Inc.
N/A
Posts: 0

Re: Please help automate this process w/o hard coding.

I had to increase the First variable to substring 3 characters instead of the too restrictive 1.

Basically I want to match members from one dataset to another. The problem is our clients use their own unique identifier and we have our own. (Since we have thousands of clients, agreeing upon a universal unique identifier is cumbersome) Therefore, I have to resort to some primitave matching algorithm. Matching on the entire first name and last name is too restrictive and will miss many matches b/c of inherent misspellings, middle name abbreviation appends, and surname suffix appends to either the first or last name. Therefore I resorted to substringing part of the first and last name and requiring an exact match on birth date. This method works fine, however I need to automate this somehow as I expect many more client lists to match.
Super Contributor
Super Contributor
Posts: 3,174

Re: Please help automate this process w/o hard coding.

Personal opinion: you're worse off with the substring-prefix and INDEX scan attempt, frankly. Suggest you attempt to treat the problem by determining some key to match the data sources. That's all I can recommend, in good conscience.

Scott Barry
SBBWorks, Inc.
Valued Guide
Posts: 2,111

Re: Please help automate this process w/o hard coding.

Like Scott said, you need to be more descriptive on your data structures and goals than just providing code. Perhaps this, untested code, might work:

data medemerge_miss2;
set diab.medemerge_miss;
Key= substr(Last_Name,1,3)||substr(First_Name,1,1)||PUT(dob,mmddyy10.);
KEEP key;
run;

data medemerge2;
set diab.member_extract;
Key= substr(Last_Name,1,3)||substr(First_Name,1,1)||PUT(dob,mmddyy10.);
run;

PROC SQL;
CREATE TABLE diab.mdemerge3 AS
SELECT *
FROM diab.member_extract AS e,
medemerge_miss2 AS m
WHERE e.key=m.key;
QUIT; RUN;

The idea was to create a unique key in each table that can be used in join. You might be able to do it directly with the SUBSTR's in the SQL, but there may also be a performance penalty there.
Ask a Question
Discussion stats
  • 4 replies
  • 440 views
  • 0 likes
  • 3 in conversation