Dear all,
I have two tables A and B
data a; subj=1; dat='HI_76_KAM_BAK_887_PD34_KAR'; output; subj=2; dat='FT_01_S_NDRI_KAR'; output; subj=3; dat='DY_02_DEALIRITEON_843_KAR'; output; subj=4; dat='DY_02_RET_DEALIRITEON_ST_843_KAR'; output; subj=5; dat='BV_719_B_SIAFT_JH_500_KAR'; output; subj=6; dat='BV_303_B_SIAFT_500_KAR'; output; subj=7; dat='BV_711_B_SIAFT_KAR'; output; subj=8; dat='BV_714_XEMEYDE5_CESIC_890_CAS_KAR'; output; subj=9; dat='BV_714_MIL_XMD5_CSIC_CAS_KAR'; output; subj=10; dat='BV_714_TJIRSIC_CAS_KAR'; output; subj=11; dat='DY_30_KAM_BAK_YR_956_KAR'; output; subj=12; dat='DY_30_PIC_KAM_BAK_956_KAR'; output; subj=13; dat='AM_07_MH_KAR'; output; subj=14; dat='AM_53_JIDIDUM_DYDERS_KAR'; output; run; data b; id=1; dat_new='KAM_BAK_887'; other_var=1; output; id=2; dat_new='S_NDRI'; other_var=2; output; id=3; dat_new='DEALIRITEON_843'; other_var=3; output; id=4; dat_new='B_SIAFT_500'; other_var=11; output; id=5; dat_new='XEMEYDE5_CESIC_CAS'; other_var=21; output; id=6; dat_new='KAM_BAK_956'; other_var=25; output; id=7; dat_new='MH'; other_var=21; other_var=28; output; id=7; dat_new='JIDIDUM_DYDERS'; other_var=29; output; run;
I need to appropriately join them and get the table WANT as follows:
data want; dat=HI_76_KAM_BAK_887_PD34_KAR; dat_new=KAM_BAK_887;output; dat=FT_01_S_NDRI_KAR; dat_new=S_NDRI;output; dat=DY_02_DEALIRITEON_843_KAR; dat_new=DEALIRITEON_843;output; dat=BV_303_B_SIAFT_500_KAR; dat_new=B_SIAFT_500;output; dat=BV_714_MIL_XMD5_CSIC_CAS_KAR; dat_new=XEMEYDE5_CESIC_CAS;output; dat=DY_30_PIC_KAM_BAK_956_KAR; dat_new=KAM_BAK_956;output; dat=AM_07_MH_KAR; dat_new=MH;output; dat=AM_53_JIDIDUM_DYDERS_KAR; dat_new=JIDIDUM_DYDERS;output; run;
I have tried to use something like this within a PROC SQL join but I have failed.
ifc(indexw(strip(dat),strip(dat_new),'_'),'OK','NOT OK');
Any suggestions would be more than welcome.
Thanking you in advance.
Note: I have modified the code for A and B to include LENGTH statements.
data a;
length dat $ 40;
subj=1; dat='HI_76_KAM_BAK_887_PD34_KAR'; output;
subj=2; dat='FT_01_S_NDRI_KAR'; output;
subj=3; dat='DY_02_DEALIRITEON_843_KAR'; output;
subj=4; dat='DY_02_RET_DEALIRITEON_ST_843_KAR'; output;
subj=5; dat='BV_719_B_SIAFT_JH_500_KAR'; output;
subj=6; dat='BV_303_B_SIAFT_500_KAR'; output;
subj=7; dat='BV_711_B_SIAFT_KAR'; output;
subj=8; dat='BV_714_XEMEYDE5_CESIC_890_CAS_KAR'; output;
subj=9; dat='BV_714_MIL_XMD5_CSIC_CAS_KAR'; output;
subj=10; dat='BV_714_TJIRSIC_CAS_KAR'; output;
subj=11; dat='DY_30_KAM_BAK_YR_956_KAR'; output;
subj=12; dat='DY_30_PIC_KAM_BAK_956_KAR'; output;
subj=13; dat='AM_07_MH_KAR'; output;
subj=14; dat='AM_53_JIDIDUM_DYDERS_KAR'; output;
run;
data b;
length dat_new $ 24;
id=1; dat_new='KAM_BAK_887'; other_var=1; output;
id=2; dat_new='S_NDRI'; other_var=2; output;
id=3; dat_new='DEALIRITEON_843'; other_var=3; output;
id=4; dat_new='B_SIAFT_500'; other_var=11; output;
id=5; dat_new='XEMEYDE5_CESIC_CAS'; other_var=21; output;
id=6; dat_new='KAM_BAK_956'; other_var=25; output;
id=7; dat_new='MH'; other_var=21; other_var=28; output;
id=7; dat_new='JIDIDUM_DYDERS'; other_var=29; output;
run;
proc sql;
create table want as select
a.dat,
b.dat_new
from a left join b
on find(a.dat,b.dat_new,'t')>=1
where not missing(b.dat_new);
quit;
The reason that INDEXW (and also FINDW) will not work is that in record one, the words it looks at with the '_' delimiter are HI 76 KAM BAK 887 etc and it never finds one that matches KAM_BAK.
The approach in my code will fail if in data set A, there are values of DAT that contain MH, for example, if DAT='AMHERST' this will match data set b ID=7 DAT_NEW='MH'. I do not have at this moment a way to get around this.
Maybe regular expressions will have a better solution, but I leave that to others.
What is the matching rule you wish to apply?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.