Maybe a DATA step approach is better. It does require sorting of the two input tables though. Try this:
Data T1;
INFILE DATALINES DSD;
Input ItemNo $ ItemC1 $ ItemC2 $;
Datalines;
"A10" , "MRF" , "Ceat"
"A100" , "MRF" , "Ceat"
"B01" , "IRP" , ""
"B011" , "IRP" , "CRO"
;
Data T2;
INFILE DATALINES DSD;
Input ItemNo $ (ItemC1 - ItemC10) ($);
Datalines;
"A101", "MRF", "", "", "Ceat", "", "", "", "", "", ""
"A100", "MRF", "", "Ceat" , "" , "" , "", "", "", "",""
"B01", "IRP" , "" ,"", "", "CRO","","","","",""
"B0101", "IRP", "CRO" ,"","","","","","","",""
;
proc sort data = T1;
by ItemNo;
run;
proc sort data = T2;
by ItemNo;
run;
data Want;
keep ItemNo T1ItemC1 T1ItemC2 ItemC1 - ItemC10;
retain T1ItemC1 T1ItemC2;
set T1 T2;
by ItemNo;
array T2 (*) $ ItemC1 - ItemC10;
if not (first.ItemNo and last.ItemNo) then do;
if first.ItemNo then do;
T1ItemC1 = ItemC1; T1ItemC2 = ItemC2;
end;
else do i = 1 to dim(T2);
if T1ItemC1 = T2(i) then ItemC1_Match = 'Y';
if T1ItemC2 = T2(i) then ItemC2_Match = 'Y';
end;
if ItemC1_Match = 'Y' and ItemC2_Match = 'Y' then output;
end;
run;
Thanks for your help. However, your code didn't provide me right results.
@Pandu2 - If you want to continue with this approach then please explain what "not the right results" means.
Your code is just matching the T1 ItemNo with T2 ItemNo and providing values of T2 that doesn't match with T1 ItemC1, ItemC2 with T2 ItemC1 - ItemC10.
This is the output I get when I run my posted code:
What should I be getting to agree with your requirements?
Row 1 is a match for both C1 and C2. Row 2 is also a match for C1 and C2 but should C2 being blank count?
You data steps aren't going to work as written. So first let's clean that up. We don't need 10 variables to demonstrate the problem so let's cut that down to just 5.
data T1;
input ItemNo :$10. (ItemC1 ItemC2) (:$8.);
datalines;
A10 MRF Ceat
A100 MRF Ceat
B01 IRP .
B011 IRP CRO
;
data T2;
input ItemNo :$10. (ItemC1 - ItemC5) (:$8.);
datalines;
A101 MRF . . Ceat .
A100 MRF . Ceat . .
B01 IRP . . . CRO
B0101 IRP CRO . . .
;
It looks like you just want to find the observations in T1 that match so let's do an INNER JOIN.
proc sql;
create table want as
select a.*
from t1 a
inner join t2 b
on a.itemno=b.itemno
and (whichc(a.itemc1,b.itemc1,b.itemc2,b.itemc3,b.itemc4,b.itemc5)
or whichc(a.itemc2,b.itemc1,b.itemc2,b.itemc3,b.itemc4,b.itemc5)
)
;
quit;
Results
Item Item Item Obs No C1 C2 1 A100 MRF Ceat 2 B01 IRP
I'm very much thankful for responding. However, the output table what should look like is :
Obs ItemNo ItemC1 ItemC2 ItemC3 ItemC4 ItemC5 1 A100 MRF . CEAT 2 B01 IRP . . . . This is how my output should look like, if the value of ItemC1, ItemC2 of T1 matches with any column of T2 then, it should output those matched values in whatever place they exist in T2.
So you want the records from T2 instead of the records from T1.
That should be TRIVIAL to fix.
Did you try to fix it?
What happened.
Yeah, I tried to fix it but the code is just matching the T1 ItemNo with T2 ItemNo and providing values of T2 that doesn't match with T1 ItemC1, ItemC2 with T2 ItemC1 - ItemC10.
Works fine for me:
Item Item Item Item Item Item Obs No C1 C2 C3 C4 C5 1 A100 MRF Ceat 2 B01 IRP CRO
Change made:
select b.*
I've cross checked your code by changing values in Table T1. Here's the output what it shouldn't give.
Data T1; Input ItemNo :$10. (ItemC1 ItemC2) (:$8.); Datalines; A101 MRF CEAT A100 MRF . B01 . . B011 IRP CRO ;
Below is my output, here I just made changes in Table T1 and table T2 remained same. As you can see there are only two ItemNo's matched
A100 and B01 . In my output it should only give MRF for A100 besides it has also given ceat which isn't present in Table T1 for A100 ItemNo. Likewise B01.
I request you to please help me out on this. Thankyou!.
That is a totally different request than the way I understood the request before.
It seems you want to select values of the ITEM variables instead of OBSEVATIOSN from the dataset.
If that is right then you really should take the time to convert from your WIDE format to a TALL format and then the coding will be MUCH easier. Since you mentioned at one point a concern about the size of the data perhaps you could do the conversions as a VIEW instead of a physical table. That should save disk space (and possibly also time).
data t1_tall / view=t1_tall ;
set t1;
array items itemc:;
do index=1 to dim(items);
itemc = items[index];
if not missing(itemc) then output;
end;
keep itemno itemc ;
run;
data t2_tall / view=t2_tall ;
set t2;
array items itemc:;
do index=1 to dim(items);
itemc = items[index];
if not missing(itemc) then output;
end;
keep itemno itemc ;
run;
proc sql ;
create table want_tall as
select t2.itemno,t2.itemc
from t2_tall t2 inner join t1_tall t1
on t2.itemno = t1.itemno and t2.itemc = t1.itemc
;
quit;
Otherwise you will need to revert to using a data step. That will be easy if
1) T1 has only one observation per ITEMNO
2) T1 and T2 are sorted by ITEMNO
So basically loop over the long list of index variables and check if any of them are in the set of requested values.
data want;
merge t2 t1(rename=(itemc1=check1 itemc2=check2));
by itemno;
array items itemc: ;
do index=1 to dim(items);
if missing(items[index]) then continue;
if whichc(items[index] , check1 , check2) then found=1;
else items[index]=' ';
end;
if found;
drop index found check1 check2;
run;
I used your datastep code which has merging in it.
Here I'm gonna post you the snapshots of my output and changes made to Table T1 to test your code.
As you can see the Table T3(T1) for the ItemNo A100 it has MRF & CEAT. Both the values present in Table T4 (T2) but the output table isn't showing CEAT in ItemC3 since it is present in ItemC3 of Table T4. Could please test it out from your side as well by changing values in T3(T1).
Your photograph is showing one observation with 'Ceat' and another observation with 'ceat'. Those are two different values. If you want those to match you would have to convert both to either CEAT or ceat by using UPCASE() or lowcase() functions.
Could your code work if table T2 has this kinda data, if it doesn't may I know how to make it work.
data T2; input ItemNo :$10. (ItemC1 - ItemC5) (:$8.); datalines; A101 MRF:123 . . Ceat:789 . A100 MRF:098 . Ceat:453 . . B01 IRP:126 . . . CRO:670 B0101 IRP:309 CRO:012 . . . ; Same rules and same concept.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.