Dear SAS Folks, I seek your help in solving my problem that requires a look up solution of row to column values or column to row for my given wide datasets. My datasets have values like the following:
Table A:
ID | Type1 | type2 | type3 | type4 | type5 | type6 | type7 | type8 | type9 | type10 | type11 | type12 | type13 | type14 | type15 |
ABC01 | 25 | N | |||||||||||||
ABC01 | 25 | N | A1 | ||||||||||||
ABC01 | 11 | Y | A5 | ||||||||||||
ABC01 | 55 | k | T1 | ||||||||||||
JKL03 | 39 | N | A5 | ||||||||||||
JKL03 | 41 | Y | A5 | ||||||||||||
JKL03 | 40 | N | T1 | ||||||||||||
JKL03 | 39 | Y | A1 |
Table B:
ID | Type1 | type2 | type3 | type4 | type5 | type6 | type7 | type8 | type9 | type10 | type11 | type12 | type13 | type14 | type15 |
ABC01 | 25,11,35,45 | N,Y | T1,A1,A5 | ||||||||||||
JKL03 | 39,40 | N | T1,A1 |
Desired Results:
ID | Type1 | type2 | type3 | type4 | type5 | type6 | type7 | type8 | type9 | type10 | type11 | type12 | type13 | type14 | type15 |
ABC01 | 25 | N | A1 | ||||||||||||
ABC01 | 11 | Y | A5 | ||||||||||||
JKL03 | 40 | N | T1 |
The logic for the look up is as follows:
Can somebody help me with this challenge? It seems too daunting.
Many thanks,
Charlotte
Try with hash:
data want;
if 0 then set B(rename=(type1=_type1 type2=_type2 type3=_type3));
if _n_=1 then do;
declare hash h(dataset:'B(rename=(type1=_type1 type2=_type2 type3=_type3)');
h.definekey('ID');
h.definedata(all:'y');
h.definedone();
end;
set A;
if h.find()=0 then do;
if index(_type1,strip(type1))>0 and
index(_type2,strip(type2))>0 and
index(_type3,strip(type3))>0 then output;
end;
drop _:;
run;
Thanks @slchenfish and @billfish I really appreciate your time and your help. That really means a lot. Have a nice day
Hi, the code seems to fail when in an event of both _type3 and type3 have missing values. When ideally, the condition should equate to true because missing=missing is true. I'd appreciate your advice.
A solution amongst others.
I will denote table A as t_a.
I will denote table B as t_b.
I will denote table Desired Results as t_results.
I assume that t_b has 1 row per id and t_a may have several rows per id.
I assume that all id's in t_b are also present in t_a.
I assume that t_a and t_b are already sorted by id;
/*************************************/
/**** randomized sample table t_a ****/
/*************************************/
data t_a(keep=id type1-type15);
length id 8.;
array type(15) $2.;
do id=1 to 10;
AA= 5+int(15*ranuni(7));
do i = 1 to aa;
do j=1 to 15;
type(j)= put(ceil(3*ranuni(7)),$2.);
end;
output;
end;
end;
run;
/*************************************/
/**** randomized sample table t_b ****/
/*************************************/
data t_b(keep=id type1-type15);
length id 8.;
array type(15) $12.;
do id=1 to 10;
do j = 1 to 15;
aa = ceil(6*ranuni(3));
a1 = ceil(3*ranuni(3));
if (aa in (3,4,5,6)) then type(j) = '1,2,3';
if (aa = 2) and (a1=1) then type(j) = '2,3';
if (aa = 2) and (a1=2) then type(j) = '1,3';
if (aa = 2) and (a1=3) then type(j) = '1,2';
if (aa = 1) then type(j)=put(a1, $2.);
end;
output;
end;
run;
/*****************************************************/
/**** finding the records of t_a which match with ****/
/**** at least 1 element of all 15 type(*) of t_b ****/
/*****************************************************/
data t_results(keep=id type1-type15);
length id 8.;
array zTyp(15) $12.;
array type(15) $2.;
do until(last.id);
set t_b(rename=(type1-type15 = zTyp1-zTyp15));
by id;
end;
do until(last.id);
set t_a;
by id;
if first.id then do; zResult=0; end;
do i = 1 to 15;
z1 = sign(findw(zTyp(i),strip(type(i))));
zResult+(z1);
end;
if zResult=15 then do; output; end;
zResult=0;
end;
run;
The resulting table t_results (these records came from t_a):
=================================================================================================================================
id type1 type2 type3 type4 type5 type6 type7 type8 type9 type10 type11 type12 type13 type14 type15
1 2 2 2 1 1 3 1 3 1 1 1 1 2 1 3
1 2 1 3 2 1 2 3 3 1 3 1 3 2 2 3
1 1 1 3 3 1 3 1 1 2 3 1 2 1 1 1
4 1 2 2 3 2 1 2 2 1 2 2 3 2 2 3
=================================================================================================================================
Hope this helps.
Hi, Dear Charlotte :
I think @slchen make some sense . But you'd better use indexw( , ',') or findw( ,',') to avoid to unnecessary error.
Dear Xia, Thank you as always. I was actually thinking of writing to you but I wasn't sure of what time would it be in Beijing and how busy you are during your day while I guess it would be early hours in the morning for me here in England. Right now, it is 4:10 pm here.
Should i simply replace Index with indexw. I am wondering about the ','? Can you please illustrate this line in the example provided by Slchen.
How are you enjoying summer?yYou should probably visit England .
Many thanks my dear friend and well wisher.
Sincerely,
Charlotte
Dear Charlotte,
There is eight hours between you and me .
Nothing for me in this summary, just reading ,learning ......
I'd like to visit England ,like to see big bell,But I have no money . Hope you could visit China either .
Now come back to question .
I think @slchen 's code could be changed like this:
if ( index(_type1,strip(type1))>0 or (missing(_type1) and missing(type1))) and
( index(_type2,strip(type2))>0 or (missing(_type1) and missing(type1))) and
( index(_type3,strip(type3))>0 or (missing(_type1) and missing(type1))) then output;
Here is my code:
data A;
input ID $ / Type1 $ / type2 $ / type3 $;
cards;
ABC01
25
N
.
ABC01
25
N
A1
ABC01
11
Y
A5
ABC01
55
k
T1
JKL03
39
N
A5
JKL03
41
Y
A5
JKL03
40
N
T1
JKL03
39
Y
A1
;
run;
data B;
input ID $ / Type1 $20. / type2 $20. / type3 $20.;
cards;
ABC01
25,11,35,45
N,Y
T1,A1,A5
JKL03
39,40
N
T1,A1
;
run;
data want;
merge A B(rename=(type1-type3=_t1-_t3));
by id;
array x{*} $ type:;
array y{*} $ _t:;
matched=1;
do i=1 to dim(x);
if not findw(y{i},strip(x{i})) then matched=0;
if missing(x{i}) and missing(y{i}) then matched=1;
if matched=0 then leave;
end;
if matched then output;
drop i matched _t:;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.