Dear SAS Folks, I seek your help in solving my problem that requires a look up solution of row to column values or column to row for my given wide datasets. My datasets have values like the following:
Table A:
ID | Type1 | type2 | type3 | type4 | type5 | type6 | type7 | type8 | type9 | type10 | type11 | type12 | type13 | type14 | type15 |
ABC01 | 25 | N | |||||||||||||
ABC01 | 25 | N | A1 | ||||||||||||
ABC01 | 11 | Y | A5 | ||||||||||||
ABC01 | 55 | k | T1 | ||||||||||||
JKL03 | 39 | N | A5 | ||||||||||||
JKL03 | 41 | Y | A5 | ||||||||||||
JKL03 | 40 | N | T1 | ||||||||||||
JKL03 | 39 | Y | A1 |
Table B:
ID | Type1 | type2 | type3 | type4 | type5 | type6 | type7 | type8 | type9 | type10 | type11 | type12 | type13 | type14 | type15 |
ABC01 | 25,11,35,45 | N,Y | T1,A1,A5 | ||||||||||||
JKL03 | 39,40 | N | T1,A1 |
Desired Results:
ID | Type1 | type2 | type3 | type4 | type5 | type6 | type7 | type8 | type9 | type10 | type11 | type12 | type13 | type14 | type15 |
ABC01 | 25 | N | A1 | ||||||||||||
ABC01 | 11 | Y | A5 | ||||||||||||
JKL03 | 40 | N | T1 |
The logic for the look up is as follows:
Can somebody help me with this challenge? It seems too daunting.
Many thanks,
Charlotte
Try with hash:
data want;
if 0 then set B(rename=(type1=_type1 type2=_type2 type3=_type3));
if _n_=1 then do;
declare hash h(dataset:'B(rename=(type1=_type1 type2=_type2 type3=_type3)');
h.definekey('ID');
h.definedata(all:'y');
h.definedone();
end;
set A;
if h.find()=0 then do;
if index(_type1,strip(type1))>0 and
index(_type2,strip(type2))>0 and
index(_type3,strip(type3))>0 then output;
end;
drop _:;
run;
Thanks @slchenfish and @billfish I really appreciate your time and your help. That really means a lot. Have a nice day
Hi, the code seems to fail when in an event of both _type3 and type3 have missing values. When ideally, the condition should equate to true because missing=missing is true. I'd appreciate your advice.
A solution amongst others.
I will denote table A as t_a.
I will denote table B as t_b.
I will denote table Desired Results as t_results.
I assume that t_b has 1 row per id and t_a may have several rows per id.
I assume that all id's in t_b are also present in t_a.
I assume that t_a and t_b are already sorted by id;
/*************************************/
/**** randomized sample table t_a ****/
/*************************************/
data t_a(keep=id type1-type15);
length id 8.;
array type(15) $2.;
do id=1 to 10;
AA= 5+int(15*ranuni(7));
do i = 1 to aa;
do j=1 to 15;
type(j)= put(ceil(3*ranuni(7)),$2.);
end;
output;
end;
end;
run;
/*************************************/
/**** randomized sample table t_b ****/
/*************************************/
data t_b(keep=id type1-type15);
length id 8.;
array type(15) $12.;
do id=1 to 10;
do j = 1 to 15;
aa = ceil(6*ranuni(3));
a1 = ceil(3*ranuni(3));
if (aa in (3,4,5,6)) then type(j) = '1,2,3';
if (aa = 2) and (a1=1) then type(j) = '2,3';
if (aa = 2) and (a1=2) then type(j) = '1,3';
if (aa = 2) and (a1=3) then type(j) = '1,2';
if (aa = 1) then type(j)=put(a1, $2.);
end;
output;
end;
run;
/*****************************************************/
/**** finding the records of t_a which match with ****/
/**** at least 1 element of all 15 type(*) of t_b ****/
/*****************************************************/
data t_results(keep=id type1-type15);
length id 8.;
array zTyp(15) $12.;
array type(15) $2.;
do until(last.id);
set t_b(rename=(type1-type15 = zTyp1-zTyp15));
by id;
end;
do until(last.id);
set t_a;
by id;
if first.id then do; zResult=0; end;
do i = 1 to 15;
z1 = sign(findw(zTyp(i),strip(type(i))));
zResult+(z1);
end;
if zResult=15 then do; output; end;
zResult=0;
end;
run;
The resulting table t_results (these records came from t_a):
=================================================================================================================================
id type1 type2 type3 type4 type5 type6 type7 type8 type9 type10 type11 type12 type13 type14 type15
1 2 2 2 1 1 3 1 3 1 1 1 1 2 1 3
1 2 1 3 2 1 2 3 3 1 3 1 3 2 2 3
1 1 1 3 3 1 3 1 1 2 3 1 2 1 1 1
4 1 2 2 3 2 1 2 2 1 2 2 3 2 2 3
=================================================================================================================================
Hope this helps.
Hi, Dear Charlotte :
I think @slchen make some sense . But you'd better use indexw( , ',') or findw( ,',') to avoid to unnecessary error.
Dear Xia, Thank you as always. I was actually thinking of writing to you but I wasn't sure of what time would it be in Beijing and how busy you are during your day while I guess it would be early hours in the morning for me here in England. Right now, it is 4:10 pm here.
Should i simply replace Index with indexw. I am wondering about the ','? Can you please illustrate this line in the example provided by Slchen.
How are you enjoying summer?yYou should probably visit England .
Many thanks my dear friend and well wisher.
Sincerely,
Charlotte
Dear Charlotte,
There is eight hours between you and me .
Nothing for me in this summary, just reading ,learning ......
I'd like to visit England ,like to see big bell,But I have no money . Hope you could visit China either .
Now come back to question .
I think @slchen 's code could be changed like this:
if ( index(_type1,strip(type1))>0 or (missing(_type1) and missing(type1))) and
( index(_type2,strip(type2))>0 or (missing(_type1) and missing(type1))) and
( index(_type3,strip(type3))>0 or (missing(_type1) and missing(type1))) then output;
Here is my code:
data A;
input ID $ / Type1 $ / type2 $ / type3 $;
cards;
ABC01
25
N
.
ABC01
25
N
A1
ABC01
11
Y
A5
ABC01
55
k
T1
JKL03
39
N
A5
JKL03
41
Y
A5
JKL03
40
N
T1
JKL03
39
Y
A1
;
run;
data B;
input ID $ / Type1 $20. / type2 $20. / type3 $20.;
cards;
ABC01
25,11,35,45
N,Y
T1,A1,A5
JKL03
39,40
N
T1,A1
;
run;
data want;
merge A B(rename=(type1-type3=_t1-_t3));
by id;
array x{*} $ type:;
array y{*} $ _t:;
matched=1;
do i=1 to dim(x);
if not findw(y{i},strip(x{i})) then matched=0;
if missing(x{i}) and missing(y{i}) then matched=1;
if matched=0 then leave;
end;
if matched then output;
drop i matched _t:;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.