/* is there a way to have upcase(key:) in order to not have case issues in the key variables */
/* with SQL, the solution is :
upcase(a.key)=upcase(b.key)
*/
data want (drop=rc);
/*iterate left data set*/
set have ;
/*declare variables from hash set*/
length rate_1 rc 8 ;
/*declare hash*/
if _n_=1 then
do;
declare hash hhh(dataset: "table_ref (keep=rate_1 BUSINESS_AREA )", multidata:'y');
hhh.DefineKey('BUSINESS_AREA');
hhh.DefineData(ALL:'YES');
hhh.DefineDone();
end;
do until(DerniereObservation);
set have end=DerniereObservation;
call missing(rate_1);
rc = hhh.find(key:BUSINESS_AREA);
if rc ne 0 then
do;
var =.;
output;
end;
do while (rc=0);
var =rate_1*10;
output;
rc = hhh.find_next();
end;
end;
run;
Assuming this is about the casing of the actual values and not the variable names holding the values.
Any string operation will have some impact on performance - but changing the case to align your values is something you can't avoid.
Any read/write operation will impact on performance - you can code to minimize passes through the data. Below an approach how to do so.
data ref;
set sashelp.class(keep=name age);
name=upcase(name);
run;
data have;
set sashelp.class(drop=age);
name=propcase(name);
run;
data want(drop=_:);
if _n_=1 then
do;
/* declare the hash */
dcl hash h1();
h1.defineKey('name');
h1.defineData('age');
h1.defineDone();
/* load the hash */
do until(last);
set ref end=last;
name=lowcase(name);
h1.ref();
end;
end;
set have;
_name_lower=lowcase(name);
_rc=h1.find(key:_name_lower);
run;
Can you not just before that step:
data have; set have; key=upcase(key); run;
You might be able to do it in the same step as the hash, not sure as never used hash.
You may up case the KEY before loading the hash table. Since there will be few up cases in the KEY, it will be slightly efficient to use low case for there will be less work for the function.
cheers
DATASP
It is not possible to do what you want.
If you can't upcase the values in the source table before loading it in the hash table, 2 options:
1 - upcase them as you go and load the hash table row by row with hhh.add()
2 - If there are only 2 possible cases, try to find twice: hhh.find() and hhh.find(key:upcase(BUSINESS_AREA))
Your question indicates messy data though; strive for clean data and you wan't have to worry about this kind of issue.
oh ok . it is not possible to apply the same as in SQL meaning upcase in the 2 tables directly :(.
1. so i need to modify before the 2 tables for all characters variables concerned . i did'nt want to do this ,i was thinking that it could be possible in the HASH ( more efficient ? ) ... i wanted to try something directly in the hash function.
2. there is more than 2 possibilities ,
hhh.find(key:upcase(BUSINESS_AREA)) only upcase in the second table ( table_ref2) , not in the set table .
if _n_=1 then
do;
declare hash hhh(dataset: "table_ref2 (keep=rate_1 BUSINESS_AREA )", multidata:'y');
hhh.DefineKey('BUSINESS_AREA');
hhh.DefineData(ALL:'YES');
hhh.DefineDone();
end;
do until(DerniereObservation);
set have2 end=DerniereObservation;
call missing(rate_1);
rc = hhh.find(key:BUSINESS_AREA);
rc2= hhh.find(key:lowcase(BUSINESS_AREA));
Thanks for your responses,
i have many character variables in the key ( here an example with 2 ) and the case issue concern the 2 tables ( the orignal table and the ref table i want to join ) .
for example i can have in the original table : All_buisness and in the ref table ALL_BUISNESS , so i need to apply either upcase/lowcase in the first but also in th second table
i found a solution by creating SAS datasets views ..... but do you know the impact in performance ... ( big datasets ...)
data table_ref2 / view=table_ref2;
set table_ref;
BUSINESS_AREA = lowcase(BUSINESS_AREA);
run;
data have2 / view=have2;
set have;
BUSINESS_AREA = lowcase(BUSINESS_AREA);
run;
data have3(drop=rc);
/*iterate left data set*/
set have2 ;
/*declare variables from hash set*/
length rate_1 rc 8 ;
/*declare hash*/
if _n_=1 then
do;
declare hash hhh(dataset: "table_ref2(keep=rate_1 BUSINESS_AREA TYPE_OF_POL )", multidata:'y');
hhh.DefineKey('BUSINESS_AREA','TYPE_OF_POL');
hhh.DefineData(ALL:'YES');
hhh.DefineDone();
end;
do until(DerniereObservation);
set have2 end=DerniereObservation;
call missing(rate_1);
rc = hhh.find(key:BUSINESS_AREA,key:TYPE_OF_POL);
if rc ne 0 then
do;
var =.;
output;
end;
do while (rc=0);
var =rate_1*10;
output;
rc = hhh.find_next();
end;
end;
run;
i am not talking avout SAS names but the values of the variables
>i am not talking avout SAS names but the values of the variables
This was not obvious from your confusing answer:
for example i can have in the original table : All_buisness and in the ref table ALL_BUISNESS , so i need to apply either upcase/lowcase in the first but also in th second table
Assuming this is about the casing of the actual values and not the variable names holding the values.
Any string operation will have some impact on performance - but changing the case to align your values is something you can't avoid.
Any read/write operation will impact on performance - you can code to minimize passes through the data. Below an approach how to do so.
data ref;
set sashelp.class(keep=name age);
name=upcase(name);
run;
data have;
set sashelp.class(drop=age);
name=propcase(name);
run;
data want(drop=_:);
if _n_=1 then
do;
/* declare the hash */
dcl hash h1();
h1.defineKey('name');
h1.defineData('age');
h1.defineDone();
/* load the hash */
do until(last);
set ref end=last;
name=lowcase(name);
h1.ref();
end;
end;
set have;
_name_lower=lowcase(name);
_rc=h1.find(key:_name_lower);
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.