Hi,
I have to compare two variables that both of them contain values in Hebrew : Name and SecurityName. I my opinion Name variable is ok. Because when i create the values by myself and run compare function versus Name variable i get result value 0. The problem with SecurityName variable that comes from excel that i attached. Also i attached dataset.
When i use compare function, the result i got is not correct. Maybe the problem that i read file from xlsx file and proc import doesn't support encoding and in some forums i read that you cannot control the encoding of XLSX format file.
The first and last observations are identical, but the result is not correct.
When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.
I read xlsx files using proc import through call execute.
I tried kcompare also but the same results
result = compare(Name, SecurityName);
So what i should do?
Thank you
@AlexeyS wrote:
Please see original post, i added full information, dataset and excel. Thank you
@AlexeyS Thanks! That helped.
Add the following to your code and things will start working:
Securityname=kcompress(Securityname,'00'x);
For some reason your variable SecurityName is padded with hex 00 and though the strings in the two variables are actually always different.
data check;
set test.test;
l1=lengthn(name);
l2=lengthn(Securityname);
h1=put(name,hex72.);
h2=put(Securityname,hex72.);
run;
proc print data=check;
run;
The Excel you've provided is not the source with which you've created the test SAS file. For this reason I couldn't further investigate what might have caused these trailing hex 00 values.
libname test xlsx '....';
data check;
set test.test;
l2=lengthn(Securityname);
h2=put(Securityname,hex72.);
run;
proc print data=check;
run;
Hi AlexeyS,
Could you please provide a sample of values and result you get. Also, could you verify that you are running SAS session with UTF-8 encoding?
Hi,
The first and last observations are identical, but the result is not correct.
When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.
I read xlsx files using proc import through call execute.
I tried kcompare also but the same results
result0 = compare(Name, SecurityName);
result1 = compare(strip(Name), strip(SecurityName), 'ln');
result2 = compare(strip(Name), strip(SecurityName), 'nl');
result3 = compare(strip(Name), strip(SecurityName), 'n');
result4 = compare(strip(Name), strip(SecurityName));
When i run
proc options option=encoding;
run;
I got : ENCODING=WHEBREW
So what i should do?
Thank you
yes i tried.
I uploaded some data and results when i run the code. Thank you
Can you please provide your data via a text file (attachment) and not just a screen-shot? This would allow us to actually replicate and then resolve what you observe.
....actually: Provide the SAS table and ideally also the Excel
Yes of course. Please see original post, i added full information, dataset and excel. Thank you
Hi,
The first and last observations are identical, but the result is not correct.
When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.
I read xlsx files using proc import through call execute.
I tried kcompare also but the same results
result0 = compare(Name, SecurityName); result1 = compare(strip(Name), strip(SecurityName), 'ln'); result2 = compare(strip(Name), strip(SecurityName), 'nl'); result3 = compare(strip(Name), strip(SecurityName), 'n'); result4 = compare(strip(Name), strip(SecurityName));
When i run
proc options option=encoding; run;
I got : ENCODING=WHEBREW
So what i should do?
Thank you
Please see original post, i added full information, dataset and excel. Thank you
@AlexeyS wrote:
Please see original post, i added full information, dataset and excel. Thank you
@AlexeyS Thanks! That helped.
Add the following to your code and things will start working:
Securityname=kcompress(Securityname,'00'x);
For some reason your variable SecurityName is padded with hex 00 and though the strings in the two variables are actually always different.
data check;
set test.test;
l1=lengthn(name);
l2=lengthn(Securityname);
h1=put(name,hex72.);
h2=put(Securityname,hex72.);
run;
proc print data=check;
run;
The Excel you've provided is not the source with which you've created the test SAS file. For this reason I couldn't further investigate what might have caused these trailing hex 00 values.
libname test xlsx '....';
data check;
set test.test;
l2=lengthn(Securityname);
h2=put(Securityname,hex72.);
run;
proc print data=check;
run;
First of all Thank you very much.
I want to ask you a question, Is there any way to identify these problems in variable and which way?
Thank you
@AlexeyS wrote:
First of all Thank you very much.
I want to ask you a question, Is there any way to identify these problems in variable and which way?
Thank you
You could search for non-printable characters and then list the hex values if found any. Below some sample code.
libname test '...';
data check;
set test.test;
length _row_num 8 _varname $32 _non_print_chars $100 _hex_non_print_chars $200;
array _cvars {*} _character_;
do _i=1 to dim(_cvars);
_non_print_chars=kcompress(_cvars[_i],,'w');
_len=klength(_non_print_chars);
if _len>0 then
do;
_fmt=cats('hex',2*_len,'.');
_hex_non_print_chars=putc(_non_print_chars,_fmt);
_varname=vname(_cvars[_i]);
_row_num=_n_;
output;
end;
end;
keep _row_num _varname _hex_non_print_chars;
rename _row_num=row_num _varname=varname _hex_non_print_chars=hex_non_print_chars;
run;
proc print data=check;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.