BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AlexeyS
Pyrite | Level 9

Hi,

I have to compare two variables that both of them contain values in Hebrew : Name and SecurityName. I my opinion Name variable is ok. Because when i create the values by myself and run compare function versus Name variable i get result value 0. The problem with SecurityName variable that comes from excel that i attached. Also i attached dataset.

When i use compare function, the result i got is not correct. Maybe the problem that i read file from xlsx file and proc import doesn't support encoding and  in some forums i read that you cannot control the encoding of XLSX format file.

 

The first and last observations are identical, but the result is not correct.

When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.

I  read xlsx files using proc import through call execute.

 

I tried kcompare also but the same results

result = compare(Name, SecurityName);

 

Capture.PNG

 

So what i should do? 

Thank you

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@AlexeyS wrote:

Please  see original post, i added full information, dataset and excel. Thank you


@AlexeyS Thanks! That helped.

Add the following to your code and things will start working:

Securityname=kcompress(Securityname,'00'x);

For some reason your variable SecurityName is padded with hex 00 and though the strings in the two variables are actually always different.

data check;
  set test.test;
  l1=lengthn(name);
  l2=lengthn(Securityname);
  
  h1=put(name,hex72.);
  h2=put(Securityname,hex72.);
run;
proc print data=check;
run;

Patrick_0-1640913593517.png

 

The Excel you've provided is not the source with which you've created the test SAS file. For this reason I couldn't further investigate what might have caused these trailing hex 00 values.

libname test xlsx '....';
data check;
  set test.test;
  l2=lengthn(Securityname);
  
  h2=put(Securityname,hex72.);
run;
proc print data=check;
run;

Patrick_0-1640913938422.png

 

View solution in original post

11 REPLIES 11
LeonidBatkhan
Lapis Lazuli | Level 10

Hi AlexeyS,

Could you please provide a sample of values and result you get. Also, could you verify that you are running SAS session with UTF-8 encoding?

AlexeyS
Pyrite | Level 9

Hi,

The first and last observations are identical, but the result is not correct.

When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.

I  read xlsx files using proc import through call execute.

 

I tried kcompare also but the same results

result0 = compare(Name, SecurityName);
result1 = compare(strip(Name), strip(SecurityName), 'ln');
result2 = compare(strip(Name), strip(SecurityName), 'nl');
result3 = compare(strip(Name), strip(SecurityName), 'n');
result4 = compare(strip(Name), strip(SecurityName));

 

Capture.PNG

 

When i run 

proc options  option=encoding;
run;

I got : ENCODING=WHEBREW

 

So what i should do? 

Thank you

Ksharp
Super User
Did you try KCOMPARE() function ?
AlexeyS
Pyrite | Level 9

yes i tried.

I uploaded some data and results when i run the code. Thank you

Patrick
Opal | Level 21

Can you please provide your data via a text file (attachment) and not just a screen-shot? This would allow us to actually replicate and then resolve what you observe.

....actually: Provide the SAS table and ideally also the Excel

AlexeyS
Pyrite | Level 9

Yes of course. Please  see original post, i added full information, dataset and excel. Thank you

AlexeyS
Pyrite | Level 9

Hi,

The first and last observations are identical, but the result is not correct.

When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.

I  read xlsx files using proc import through call execute.

 

I tried kcompare also but the same results

result0 = compare(Name, SecurityName);
result1 = compare(strip(Name), strip(SecurityName), 'ln');
result2 = compare(strip(Name), strip(SecurityName), 'nl');
result3 = compare(strip(Name), strip(SecurityName), 'n');
result4 = compare(strip(Name), strip(SecurityName));

 

AlexeyS_0-1640805864219.png

 

 

When i run 

proc options  option=encoding;
run;

I got : ENCODING=WHEBREW

 

So what i should do? 

Thank you



 

AlexeyS
Pyrite | Level 9

Please  see original post, i added full information, dataset and excel. Thank you

Patrick
Opal | Level 21

@AlexeyS wrote:

Please  see original post, i added full information, dataset and excel. Thank you


@AlexeyS Thanks! That helped.

Add the following to your code and things will start working:

Securityname=kcompress(Securityname,'00'x);

For some reason your variable SecurityName is padded with hex 00 and though the strings in the two variables are actually always different.

data check;
  set test.test;
  l1=lengthn(name);
  l2=lengthn(Securityname);
  
  h1=put(name,hex72.);
  h2=put(Securityname,hex72.);
run;
proc print data=check;
run;

Patrick_0-1640913593517.png

 

The Excel you've provided is not the source with which you've created the test SAS file. For this reason I couldn't further investigate what might have caused these trailing hex 00 values.

libname test xlsx '....';
data check;
  set test.test;
  l2=lengthn(Securityname);
  
  h2=put(Securityname,hex72.);
run;
proc print data=check;
run;

Patrick_0-1640913938422.png

 

AlexeyS
Pyrite | Level 9

First of all Thank you very much. 

I want to ask you a question, Is there any way to identify these problems in variable and which way?

Thank you 

 

Patrick
Opal | Level 21

@AlexeyS wrote:

First of all Thank you very much. 

I want to ask you a question, Is there any way to identify these problems in variable and which way?

Thank you 

 


You could search for non-printable characters and then list the hex values if found any. Below some sample code.

libname test '...';
data check;
  set test.test;
  length _row_num 8 _varname $32 _non_print_chars $100 _hex_non_print_chars $200;
  array _cvars {*} _character_;
  do _i=1 to dim(_cvars);
  	_non_print_chars=kcompress(_cvars[_i],,'w');
    _len=klength(_non_print_chars);
	if _len>0 then
      do;
        _fmt=cats('hex',2*_len,'.');
		_hex_non_print_chars=putc(_non_print_chars,_fmt);
		_varname=vname(_cvars[_i]);
        _row_num=_n_;
		output;
	  end;
  end;
  keep _row_num _varname _hex_non_print_chars;
  rename _row_num=row_num _varname=varname _hex_non_print_chars=hex_non_print_chars;
run;
proc print data=check;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 3607 views
  • 2 likes
  • 4 in conversation