BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AlexeyS
Pyrite | Level 9

Hi,

I have to compare two variables that both of them contain values in Hebrew : Name and SecurityName. I my opinion Name variable is ok. Because when i create the values by myself and run compare function versus Name variable i get result value 0. The problem with SecurityName variable that comes from excel that i attached. Also i attached dataset.

When i use compare function, the result i got is not correct. Maybe the problem that i read file from xlsx file and proc import doesn't support encoding and  in some forums i read that you cannot control the encoding of XLSX format file.

 

The first and last observations are identical, but the result is not correct.

When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.

I  read xlsx files using proc import through call execute.

 

I tried kcompare also but the same results

result = compare(Name, SecurityName);

 

Capture.PNG

 

So what i should do? 

Thank you

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@AlexeyS wrote:

Please  see original post, i added full information, dataset and excel. Thank you


@AlexeyS Thanks! That helped.

Add the following to your code and things will start working:

Securityname=kcompress(Securityname,'00'x);

For some reason your variable SecurityName is padded with hex 00 and though the strings in the two variables are actually always different.

data check;
  set test.test;
  l1=lengthn(name);
  l2=lengthn(Securityname);
  
  h1=put(name,hex72.);
  h2=put(Securityname,hex72.);
run;
proc print data=check;
run;

Patrick_0-1640913593517.png

 

The Excel you've provided is not the source with which you've created the test SAS file. For this reason I couldn't further investigate what might have caused these trailing hex 00 values.

libname test xlsx '....';
data check;
  set test.test;
  l2=lengthn(Securityname);
  
  h2=put(Securityname,hex72.);
run;
proc print data=check;
run;

Patrick_0-1640913938422.png

 

View solution in original post

11 REPLIES 11
LeonidBatkhan
Lapis Lazuli | Level 10

Hi AlexeyS,

Could you please provide a sample of values and result you get. Also, could you verify that you are running SAS session with UTF-8 encoding?

AlexeyS
Pyrite | Level 9

Hi,

The first and last observations are identical, but the result is not correct.

When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.

I  read xlsx files using proc import through call execute.

 

I tried kcompare also but the same results

result0 = compare(Name, SecurityName);
result1 = compare(strip(Name), strip(SecurityName), 'ln');
result2 = compare(strip(Name), strip(SecurityName), 'nl');
result3 = compare(strip(Name), strip(SecurityName), 'n');
result4 = compare(strip(Name), strip(SecurityName));

 

Capture.PNG

 

When i run 

proc options  option=encoding;
run;

I got : ENCODING=WHEBREW

 

So what i should do? 

Thank you

Ksharp
Super User
Did you try KCOMPARE() function ?
AlexeyS
Pyrite | Level 9

yes i tried.

I uploaded some data and results when i run the code. Thank you

Patrick
Opal | Level 21

Can you please provide your data via a text file (attachment) and not just a screen-shot? This would allow us to actually replicate and then resolve what you observe.

....actually: Provide the SAS table and ideally also the Excel

AlexeyS
Pyrite | Level 9

Yes of course. Please  see original post, i added full information, dataset and excel. Thank you

AlexeyS
Pyrite | Level 9

Hi,

The first and last observations are identical, but the result is not correct.

When i tried to create the same data and values by myself and run the same compare function i got the right result, i.e maybe something in the encoding or in excel files i read.

I  read xlsx files using proc import through call execute.

 

I tried kcompare also but the same results

result0 = compare(Name, SecurityName);
result1 = compare(strip(Name), strip(SecurityName), 'ln');
result2 = compare(strip(Name), strip(SecurityName), 'nl');
result3 = compare(strip(Name), strip(SecurityName), 'n');
result4 = compare(strip(Name), strip(SecurityName));

 

AlexeyS_0-1640805864219.png

 

 

When i run 

proc options  option=encoding;
run;

I got : ENCODING=WHEBREW

 

So what i should do? 

Thank you



 

AlexeyS
Pyrite | Level 9

Please  see original post, i added full information, dataset and excel. Thank you

Patrick
Opal | Level 21

@AlexeyS wrote:

Please  see original post, i added full information, dataset and excel. Thank you


@AlexeyS Thanks! That helped.

Add the following to your code and things will start working:

Securityname=kcompress(Securityname,'00'x);

For some reason your variable SecurityName is padded with hex 00 and though the strings in the two variables are actually always different.

data check;
  set test.test;
  l1=lengthn(name);
  l2=lengthn(Securityname);
  
  h1=put(name,hex72.);
  h2=put(Securityname,hex72.);
run;
proc print data=check;
run;

Patrick_0-1640913593517.png

 

The Excel you've provided is not the source with which you've created the test SAS file. For this reason I couldn't further investigate what might have caused these trailing hex 00 values.

libname test xlsx '....';
data check;
  set test.test;
  l2=lengthn(Securityname);
  
  h2=put(Securityname,hex72.);
run;
proc print data=check;
run;

Patrick_0-1640913938422.png

 

AlexeyS
Pyrite | Level 9

First of all Thank you very much. 

I want to ask you a question, Is there any way to identify these problems in variable and which way?

Thank you 

 

Patrick
Opal | Level 21

@AlexeyS wrote:

First of all Thank you very much. 

I want to ask you a question, Is there any way to identify these problems in variable and which way?

Thank you 

 


You could search for non-printable characters and then list the hex values if found any. Below some sample code.

libname test '...';
data check;
  set test.test;
  length _row_num 8 _varname $32 _non_print_chars $100 _hex_non_print_chars $200;
  array _cvars {*} _character_;
  do _i=1 to dim(_cvars);
  	_non_print_chars=kcompress(_cvars[_i],,'w');
    _len=klength(_non_print_chars);
	if _len>0 then
      do;
        _fmt=cats('hex',2*_len,'.');
		_hex_non_print_chars=putc(_non_print_chars,_fmt);
		_varname=vname(_cvars[_i]);
        _row_num=_n_;
		output;
	  end;
  end;
  keep _row_num _varname _hex_non_print_chars;
  rename _row_num=row_num _varname=varname _hex_non_print_chars=hex_non_print_chars;
run;
proc print data=check;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1253 views
  • 2 likes
  • 4 in conversation