
Hi all,

One of my variables contains "±". I tried the code below, but it doesn't work.

------------------------------------

Variable X had multiple records:

a fox is running

a cat has ±1 years to live

an apple 

------------------------------

thank you,

purple

 

/*Get a list of SAS byte for special character*/
data _null_;
do k=1 to 255;
x=byte(k);
put k +10 x;
end;
run;

data need;
set have;
 _a=tranwrd(a,byte(177),'Plus or Minus');
run;

The log:
ERROR: Some character data was lost during transcoding in the
dataset have. Either the data contains characters that
are not representable in the new encoding or truncation
occurred during transcoding.

Tom
Super User

As your message indicates, which bytes you need to store in your variable to make it print a particular glyph depends on the ENCODING you are using.

Printable ASCII codes only cover the characters between a space ('20'x or 32 decimal) and a tilde ('7E'x or 126 decimal).

In general it would be much easier and more reproducible to just store the two-character string '+-' or perhaps the three-character string '+/-' in the variable.

 

There is a UNICODE character for that, so if your SAS session is using UTF-8 as the encoding you could store it.

https://en.wikipedia.org/wiki/Plus%E2%80%93minus_sign

 

It should be available in LATIN1 or WLATIN1 encoding also.

https://en.wikipedia.org/wiki/Western_Latin_character_sets_(computing)

 

data test;
  length latin1 $1 utf8 $2;
  latin1='B1'x;
  utf8=kcvt(latin1,'latin1','utf-8');
  put (_all_) (=:$hex./);
run;

latin1=B1
utf8=C2B1
purpleclothlady
Pyrite | Level 9

Hi Tom,

Thank you for helping.

I am using SAS V9.4 M7, and I don't know which ENCODING I am using.

How can I find that out?

Once we know which encoding should be used, can I just use this code (with the correct ENCODING) to read in the SAS dataset?

 

thanks again.

Purple

Tom
Super User

Check the system encoding of your current SAS session.

%put %sysfunc(getoption(encoding));

If you are reading from a SAS dataset then check the encoding of the dataset using PROC CONTENTS.

proc contents data=mylib.mydataset ;
run;
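
The two checks above can also be combined into one short macro snippet. This is only a sketch: MYLIB.MYDATASET is the same placeholder name as above, and it assumes the dataset can be opened (if OPEN fails, ATTRC will report an error).

```sas
/* Sketch only: MYLIB.MYDATASET is a placeholder dataset name. */
%let dsid = %sysfunc(open(mylib.mydataset));

/* Encoding of the current SAS session */
%put Session encoding: %sysfunc(getoption(encoding));

/* Encoding recorded in the dataset header */
%put Dataset encoding: %sysfunc(attrc(&dsid, encoding));

/* Always close the dataset again */
%let rc = %sysfunc(close(&dsid));
```

If the two values differ, SAS has to transcode the character data when it reads the dataset, which is where the "lost during transcoding" error can come from.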

How did you start SAS?

Did you type a command at the command line?  Do you have a separate command, or an option on the command, that picks the session encoding?
Did you click on some Windows icon?  Which one?  On Windows you should normally have an option to launch SAS with Unicode support.

Are you using some front-end tool to submit your SAS code, such as Enterprise Guide or SAS Studio?  If so, make sure you connect to a SAS application server that is using UTF-8 encoding.

 

purpleclothlady
Pyrite | Level 9

Hi Tom,

Here is what PROC CONTENTS shows for the encoding: UTF-8 Unicode (UTF-8)

purpleclothlady_0-1690301581638.png

If that is the case, should I use the UTF-8 form of the character,

UTF-8 Encoding: 0xC2 0xB1

to read in the SAS dataset? Is this correct?

 

thanks again

purple

 

 

PaigeMiller
Diamond | Level 26

I'm surprised you get the error that you show us with the code that you show us. Seems impossible.

 

No HAVE data set is created. No variable named A is created. Please, from now on, show us the EXACT code you are running, and not some close approximation.

 

This works:

 

data _null_;
do k=1 to 255;
x=byte(k);
put k +10 x;
end;
run;

data need;
set have;
 _a=tranwrd(x,byte(177),'Plus or Minus');
run;

 

--
Paige Miller
purpleclothlady
Pyrite | Level 9

@PaigeMiller and @Tom:

Thanks for helping. It works, but it didn't give the result I need.

purpleclothlady_0-1690304797982.png

I searched and tried this code, but I still got an error message in the log.


data test (encoding=UTF8);
set myfile.test;
run;

 

 

 

Tom
Super User

You seem to have attached pictures from the SAS Display Manager user interface.

 

To see if the character is working properly please print it to an ODS destination, like an HTML output and look at it there.  SAS has not updated Display Manager to really support extended characters.

 

You also need to check what encoding your SAS session is using.  If it is using LATIN1 or WLATIN1, then you should see the 'C2B1'x character string in the SAS dataset transcoded into the 'B1'x character string while SAS is using the data.

To see what characters are there, print the values with the $HEX format so that each byte is printed using two hexadecimal digits.
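
As a rough sketch of that check, assuming the dataset is called HAVE and the variable is called X as in the original question (adjust both names to your real data):

```sas
/* Sketch only: HAVE and X are the names from the original question. */
data _null_;
  set have;
  /* Write the text value, then the same value as two hex digits per byte. */
  /* $HEX80. shows the first 40 bytes of the value.                        */
  put x= / x= $hex80.;
run;
```

In the hex output, a lone B1 is the plus-minus sign in LATIN1 or WLATIN1, C2B1 is the plus-minus sign in UTF-8, and E289A5 is the UTF-8 greater-than-or-equal-to sign.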

PaigeMiller
Diamond | Level 26

If the problem occurs when you are using the ± symbol, why are you showing us the data set for other characters????

--
Paige Miller
purpleclothlady
Pyrite | Level 9

hi @Tom @PaigeMiller :

Thank you again for your support, this is really a great place to ask for SAS help.

👍

purple

 

 

/*------------------------Migrating Data from WLATIN1 to UTF-8------------------------------------
                                   REFERENCE:
 https://documentation.sas.com/doc/en/pgmsascdc/v_006/viyadatamig/p1eedruqfsgqqcn1pmjof4br5xvt.htm
/*------------------------------------------------------------------------------------------------*/
/*===Step1: Find out WHICH Encoding is on Your SAS session
    - Mine is: SAS V9.4===*/
proc options option=encoding;
run;

/*====Step2: Use ENCODING= option to transform WLATIN1 TO UTF-8 ===*/
 data need; 
 set have (encoding=wlatin1);
 _newVar=pahdx;

  if index(compress(_newvar),"1year")>0;
	/*Recode ±*/
	_new=tranwrd(_newvar,"≥", "±");
 keep pahdx _new:;
 run;

proc print data=need;
run;

PROC PRINT output (screenshot from the SAS window):

 

purpleclothlady_0-1690389438949.png

 

 

 

Tom
Super User

Do not put literal non-7bit ASCII characters into your CODE.  That is just asking for problems.

Use the hex code literals instead.

'B1'x should be the code for that plus minus character.

 

If your actual file has 3 bytes for the plus minus then perhaps it is not really UTF-8?  Or perhaps the creator of the dataset had already had their own troubles with encoding.

purpleclothlady
Pyrite | Level 9

Hi Tom,

The data came from a third party (Medidata), so I don't know what happened to the entry.

I don't understand what this means:

"Do not put literal non-7bit ASCII characters into your CODE.  That is just asking for problems.

Use the hex code literals instead.     

 'B1'x should be the code for that plus minus character."

 

 

Do I need to code it like this in the SAS Enhanced Editor on Windows 10?

 

thank you ,

Purple

 data need; 
 set cldata.mh(encoding=wlatin1);
 _newVar=pahdx;

  if index(compress(_newvar),"1year")>0;

	/*Recode ±*/
	/*_new=tranwrd(_newvar,"≥", "±");*/
    /*???hexcode???- Is this part you are talking about*/	
      _new=tranwrd(_newvar,"B1x", "±");
      keep pahdx _new:;
 run;

 

 

 

Tom
Super User

No.

Instead of typing/copying the characters into your code replace it with CODE that will create the character.

 

So instead of typing 

'±'

use

 'B1'x

They will both create the same thing when you are using LATIN1 encoding.

But if you ran the first one in a SAS session using UTF-8 encoding, you would get the two-byte string 'C2B1'x instead.

 

So instead of 

new=tranwrd(_newvar,"≥", "±");

use 

new=transwrd(_newvar,'E289A5'x,'B1'x);

Now your code file does not include those strange characters and so you will be able to work with it more easily and avoid confusion.

 

For example, the string you are converting, 'E289A5'x, is how UTF-8 represents the greater-than-or-equal-to symbol (≥), not the plus-or-minus symbol.

https://op.europa.eu/en/web/eu-vocabularies/formex/physical-specifications/character-encoding/mathem...

That symbol does not exist in the LATIN1 encoding.

 

So your code should be:

new=transwrd(_newvar,'E289A5'x,'>=');

  

 

You still did not say what encoding your SAS session is using, just which version of SAS you were using.
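
Once the session encoding is known, the whole recode can be written with hex literals only. The following is just a sketch under some assumptions: HAVE and PAHDX are the names used earlier in this thread, NEED and _NEW are illustrative output names, '+/-' is the plain ASCII replacement suggested earlier, and the dataset is assumed to be readable without transcoding errors in the current session.

```sas
/* Sketch only: HAVE and PAHDX come from the thread; NEED and _NEW are illustrative. */
%let session_enc = %sysfunc(getoption(encoding));
%put NOTE: Session encoding is &session_enc;

data need;
  set have;
  length _new $200;

  /* UTF-8 bytes of the greater-than-or-equal-to sign become plain ASCII */
  _new = tranwrd(pahdx, 'E289A5'x, '>=');

  /* Plus-minus sign: two bytes in UTF-8, one byte in LATIN1/WLATIN1 */
  if upcase("&session_enc") = 'UTF-8' then _new = tranwrd(_new, 'C2B1'x, '+/-');
  else _new = tranwrd(_new, 'B1'x, '+/-');

  keep pahdx _new;
run;
```

Because the program file contains only 7-bit ASCII, it behaves the same no matter which encoding the editor uses to save it.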

 

 

purpleclothlady
Pyrite | Level 9
@Tom:
Super!
I tested it and it worked.

One typo 🙂 (the function is TRANWRD, not TRANSWRD):
new=tranwrd(newvar, 'E289A5'x, '>=');

I did mention it in an earlier post, but I am pasting it here again.
The SAS version I am using is:
NOTE: SAS (r) Proprietary Software 9.4 (TS1M7)
Thank you,
Purple
