js5
Pyrite | Level 9

Hello,

 

Our server is running SAS in UTF-8 and we use EG for development. I am having trouble putting the ≤ symbol into a proc format:

 

proc format;
	value $avisit
		"V0" = "V0 ≤28d pre"
		"V1PRE" = "V1 pre"
		"V2" = "V2 3+2d post"
		"V3" = "V3 7+2d post"
		"V4" = "V4 28±3d post"
		"V5" = "V5 3m±7d post"
		"V6" = "V6 6m±10d post";
run;

If I save the file, ≤ gets converted to =. If I open the saved file with Notepad++, it says that the file is ANSI encoded. If I then change the encoding to UTF-8 and fix the file up, I get this:

formatutf8.png

Can this be made to work? I guess I could read a text file and use the cntlin parameter, but this seems rather excessive. Thank you in advance for your feedback.
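For reference, the cntlin route does not necessarily need an external text file. The following is only a minimal sketch, assuming a UTF-8 SAS session; the data set name work.avisit_ctl and the helper variables le and pm are made up for illustration. It relies on the UNICODE function to turn \uXXXX escapes into session-encoded characters, so the program file itself stays plain ASCII and should survive being saved as ANSI.

```sas
/* Build a control data set for PROC FORMAT; all literals are plain ASCII. */
data work.avisit_ctl;                      /* hypothetical data set name */
    length fmtname $32 type $1 start $8 label $40 le pm $4;
    retain fmtname 'AVISIT' type 'C';      /* character format $AVISIT */
    keep fmtname type start label;

    /* UNICODE() converts \uXXXX escapes to the session encoding */
    le = unicode('\u2264');                /* less-than-or-equal sign */
    pm = unicode('\u00B1');                /* plus-minus sign */

    start = 'V0';    label = catx(' ', 'V0', cats(le, '28d pre'));        output;
    start = 'V1PRE'; label = 'V1 pre';                                    output;
    start = 'V2';    label = 'V2 3+2d post';                              output;
    start = 'V3';    label = 'V3 7+2d post';                              output;
    start = 'V4';    label = catx(' ', 'V4', cats('28', pm, '3d post'));  output;
    start = 'V5';    label = catx(' ', 'V5', cats('3m', pm, '7d post'));  output;
    start = 'V6';    label = catx(' ', 'V6', cats('6m', pm, '10d post')); output;
run;

/* Create the format from the control data set. */
proc format cntlin=work.avisit_ctl;
run;
```

The format can then be used as usual, for example put(avisit, $avisit.) in a data step.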

 

 

Accepted Solution
yabwon
Amethyst | Level 16

Before saving the file, select the proper encoding. In EG 8 it looks like this:

yabwon_0-1685525339176.png

For EG7 it looks a bit different but is there too.

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation




7 REPLIES
svh
Lapis Lazuli | Level 10

I see the issue, but I don't know enough about changing the encoding to say whether that would work.

 

I usually embed special characters in my formats with this kind of syntax. For example, in this case the Unicode code point for the ≤ (less-than-or-equal) sign is 2264:

 

proc format;
   value quantity
      1 = 'Never'
      2 = "1(*ESC*){unicode '2264'x}5 visits"
      3 = "6(*ESC*){unicode '2264'x}10 visits"
   ;
run;
js5
Pyrite | Level 9

This works for ODS output but not if I wish to have Unicode symbols in my datasets. I have reached out to SAS support about this, as it seems quite misleading to claim to "support" Unicode (which dates back to the 90s) while requiring the code itself to be plain ASCII. I am guessing I would have similar issues if I had to refer to variables or values containing characters not representable in ASCII.
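To illustrate the difference, here is a small sketch, assuming a UTF-8 session; the data set and variable names are made up. The ODS escape sequence is stored in the data set as literal text and only rendered by ODS destinations, whereas a hex literal holding the UTF-8 bytes of U+2264 (E2 89 A4) puts the actual character into the value while keeping the source file plain ASCII.

```sas
data work.symbol_demo;                     /* hypothetical data set name */
    length ods_text native_text $40;

    /* ODS escape sequence: stored verbatim, only ODS turns it into a symbol */
    ods_text = "1(*ESC*){unicode '2264'x}5 visits";

    /* 'E289A4'x is the UTF-8 byte sequence for U+2264 */
    native_text = cats('1', 'E289A4'x, '5 visits');

    /* Write both values to the log to compare what is actually stored */
    put ods_text= / native_text=;
run;
```

In a session that is not UTF-8 the hex literal would not be interpreted as ≤, so this only applies to the setup described in the question.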


js5
Pyrite | Level 9

Thanks, it worked. The difference compared to the manually prepared Unicode file is that SAS EG saves it with a byte-order mark: UTF-8-BOM as opposed to UTF-8. Can the default encoding be changed?
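Whether a saved program carries the BOM can also be checked from SAS itself instead of Notepad++. This is only a sketch: the path is a placeholder, the file must be visible to the SAS server, and it assumes that RECFM=N hands back the raw bytes without any BOM handling in your environment.

```sas
data _null_;
    /* Placeholder path; RECFM=N reads the file as a byte stream */
    infile "C:\temp\formats.sas" recfm=n;

    /* Read the first three bytes of the file */
    input first3 $char3.;

    /* The UTF-8 byte-order mark is EF BB BF */
    if first3 = 'EFBBBF'x then put 'File starts with a UTF-8 BOM.';
    else put 'No UTF-8 BOM found; first bytes: ' first3 $hex6.;

    stop;   /* only the first three bytes are needed */
run;
```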

yabwon
Amethyst | Level 16

True, it saves it as UTF-8-BOM and it looks like there is no UTF-8-NOBOM on the list.

 

I didn't find any option in the "Tools -> Options ->" menu to set the default encoding... The only thing that comes to mind is that maybe there is a Windows registry key to edit for that. The first person I would ask about such a possibility is @ChrisHemedinger. (In general, Chris knows a lot about EG, so he is a good point of contact 😉)

 

Bart




js5
Pyrite | Level 9

According to Wikipedia, BOM should not be necessary to recognise a file as UTF-8 but many programs need it regardless [1]:


"The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use. (...) Microsoft compilers and interpreters, and many pieces of software on Microsoft Windows such as Notepad (prior to Windows 10 Build 1903) treat the BOM as a required magic number rather than use heuristics. These tools add a BOM when saving text as UTF-8, and cannot interpret UTF-8 unless the BOM is present or the file contains only ASCII."

Setting UTF-8-BOM as the default would definitely be useful, as otherwise one has to actively scan the code for symbols not representable in ASCII, which is not very realistic. Moreover, the option to select the encoding only appears when using File -> Save as or the respective button, but not when going via Properties -> Save as, which makes it super easy to miss.

[1] https://en.wikipedia.org/wiki/Byte_order_mark

ChrisHemedinger
Community Manager

Copying this from another related discussion -- in general it's better to detect UTF-8 by examining contents and not relying on BOM. But some systems might still rely on it.

 

"Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature"