Jeg123
Calcite | Level 5

I am trying to import thousands of XML files, and the ones that contain an apostrophe (?) do not seem to work. Some examples are:

 

- á

- É

- ŕ

I read that this can be solved with xmlprocess=permit. However, I still get the same error. Are there any other solutions?

 

6 REPLIES 6
ballardw
Super User

"Does not seem to work" is awfully vague.

Are there errors in the log? Post the code and log in a code box opened with the "</>" icon to preserve the formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.

 

Where are these characters? In file names? In the path to files? Names of columns? Values in columns?

 

Those are not apostrophes. The first two are accented vowels; the accent indicates how the vowel is pronounced. I don't recognize the third one, but it is likely a character from a different foreign language.

Which brings up questions such as which operating system you are using (it may affect case and characters in file descriptors) and which language setting your SAS session is using.
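
If a character is hard to identify by eye, Python's standard `unicodedata` module can report its code point and official Unicode name. The following is a minimal sketch, not something posted in the thread. It assumes the three characters were pasted as single precomposed code points, and `describe` is a hypothetical helper name.

```python
import unicodedata


def describe(chars):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (c, f"U+{ord(c):04X}", unicodedata.name(c, "UNKNOWN"))
        for c in chars
    ]


# The three characters from the question, written as escapes so the
# script itself does not depend on the editor's encoding.
for _char, code_point, name in describe("\u00e1\u00c9\u0155"):
    print(code_point, name)
```

The third character is reported as U+0155, LATIN SMALL LETTER R WITH ACUTE, a letter used in Slovak.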

Jeg123
Calcite | Level 5

The error I get is:

 

ERROR: Some code points did not transcode.
occurred at or near line 41982, column 57
ERROR: XML parsing error. Please verify that the XML content is well-formed.

 

I tried reproducing it with an example, but I can't get the same error. In the example below, SAS reads the data, but the characters look wrong when I view them in SAS. Indeed, I believe it has to do with some language setting. It looks like the characters are Czech.

 

test_xml.xml:

 

<ssf:SSF>
<app>
<char_field>A</char_field>
</app>
<app>
<char_field>B</char_field>
</app>
<app>
<char_field>á</char_field>
</app>
<app>
<char_field>É</char_field>
</app>
<app>
<char_field>ŕ</char_field>
</app>
</ssf:SSF>

 

test.map:

 

<?xml version="1.0" encoding="UTF-8"?>
<SXLEMAP version="2.1" name="SXLEMAP">
<!-- ############################################################ -->
<TABLE name="app">
<TABLE-PATH syntax="XPath">/ssf:SSF/app</TABLE-PATH>

<COLUMN name="char_field">
<PATH syntax="XPath">/ssf:SSF/app/char_field</PATH>
<TYPE>character</TYPE>
<DATATYPE>string</DATATYPE>
<LENGTH>255</LENGTH>
</COLUMN>

</TABLE>


</SXLEMAP>

 

filename temp_pl 'loc/test_xml.xml';

filename map_temp 'loc/test.map';

libname temp_pl xmlv2 xmlmap = map_temp;

data test;
set temp_pl.app;
run;
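
One way to find out which characters trigger the "Some code points did not transcode" error is to scan the files outside SAS. The following is a minimal Python sketch, not something posted in the thread. It assumes the XML files are UTF-8 and that the SAS session uses a Windows Latin 1 code page (`cp1252` in Python); neither is confirmed above. `find_unencodable` and `scan_file` are hypothetical helper names, and the reported line and column numbers may not match the ones SAS prints.

```python
from pathlib import Path


def find_unencodable(text, target_encoding="cp1252"):
    """Return (line, column, character) for every character that cannot
    be represented in target_encoding. Lines and columns are 1-based."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        try:
            line.encode(target_encoding)
            continue  # whole line fits, nothing to report
        except UnicodeEncodeError:
            pass
        for col_no, char in enumerate(line, start=1):
            try:
                char.encode(target_encoding)
            except UnicodeEncodeError:
                problems.append((line_no, col_no, char))
    return problems


def scan_file(path, source_encoding="utf-8", target_encoding="cp1252"):
    """Read one file (assumed to be source_encoding) and scan it."""
    text = Path(path).read_text(encoding=source_encoding)
    return find_unencodable(text, target_encoding)


# Small demonstration using two rows from the example file.
sample = "<char_field>\u00e1</char_field>\n<char_field>\u0155</char_field>"
for line_no, col_no, char in find_unencodable(sample):
    print(f"line {line_no}, column {col_no}: U+{ord(char):04X}")
```

Under this assumption, á and É fit in the code page but ŕ does not. Passing `target_encoding="cp1250"` (Central European) reports no problems for the same sample.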

 

andreas_lds
Jade | Level 19

Changing the SAS session encoding to UTF-8 should solve the problem. The encoding can only be set when the SAS session starts. If SAS runs on a server, you will need to contact an administrator.

Jeg123
Calcite | Level 5

Is there no way to specify the encoding when reading the file itself? Indeed, SAS is on a server, and it seems they do not want to change the encoding.

andreas_lds
Jade | Level 19

Can you change the file encoding to UTF-8 with BOM (Notepad++ can do this)? SAS should recognize it and then try to read/convert the characters. But if those characters are not in the current code page, SAS can't read the data properly.
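
For a large number of files, the BOM could also be added with a script instead of an editor. The following is a minimal Python sketch, not a tested SAS workflow. It assumes the files are already valid UTF-8 and sit in one folder; `add_utf8_bom` and `add_bom_to_folder` are hypothetical helper names. It rewrites files in place, so it is safer to run it on a copy of the data.

```python
import codecs
from pathlib import Path


def add_utf8_bom(path):
    """Prepend a UTF-8 BOM to the file unless it already has one.

    Returns True if the file was rewritten, False if it was left alone.
    Raises UnicodeDecodeError if the content is not valid UTF-8.
    """
    path = Path(path)
    data = path.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return False
    data.decode("utf-8")  # validation only; the result is discarded
    path.write_bytes(codecs.BOM_UTF8 + data)
    return True


def add_bom_to_folder(folder, pattern="*.xml"):
    """Add a BOM to every matching file; return the names that changed."""
    changed = []
    for xml_path in sorted(Path(folder).glob(pattern)):
        if add_utf8_bom(xml_path):
            changed.append(xml_path.name)
    return changed
```

Calling `add_bom_to_folder("loc")` would process every `.xml` file in the `loc` folder used in the example above. This only changes how the file announces its encoding; characters that do not exist in the session's code page would still fail to transcode.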

Jeg123
Calcite | Level 5

I can't install Notepad++, and even if I could, I still have thousands of files, which would just take too much time.
