alepage
Barite | Level 11

Hello,

 

I am using a macro function to download an XML file from a web site. It was working properly, but for the past few days we have been getting the following errors. How do we solve this kind of issue?

 

ORDERDATE=20240701
ORDERTIME=080000
SURVEYNAME=BROKERS_QUEBEC_SURVEYS_RESPONSES
" BROKERS_QUEBEC_SURVEYS_RESPONSES"
'you are an authorized user in stha8n09z'
SYS_PROCHTTP_STATUS_CODE=200
{
"access_token": "51ebfa1c-7c45-4305-9658-9fa0008d91b3",
"token_type": "Bearer",
"expires_in": 3599,
"scope": "manage:all"
}
BEARERTOKEN=...0008d91b3 BTEXPIN= 3599 BEARERTIMER_START=2035560918.51997
"==================================================================="
SURVEYID=SV_0vnXEeLgN8B46bA
ZNAME=Zfile13
"===========WE ARE STARTING TO PROCESS SURVEY SV_0vnXEeLgN8B46bA ==========="
"========================== PROGRESSID=ES_aVJxgqv3peSjDkG =========================="
PSTATUS=inProgress
PSTATUS=inProgress
PSTATUS=inProgress
PSTATUS=complete
FILEID=c33612d3-e2dd-45d1-bef4-aeee06e12397-def
Archive: /finsys/.../data/Zfile13.zip
inflating: /finsys/.../data/Virage Survey.xml
XMLFILE=Virage Survey.xml
XMLFILE2=VirageSurvey.xml
XMLMAP2=VirageSurvey.map
FILE_PATH=/finsys/.../data
ERROR: Some code points did not transcode.
occurred at or near line 160740, column 109
ERROR: XML parsing error. Please verify that the XML content is well-formed.
ERROR: File WORK.Response.DATA has not been saved because copy could not be completed.
ERROR: Some code points did not transcode.
occurred at or near line 160740, column 109
ERROR: XML parsing error. Please verify that the XML content is well-formed.
ERROR: File WORK.Responses.DATA has not been saved because copy could not be completed.

 

 

 

Partial code inside a macro function:

filename oscmd pipe "unzip -d &file_path. -jo &file_path./&Zname..zip";
     
data _null_;
	infile oscmd;
    input;
    put _infile_;
run;

/******* Get the xml file name********/

Filename adb_xml pipe "ls -Art &file_path./*.xml | tail -n 1 ";
 
DATA xml_filelist;
Infile adb_xml truncover;
Input 	infile_name $100.;
Filename=scan(infile_name,-1,"/","b");
call symput ('xmlFile',strip(Filename));
call symput ('cn_xmlFile1',strip(infile_name));
RUN;
  
/** Declaring other macro variables
    Please don't move those macro. They need the value of xmlFile from the
    above call symput statement                                           ***/


%let xmlFile2=%sysfunc(compress(&xmlFile));
%let xmlMap2=%substr(&xmlFile2.,1,%length(&xmlFile2.)-4).map;
%put &=xmlFile;
%put &=xmlFile2;
%put &=xmlMap2;
%put &=file_path.;

/** Renaming the XML File with SAS standard (no space in the file name) ***/

filename oscmd pipe "mv ""&file_path./&xmlFile."" ""&file_path./&xmlFile2."" 2>&1";

data _null_;
infile oscmd;
input;
put _infile_;
run;
  
/********** Creating the map file ***************/

filename datafile "&file_path./&xmlFile2.";
filename mapfile "&file_path./&xmlMap2.";

libname datafile xmlv2 xmlmap=mapfile automap=replace;



proc copy in=datafile out=work;
run;

%goto exit;

The errors appear when the PROC COPY step is executed.
12 REPLIES
ChrisNZ
Tourmaline | Level 20

The message is clear:

ERROR: XML parsing error. Please verify that the XML content is well-formed.
The file is not as expected.

 

Did you look at the data near line 160740, column 109?

alepage
Barite | Level 11
Yes, I have looked at obs 160740, but I don't think there is a column 109. I am able to open the file in Excel and I don't see anything unusual. Is there an option or some other way to solve this issue?
ChrisNZ
Tourmaline | Level 20

If the XML file is malformed, you need to tell the provider to correct it.

 

The other option is to correct it manually (by hand or by program, add a column or remove the record), but you shouldn't have to do that. Fix the process rather than the error.

Tom
Super User

@alepage wrote:
Yes, I have looked at obs 160740, but I don't think there is a column 109. I am able to open the file in Excel and I don't see anything unusual. Is there an option or some other way to solve this issue?

Not sure how opening a TEXT file with Excel is going to help you look at what is in line 160,740.

 

Try looking at the file in a text editor.  Or just use a SAS data step and see if you see anything strange in column 109 on that line.

data _null_;
  infile datafile firstobs=160739 obs=160741 ;
  input;
  list;
run;

Is your SAS session running with ENCODING option set to UTF-8?

If not, try running the same program in a SAS session that is.

alepage
Barite | Level 11

Tom,

 

You were right about using Notepad++ instead of Excel. Here is what I have on lines 160739 and 160740:

 

<QID2_TEXT>La dame a été très patience et gentille et que ça fait longtemps que je suis assurée avec vous.
Si c’était possible, je me demandais s’il n’y aurais pas possibilité de diminuer le coût de l’assurance? 🙂</QID2_TEXT>

 

First, we can see that there is an emoji in the XML file. How can we handle that?

Second, as you can see, the language used is French and we have accented characters. Which encoding is more appropriate for that: UTF-8 or another one?

 

Tom
Super User

Running your SAS session with ENCODING='UTF-8' should work.

Run this line to see what encoding you are using:

%put %sysfunc(getoption(encoding));

The system encoding is set when the SAS session starts.  So if your SAS session is not using it now you will need to start SAS in a different way to get it set.

 

The only reason it would not work is if the actual byte string in the file is not really a valid UTF-8 character.

 

If running with UTF-8 does not work then you should add an extra step in your process to preprocess the file (using encoding=ANY or RECFM=F so that no transcoding is attempted) and replace that set of bytes with something that is valid.
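
One possible shape for such a preprocessing step is sketched below. It copies the file one byte at a time and replaces every 4-byte UTF-8 sequence (the range where emoji live) with a single '?'. Note that it uses RECFM=N (a plain byte stream) rather than the RECFM=F mentioned above, so no record length has to be chosen. The filerefs `rawxml` and `cleanxml` and the `clean_` file name prefix are hypothetical names introduced for illustration, and the sketch assumes the rest of the file is valid UTF-8. It is untested against the actual survey file and will be slow on large files.

```sas
/* Hypothetical filerefs: raw input and cleaned output, both as byte streams */
filename rawxml   "&file_path./&xmlFile2." recfm=n;
filename cleanxml "&file_path./clean_&xmlFile2." recfm=n;

data _null_;
  retain skip 0;                       /* continuation bytes still to drop */
  infile rawxml;
  file cleanxml;
  input b $char1.;                     /* one byte per iteration, no transcoding */
  code = rank(b);                      /* byte value on ASCII-based hosts */
  if skip > 0 then do;
    if 128 <= code <= 191 then do;     /* continuation byte 10xxxxxx */
      skip = skip - 1;
      return;                          /* drop it */
    end;
    skip = 0;                          /* malformed sequence: resume copying */
  end;
  if code >= 240 then do;              /* lead byte 11110xxx of a 4-byte sequence */
    skip = 3;
    put '?';                           /* one placeholder for the whole sequence */
    return;
  end;
  put b $char1.;                       /* every other byte is copied unchanged */
run;
```

If this route is taken, the `datafile` fileref would then need to point at the cleaned copy before the LIBNAME statement runs. Accented French characters are 2-byte sequences in UTF-8 and pass through untouched, but the text is still UTF-8, so the session or the fileref still has to read it as such.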

 

So you need to know the bytes that are in the file at the place where Notepad++ is showing the smiley face. That is hard to tell, since you posted the value as a glyph instead of hex codes. And since you pasted it into the body of your message, the forum editor has replaced it with the HTML code for the smiling face.

 

Did you try running the data step to see what SAS reads?  Can you see the hexcode(s) for that character?  If the LIST command did not display the hex codes then try reading the line into a character variable and printing it with $HEX format.
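
A rough sketch of that last suggestion follows. It reads only line 160740 without transcoding and prints a window of bytes around column 109 in hex. It assumes the `datafile` fileref from the original post is still assigned. The window (80 bytes starting at byte 80) is an arbitrary illustrative choice, because the column reported by the XML parser may be counted in characters rather than bytes. If the character really is the emoji U+1F642 stored as valid UTF-8, it should show up as the byte sequence F09F9982.

```sas
data _null_;
  length window $80;
  /* encoding='any': pass the bytes through as-is, no transcoding attempted */
  infile datafile encoding='any' lrecl=32767 truncover
         firstobs=160740 obs=160740;
  input line $char32767.;
  /* SUBSTR counts bytes, so this is a byte window, not a character window */
  window = substr(line, 80, 80);
  put window $hex160.;
run;
```

Trailing blanks in the window print as 20, so a short line simply ends in a run of 20s.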

 

 

Ksharp
Super User
Make sure the encoding of the XML file is the same as that of your SAS session.
Open the XML file in Notepad++ and check its encoding, then specify the right encoding in your code, like this:

filename datafile "&file_path./&xmlFile2." encoding='utf8' ;
alepage
Barite | Level 11

Hello,

 

I used Notepad++ to look at the XML file and found that it contains emoji.

Example:

 

<QID2_TEXT>La dame a été très patience et gentille et que ça fait longtemps que je suis assurée avec vous.
Si c’était possible, je me demandais s’il n’y aurais pas possibilité de diminuer le coût de l’assurance? 🙂</QID2_TEXT>

 

I believe the mapping step is failing because of those emoji in the XML file.

 

/********** Creating the map file ***************/

filename datafile "&file_path./&xmlFile2.";
filename mapfile "&file_path./&xmlMap2.";

libname datafile xmlv2 xmlmap=mapfile automap=replace;



proc copy in=datafile out=work;
run;

How can we solve this issue? How can we remove the emoji from the XML file?

Ksharp
Super User
First, check the encoding of the XML file in Notepad++.

If it is UTF-8, try the ENCODING= option I mentioned before.
It would also be better to change your SAS session encoding to UTF-8.

filename datafile "&file_path./&xmlFile2." encoding='utf8' ;
filename mapfile "&file_path./&xmlMap2." encoding='utf8' ;

libname datafile xmlv2 xmlmap=mapfile automap=replace;



proc copy in=datafile out=work noclone;
run;
alepage
Barite | Level 11
filename datafile "&file_path2./&xmlFile3." encoding='utf-8';
filename mapfile "&file_path2./&xmlMap2." encoding='utf-8';

libname datafile xmlv2 xmlmap=mapfile automap=replace;

proc copy in=datafile out=work noclone;
run;

ERROR: The creation of the XML Mapper file failed.
ERROR: Error in the LIBNAME statement.
ERROR: Libref DATAFILE is not assigned.

Ksharp
Super User
What if you remove the ENCODING= option from the XML map fileref?

filename datafile "&file_path2./&xmlFile3." encoding='utf-8';
filename mapfile "&file_path2./&xmlMap2." ;

libname datafile xmlv2 xmlmap=mapfile automap=replace;

proc copy in=datafile out=work noclone;
run;
alepage
Barite | Level 11

ERROR: Some code points did not transcode.
occurred at or near line 160740, column 109
ERROR: XML parsing error. Please verify that the XML content is well-formed.
ERROR: File WORK.Response.DATA has not been saved because copy could not be completed.
ERROR: Some code points did not transcode.
occurred at or near line 160740, column 109
ERROR: XML parsing error. Please verify that the XML content is well-formed.
ERROR: File WORK.Responses.DATA has not been saved because copy could not be completed.

 

Those errors are due to the presence of emoji in the XML file.
