
XML engine renaming variables and creating empty variables

Hi All,

I have data in XML format, and normally I have no issues parsing the data into tables using the SAS XML engine. In a recent version of an XML file, I noticed that the variables "date_time0" and "date_time1" were created in the SAS dataset instead of a single "date_time" variable. There are no tags in the XML file corresponding to date_time0 or date_time1, only date_time. The date_time1 variable is blank, and the date_time0 variable contains the values from the date_time tag. When this has occurred previously, the culprit was duplicated tags: it appeared that the XML engine was adding 0 and 1 as suffixes to the names to make them unique. That is not the case here. I wrote some R code to scan the XML file for duplicated tags (an approach that has worked well in the past), and none were found.
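
For readers who want to run the same kind of check, here is a minimal sketch of a duplicated-tag scan. It is written in Python rather than R, and it assumes "duplicated" means the same child tag appearing more than once inside a single record element. The function name `find_duplicate_fields`, the record tag `visit`, and the sample document are all hypothetical, made up for illustration because the real file cannot be shared.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_duplicate_fields(xml_text, record_tag):
    """Return (record_index, child_tag, count) for each child tag that
    appears more than once inside a single record_tag element."""
    root = ET.fromstring(xml_text)
    duplicates = []
    # Walk every record element in document order.
    for index, record in enumerate(root.iter(record_tag)):
        # Count the direct children of this record by tag name.
        counts = Counter(child.tag for child in record)
        for tag, n in counts.items():
            if n > 1:
                duplicates.append((index, tag, n))
    return duplicates


# Hypothetical sample: the second record repeats the date_time tag.
sample = """<visits>
  <visit><subject>1</subject><date_time>2012-01-01T08:00</date_time></visit>
  <visit><subject>2</subject><date_time>2012-01-02T09:00</date_time><date_time/></visit>
</visits>"""

print(find_duplicate_fields(sample, "visit"))
```

A scan like this only looks at direct children of each record, so it would not report a tag name that is reused at a different nesting level.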

Interestingly, when I use the SAS XMLV2 engine instead of XML (or use the R XML package), the data is parsed correctly and only date_time is created (not date_time0 and date_time1). The problem with XMLV2 is that some of the data contain characters that cause it to halt, such as the less-than-or-equal-to or greater-than-or-equal-to symbols.

Questions:

  1. What could cause the XML engine to create the extra variables and add the suffixes when I use the XML engine? Is this some
  2. Is there a way to escape such characters using the XMLV2 engine?
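
One possible workaround for the second question, offered as an untested assumption rather than a known fix: pre-process the file outside SAS so that every non-ASCII character becomes an XML numeric character reference. The Python sketch below shows the idea on a made-up string; the function name `escape_non_ascii` and the sample element names are hypothetical. It assumes the text has already been read with the file's real encoding.

```python
import xml.etree.ElementTree as ET


def escape_non_ascii(xml_text):
    """Replace every non-ASCII character with a numeric character
    reference such as &#8804;, leaving ASCII markup untouched."""
    return xml_text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Hypothetical sample containing the two symbols mentioned above.
sample = "<result><note>value ≤ 5 or ≥ 10</note></result>"
escaped = escape_non_ascii(sample)
print(escaped)
# prints: <result><note>value &#8804; 5 or &#8805; 10</note></result>

# An XML parser resolves the references back to the original characters.
assert ET.fromstring(escaped).find("note").text == "value ≤ 5 or ≥ 10"
```

The output is pure ASCII, so it stays consistent with a UTF-8 XML declaration. Whether XMLV2 then reads the file without halting would still need to be confirmed.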

Apologies for not being able to upload a reproducible example: the data is part of a clinical trial, so I can't share the XML file. If there's any additional information I can provide, please let me know.

Any help greatly appreciated,

JFB (SAS 9.3 TS Level 1M0 x64_7PRO)
