DATA Step, Macro, Functions and more

Import an XML file - complex structure

Reply
Super User
Posts: 23,247

Import an XML file - complex structure

I've tried the XML Map (XMLV2) but sadly don't have the XML mapper installed.

 

What I've tried:

 

filename sample "OPEN_DATA_1.xml";
filename mapfile "test.map";
libname sample xmlv2 xmlmap=mapfile automap=replace;

proc copy in=sample out=work;
run;

And the log:

 

1200 filename sample "OPEN_DATA_1.xml";
1201 filename mapfile "test.map";
1202 libname sample xmlv2 xmlmap=mapfile automap=replace;
ERROR: The creation of the XML Mapper file failed.
ERROR: Error in the LIBNAME statement.
1203
1204 proc copy in=sample out=work;
1205 run;

ERROR: Libref SAMPLE is not assigned.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE COPY used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

NOTE: Parsing with high validation.
ERROR: Unable to parse the specified file:
fileSmiley SurprisedPEN_DATA_1.xml
Exception class: java.lang.OutOfMemoryError
NOTE: The SAS System stopped processing this step because of
errors.

 

Any help would be very much appreciated in reading this file. I've attached the file as well but it can be downloaded from here:

http://open.canada.ca/data/en/dataset/0032ce54-c5dd-4b66-99a0-320a7b5e99f2

 

I'd be fine with scraping the API as well, which returns JSON.

 

Sadly I'm on SAS 9.4M3 which doesn't have the JSON lib yet.

 

 

 Note that I've changed the extension from XML to TXT and zipped it to allow for the upload to occur.

 

Attachment
Regular Contributor
Posts: 150

Re: Import an XML file - complex structure

After a week (years really) of bloody battle with the XML libname, i am bailing with trying to get it to read/write complex XML.

 

I would suggest using an alternate tool for the XML/JSON such as C# (my choice)/PowerShell/Python. Spend time getting the data formed then bring it into SAS.

 

Your alternate is to parse it. Poor alternative when there are engines that can handle it.

 

Super User
Posts: 10,686

Re: Import an XML file - complex structure

Could you use SAS XML Mapper to get the xml map and import it again ?


Super User
Posts: 10,686

Re: Import an XML file - complex structure

Could post it as JSON file. Maybe I could write some data step code to get it.

And also you can use SAS University Edition to parse JSON .

Super User
Posts: 23,247

Re: Import an XML file - complex structure

I can't use SAS UE because it's for work Smiley Sad

 

@AlanC I did parse it manually, it took two hours AFTER I spend about 4 fighting with R/SAS and getting this file in...I think I'll try Python instead now. I still want it automated because I have to read about 39 of these every week....

 

@Ksharp The JSON is via an API, here's the call:

https://www.ic.gc.ca/app/scr/cc/CorporationsCanada/api/corporations/1007.json?lang=eng

It will return a JSON file. 

 

 

Regular Contributor
Posts: 150

Re: Import an XML file - complex structure

What O/S is your laptop? I can make a run at it later this weekend (flying today) but would prefer c# to do it. I can make an exercise standalone but maintenance would be on you
Regular Contributor
Posts: 150

Re: Import an XML file - complex structure

Keep in mind also, the XML Mapper is a free download from SAS. I am not even sure SAS is required on the machine.
Super User
Posts: 23,247

Re: Import an XML file - complex structure

@AlanC No install privileges on these machines. I'll give it a shot though, thanks for the suggestion! 

Regular Contributor
Posts: 150

Re: Import an XML file - complex structure

Reeza,

 

I wrote a C# program that parses all of these XML files and turns them into tab-delimited files with separate files per XML file. I have zipped it up and put it here. Be patient if you run it because the XML is very layered so the parse takes time:

 

https://1drv.ms/u/s!Ajc9gcr7VcqLo68J3nueWo-b81vQlg

 

The exe takes 2 args at the command-line:

 

The source directory where the XML is located and the ouput location for the unwound files. here is an example:

 

"X:\temp\XMLTest" "x:\temp\XMLTest\Out"

 

You will find the exe in the bin directory under debug. 

 

To provide a sense, it took me around 20 mins to code this. I would suggest if you want further control, you load the data into an XDocument and then XPath what you need out. I am happy to help you do the coding if desired. I will be onsite this week so will have limited time but will respond in the evening if this is the direction you wish to go. Personally, I would suggest writing these to a SQL Express database instead of delimited but delimited illustrates the issue.

 

Good luck. If you need help on the JSON side, ping me.

 

Alan

Super User
Posts: 23,247

Re: Import an XML file - complex structure

@AlanC wow! Thanks, so much! I'll have to look at this tomorrow and see what I can run on my desktop, looking forward to it and appreciate your help a ton!

Regular Contributor
Posts: 150

Re: Import an XML file - complex structure

Since the link may disappear at some point, I am uploading it here.

 

Also, this works with any XML: it is not specific to your source. It may not be exactly what is needed so let me know of any issues.

Attachment
Ask a Question
Discussion stats
  • 10 replies
  • 295 views
  • 3 likes
  • 3 in conversation