I've tried the XML Map (XMLV2) but sadly don't have the XML mapper installed.
What I've tried:
filename sample "OPEN_DATA_1.xml";
filename mapfile "test.map";
libname sample xmlv2 xmlmap=mapfile automap=replace;
proc copy in=sample out=work;
run;
And the log:
1200 filename sample "OPEN_DATA_1.xml";
1201 filename mapfile "test.map";
1202 libname sample xmlv2 xmlmap=mapfile automap=replace;
ERROR: The creation of the XML Mapper file failed.
ERROR: Error in the LIBNAME statement.
1203
1204 proc copy in=sample out=work;
1205 run;
ERROR: Libref SAMPLE is not assigned.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE COPY used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
NOTE: Parsing with high validation.
ERROR: Unable to parse the specified file:
file:OPEN_DATA_1.xml
Exception class: java.lang.OutOfMemoryError
NOTE: The SAS System stopped processing this step because of
errors.
Any help would be very much appreciated in reading this file. I've attached the file as well but it can be downloaded from here:
http://open.canada.ca/data/en/dataset/0032ce54-c5dd-4b66-99a0-320a7b5e99f2
I'd be fine with scraping the API as well, which returns JSON.
Sadly I'm on SAS 9.4M3 which doesn't have the JSON lib yet.
Note that I've changed the extension from XML to TXT and zipped it to allow for the upload to occur.
After a week (years really) of bloody battle with the XML libname, i am bailing with trying to get it to read/write complex XML.
I would suggest using an alternate tool for the XML/JSON such as C# (my choice)/PowerShell/Python. Spend time getting the data formed then bring it into SAS.
Your alternate is to parse it. Poor alternative when there are engines that can handle it.
Could you use SAS XML Mapper to get the xml map and import it again ?
Could post it as JSON file. Maybe I could write some data step code to get it. And also you can use SAS University Edition to parse JSON .
I can't use SAS UE because it's for work 😞
@AlanC I did parse it manually, it took two hours AFTER I spend about 4 fighting with R/SAS and getting this file in...I think I'll try Python instead now. I still want it automated because I have to read about 39 of these every week....
@Ksharp The JSON is via an API, here's the call:
https://www.ic.gc.ca/app/scr/cc/CorporationsCanada/api/corporations/1007.json?lang=eng
It will return a JSON file.
@AlanC No install privileges on these machines. I'll give it a shot though, thanks for the suggestion!
Reeza,
I wrote a C# program that parses all of these XML files and turns them into tab-delimited files with separate files per XML file. I have zipped it up and put it here. Be patient if you run it because the XML is very layered so the parse takes time:
https://1drv.ms/u/s!Ajc9gcr7VcqLo68J3nueWo-b81vQlg
The exe takes 2 args at the command-line:
The source directory where the XML is located and the ouput location for the unwound files. here is an example:
"X:\temp\XMLTest" "x:\temp\XMLTest\Out"
You will find the exe in the bin directory under debug.
To provide a sense, it took me around 20 mins to code this. I would suggest if you want further control, you load the data into an XDocument and then XPath what you need out. I am happy to help you do the coding if desired. I will be onsite this week so will have limited time but will respond in the evening if this is the direction you wish to go. Personally, I would suggest writing these to a SQL Express database instead of delimited but delimited illustrates the issue.
Good luck. If you need help on the JSON side, ping me.
Alan
@AlanC wow! Thanks, so much! I'll have to look at this tomorrow and see what I can run on my desktop, looking forward to it and appreciate your help a ton!
Since the link may disappear at some point, I am uploading it here.
Also, this works with any XML: it is not specific to your source. It may not be exactly what is needed so let me know of any issues.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.