08-15-2011 09:46 AM
Page 6 in this paper shows how to pull and format XML data from Twitter.
The XML LIBNAME engine uses an XML map file to supply the rules for transforming different XML structures into relational data tables.
As I don't have the ability to create an XML map file, is it possible to just pull the raw XML data into a data set so I can parse it using data steps?
Thanks in advance for any help.
08-15-2011 11:44 AM
And, an XML map file is just an ASCII text file that uses a particular set of XML tags to define how a second XML file should be read into a SAS data set using the XML LIBNAME engine.
So, if you have a simple enough file and a straightforward map, you could just type your XML map file into Notepad. That said, the XML Mapper does make the whole process easier by generating all the map XML for you. The documentation for the SXLE (SAS XML LIBNAME Engine) has some examples of what the .map file should look like.
But, as Art pointed out, you can freely download the XML Mapper. Once it generates the map file, you still have to use that map file yourself to read your XML into SAS format. The XML Mapper doesn't actually submit any code for you -- all it does is give you a GUI for generating the map file, and it very nicely generates some sample invocation code for you, too.
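To make the "type it into Notepad" route concrete, here is a minimal sketch of a hand-written map and the LIBNAME statement that uses it. Everything specific in it is an assumption for illustration: the file paths, the table name TWEETS, and the /statuses/status/text element layout are hypothetical and would need to match your actual XML. The map is written from a DATA step only so the example stays in one program; typing the same lines into a text editor works just as well.

```sas
/* Hypothetical location for the map file */
filename twmap 'C:\temp\tweets.map';

/* Write a minimal XML map by hand. The XPath values assume an XML file */
/* shaped like <statuses><status><text>...</text></status></statuses>.  */
data _null_;
  file twmap;
  put '<?xml version="1.0" encoding="UTF-8"?>';
  put '<SXLEMAP version="1.2" name="TWEETS">';
  put '  <TABLE name="TWEETS">';
  put '    <TABLE-PATH syntax="XPath">/statuses/status</TABLE-PATH>';
  put '    <COLUMN name="text">';
  put '      <PATH syntax="XPath">/statuses/status/text</PATH>';
  put '      <TYPE>character</TYPE>';
  put '      <DATATYPE>string</DATATYPE>';
  put '      <LENGTH>200</LENGTH>';
  put '    </COLUMN>';
  put '  </TABLE>';
  put '</SXLEMAP>';
run;

/* Point the XML LIBNAME engine at the (hypothetical) XML file and the map */
libname tweets xml 'C:\temp\tweets.xml' xmlmap=twmap access=readonly;

/* Each <status> element becomes one row of TWEETS.TWEETS */
proc print data=tweets.tweets;
run;
```

The sample code that the XML Mapper generates follows the same pattern, so comparing it against a hand-written map like this is a reasonable way to learn the tags.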
08-15-2011 07:39 PM
And just to round out a complete answer to the original question, an XML file is just a specially structured text file, so yes - you could read it in a DATA step using INFILE and INPUT statements. Some Perl regular expressions and the SAS PRX functions would help you parse your data out from among the XML tags. But that sure would be a lot of extra work! As Cynthia and Art pointed out, the SAS XML Mapper is free, automatically generates the XML map file, and even provides sample SAS code for accessing the XML data. With LIBNAME access, reading XML data is just as easy as reading a regular SAS data set.
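If you do want to try the DATA step route, something along these lines is a rough starting point. It is only a sketch under some loud assumptions: the file path is hypothetical, the element name <text> is a guess at what the feed contains, and each element is assumed to open and close on the same physical line (real XML makes no such promise).

```sas
data tweets;
  /* Hypothetical local copy of the XML file */
  infile 'C:\temp\tweets.xml' lrecl=32767 truncover;
  length line $32767 text $200;
  retain re;

  /* Read each physical line of the file as one long character value */
  input line $char32767.;

  /* Compile the Perl regular expression once, on the first iteration */
  if _n_ = 1 then re = prxparse('/<text>(.*?)<\/text>/');

  /* When a line contains <text>...</text>, keep what sits between the tags */
  if prxmatch(re, line) then do;
    text = prxposn(re, 1, line);
    output;
  end;

  keep text;
run;
```

Elements that span several lines, nested tags, and entity references such as &amp; are all left unhandled here, which is a fair illustration of why the map-based approach is less work.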
08-19-2011 09:06 AM
Thanks for the reply, Art. I didn't realise the XML Mapper was free!
Would any of you know whether this mapper works with a file in JSON (JavaScript Object Notation) format, which is similar to XML?
I noticed Cynthia contributed to a thread on parsing a JSON file when originally looking into my problem.
SASJedi, if I did want to pull the raw data, would it just be a case of putting the URL into an INFILE statement, like:
INFILE 'http://.....' ;
The reason I ask is that I want to learn the ins and outs of the SAS functions that deal with reading and editing character strings (TRIM, SCAN, INDEX, etc.). It would be nice to see what I can come up with compared to the output produced by the XML map. I realise it's a lot of extra work, but I figured it would be as good a way as any to learn!
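For reference, one documented way to read from a web address is to define a fileref with the URL access method on a FILENAME statement and then name that fileref on INFILE. The sketch below takes that route and uses INDEX and SUBSTR rather than regular expressions. The fileref name twfeed and the <text> element are assumptions for illustration, and the URL is left elided exactly as in the post above, so it has to be replaced with the real feed address before this will run.

```sas
/* URL access method: the fileref stands in for the web resource.      */
/* The address is elided as in the original post; fill in the real one. */
filename twfeed url 'http://.....';

data raw_tweets;
  infile twfeed lrecl=32767 truncover;
  length line $32767 text $200;

  /* Pull each line of the response into one character variable */
  input line $char32767.;

  /* Locate the opening and closing tags (hypothetical element name) */
  start = index(line, '<text>');
  stop  = index(line, '</text>');

  /* '<text>' is 6 characters long, so the content begins at start + 6 */
  if start > 0 and stop > start + 6 then do;
    text = substr(line, start + 6, stop - start - 6);
    output;
  end;

  keep text;
run;
```

Like any line-by-line approach, this assumes the opening and closing tags sit on the same line and only picks up the first <text> element on each line.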
Thanks to you all for your help with this.
08-19-2011 10:58 AM
You might want to ask Tech Support. I thought that the XML Mapper looked for a valid XML Processing instruction at the top of the XML file that you were feeding into the mapper. If I go out to JSON.ORG, they are touting JSON as an "alternative" to XML -- http://www.json.org/xml.html
So my guess is that a JSON file won't work with the XML Mapper. The other issue that you might want to check with Tech Support is whether you can read or retrieve a JSON file using the HTTP protocol on the INFILE statement. Or, Tech Support may have some other ideas of how to read the JSON file without using the XML Mapper. The key to using the XML Mapper is that you have XML to read with it -- since JSON bills itself as NOT XML -- I just don't see how that's going to help you. If you are working with the BI Platform, it looks like the BI Web Services do have some interaction with JSON: http://support.sas.com/documentation/cdl/en/wbsvcdg/62759/HTML/default/viewer.htm#n0uxdl0ugxduw6n1gd... To me that indicates you ought to work with Tech Support before you reinvent the wheel.