10-05-2017 06:26 PM
I have a lot of XML files that contain "hidden" credit bureau data that I want to extract. Each XML file represents one of our customers and contains info like address info/employer info/deal structure info/etc. The credit bureau data is "hidden" in one of the attributes of one of the root elements. This data is extremely useful to my company and I want to be able to extract it.
I tried importing the entire XML file (using XML Mapper) into SAS, but the credit bureau data gets truncated because it's much longer than the maximum number of characters allowed. So I can't even read the data into SAS to try to parse that field out later on.
Can someone give me some ideas on how to solve this issue? Is there a way to read this field directly into SAS without the other surrounding data? I'm not sure how to go about this. Below is an example of the data I'm working with. The raw_xml data (in bold) is the "hidden" data and is often over 100,000 characters long. I removed 99% of the data but wanted to give you an idea of the structure that I'm working with.
Hope this makes sense. Appreciate any help/advice.
I use Base SAS and xmlv2 to read in the data.
10-05-2017 06:49 PM
XML is just text. Read the file in as text, search for the start/end tags -> <creditBureau> </creditBureau>, and keep only the content between the two.
I think that's going to require reading the file with an INFILE statement (and the _INFILE_ automatic variable) instead of XML Mapper, though.
I would have thought the XML mapper would pull that out though, since it's not really hidden, it's between tags.
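The "search for the start/end tags and keep what's in between" idea can be sketched outside SAS as well. Below is a minimal Python sketch, assuming the data sits between literal <creditBureau> and </creditBureau> tags as described above. The function name `extract_between` and the sample string are made up for illustration and are not taken from the actual files.

```python
def extract_between(text, start_tag, end_tag):
    """Return the text between the first start_tag and the next end_tag, or None."""
    start = text.find(start_tag)
    if start == -1:
        return None  # start tag not present
    start += len(start_tag)
    end = text.find(end_tag, start)
    if end == -1:
        return None  # start tag found but never closed
    return text[start:end]


# Hypothetical sample; the real payload is far longer.
sample = "<customer><creditBureau>SCORE=700;TRADES=12</creditBureau></customer>"
print(extract_between(sample, "<creditBureau>", "</creditBureau>"))
```

Because this is plain string searching, the length of the payload is not limited by any column width.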
10-05-2017 11:02 PM
10-06-2017 04:53 PM
Can't I read in only the raw_xml attribute between the Reports tags?
I have 15,000 xml files. I can't manually do this.
That's why you write a program. Once you have it working for one file, you can figure out how to do it for 15,000. I don't recall seeing a 'manual' suggestion anywhere in the answers.
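As an illustration of "get it working for one, then loop", here is a rough Python sketch that pulls a single attribute value out of every .xml file in a folder and writes each value to its own text file. It rests on several assumptions that may not hold for the real data: the attribute is written exactly as raw_xml="..." with double quotes and no spaces around the equals sign, no other attribute name ends in raw_xml, and the files are UTF-8. The function names and folder arguments are hypothetical.

```python
from pathlib import Path
from xml.sax.saxutils import unescape


def extract_attribute(text, name):
    """Return the unescaped value of the first name="..." attribute, or None."""
    marker = name + '="'  # assumes double quotes and no spaces around '='
    start = text.find(marker)
    if start == -1:
        return None
    start += len(marker)
    # Inside a double-quoted attribute a literal quote must appear as &quot;,
    # so the next '"' closes the value.
    end = text.find('"', start)
    if end == -1:
        return None
    return unescape(text[start:end], {"&quot;": '"', "&apos;": "'"})


def extract_folder(in_dir, out_dir, name="raw_xml"):
    """Write each file's attribute value to <stem>.txt; return how many were written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for xml_path in sorted(Path(in_dir).glob("*.xml")):
        value = extract_attribute(xml_path.read_text(encoding="utf-8"), name)
        if value is not None:
            (out / (xml_path.stem + ".txt")).write_text(value, encoding="utf-8")
            count += 1
    return count
```

Calling something like `extract_folder("xml_in", "bureau_out")` (folder names made up) would process every file in one pass, and the return value gives a quick check of how many files actually contained the attribute.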