
Efficiently comparing XML files generated by PROC METADATA



Let's start from the beginning: a few weeks ago I finished the first version of a task that compares ETL jobs stored on different metadata servers (dev, test, production, etc.).

Briefly, about the implementation: first I created an "IN" XML file that serves as the input for PROC METADATA.
This "IN" XML file simply issues a <GetMetadataObjects> request with the include-subtypes flag set (plus a few other flags such as OMI_TEMPLATES and OMI_XMLSELECT), so in short this file describes which metadata objects I want to extract.

After PROC METADATA runs with this input XML file, it returns an output XML file with all the information I need about the job (transformations, transformation options, mappings, etc.).

As I wrote above, PROC METADATA runs against two different metadata servers, so I end up with two output XML files, one from each server.

To compare these files I read them into SAS tables using the INFILE statement and then compare the tables. All I need is the fact of whether they are equal or different.

Obviously, if the jobs are the same, the XML files are also the same (except for the IDs, which differ between metadata servers, but that is not a problem: I just delete the IDs from the SAS table columns).
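
As a rough illustration of the ID-stripping idea (written in Python rather than SAS, purely as a sketch): it assumes the server-specific identifiers appear as Id="..." attributes in the output XML. The function names and the sample values are hypothetical, not taken from my real files.

```python
import re

# Assumption: server-specific identifiers are carried in Id="..." attributes.
# The leading \s+ keeps attributes such as ObjId="..." from being matched.
ID_ATTR = re.compile(r'\s+Id="[^"]*"')


def strip_ids(xml_text):
    """Return the XML text with every Id="..." attribute removed."""
    return ID_ATTR.sub("", xml_text)


def same_ignoring_ids(xml_a, xml_b):
    """True when the two documents are textually equal once IDs are removed."""
    return strip_ids(xml_a) == strip_ids(xml_b)


if __name__ == "__main__":
    dev = '<Job Id="A5AAAAAA.B1000001" Name="Load"/>'
    test = '<Job Id="A5BBBBBB.B1000009" Name="Load"/>'
    print(same_ignoring_ids(dev, test))
```

This is still a plain text comparison, so it stays sensitive to element order and to whitespace.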

But there are some cases where the ETLs are absolutely equal, yet the output XML files are different.

To reproduce this situation I just, for example, change the format of a column in one of the ETL's transformations, save the change, and then roll the change back.

After this rollback the ETLs are effectively the same, but the order of the transformations ("TransformationStep" tags) changes: the transformation that was changed (and then rolled back) is now the last of the "transformationSteps" tags.

All the metadata objects that I need to compare are actually under the "TransformationStep" metadata object, so obviously I could split each TransformationStep tag into a separate XML file and then compare all these smaller files, but I think a better solution probably exists.
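
To make the idea of comparing per-step pieces concrete, here is a minimal sketch in Python (not SAS, and not what I currently run). It treats each TransformationStep element as one unit, drops the Id attributes, and compares the two documents as unordered collections of steps. The element name comes from my files; the Id attribute name and all function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def canonical(node, id_attr):
    """Reduce an element to a hashable tuple, ignoring the ID attribute."""
    attrs = tuple(sorted((k, v) for k, v in node.attrib.items() if k != id_attr))
    text = (node.text or "").strip()
    # The order of children inside one step is kept; only step order is ignored.
    children = tuple(canonical(child, id_attr) for child in node)
    return (node.tag, attrs, text, children)


def step_multiset(xml_text, step_tag="TransformationStep", id_attr="Id"):
    """Count the canonical form of every step element in the document."""
    root = ET.fromstring(xml_text)
    return Counter(canonical(step, id_attr) for step in root.iter(step_tag))


def same_steps(xml_a, xml_b):
    """True when both documents hold the same steps, in any order."""
    return step_multiset(xml_a) == step_multiset(xml_b)


if __name__ == "__main__":
    dev = ('<Job Id="A1"><TransformationStep Id="S1" Name="Extract"/>'
           '<TransformationStep Id="S2" Name="Load"/></Job>')
    test = ('<Job Id="B9"><TransformationStep Id="T7" Name="Load"/>'
            '<TransformationStep Id="T8" Name="Extract"/></Job>')
    print(same_steps(dev, test))
```

A Counter is used instead of a set so that two identical steps in one job are not collapsed into one. Attributes that reference other objects by ID would need the same treatment as Id; this sketch does not handle them.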

To query these source XML files I use the XML map engine with XPath. Since SAS provides tools that can read and query these XML files, maybe there is also some way of comparing them "smartly and quickly" that handles the issue with the changed order of tags...

I use SAS 9.1.3, so some of the newest features aren't available in my case.

