Marty
Calcite | Level 5

My company is a health care data integrator. For an upcoming engagement, we are expecting to receive electronic health record data files in HL7 CDA XML format, from multiple sources. 

Hoping to find someone with experience loading this kind of data into SAS datasets, to get information on how you have done it, and whether you have had to work around files that have incomplete/incorrect XML content.

 

We work with Base SAS 9.4 (TS1M2) in a Linux environment, and run programs in batch mode.

 

3 REPLIES 3
noling
SAS Employee

I've worked with HL7s in XML and in the normal linear/rectangular text form, but with SAS only the normal text form. HL7 XMLs are nice, tall, rectangular XMLs, and I would think the SAS XML libname would get you pretty far.

 

One gotcha I've encountered is repeating segments, fields, and sub-components (or whatever they're called). SAS may be able to read all of these in nicely, but it won't necessarily know that if you have 3 repeating PID fields or sub-components, each one needs to be treated and cataloged separately.
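To make the repetition issue concrete, here is a minimal sketch in Python (not SAS, just to illustrate the idea). It assumes the default HL7 v2 delimiters (`|` for fields, `~` for repetitions, `^` for components), a non-MSH segment, and a made-up PID line; the function name `catalog_repetitions` is hypothetical.

```python
def catalog_repetitions(segment, field_no, rep_sep="~", comp_sep="^"):
    """Return one row per repetition: (segment_id, field_no, rep_no, components)."""
    fields = segment.split("|")
    seg_id = fields[0]
    # For non-MSH segments, fields[n] holds field n (fields[0] is the segment ID).
    raw = fields[field_no] if field_no < len(fields) else ""
    if raw == "":
        return []
    rows = []
    for rep_no, rep in enumerate(raw.split(rep_sep), start=1):
        # Keep the repetition number so each occurrence can be cataloged on its own.
        rows.append((seg_id, field_no, rep_no, rep.split(comp_sep)))
    return rows


# Made-up example: three identifiers repeated in the third PID field.
pid = "PID|1||12345^^^HOSP^MR~999-99-9999^^^SSA^SS~A77^^^CLINIC^PI||DOE^JANE"
rows = catalog_repetitions(pid, 3)
for row in rows:
    print(row)
```

Writing one row per repetition, keyed by segment, field number, and repetition number, is one way to keep the occurrences distinct once they land in a dataset.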

 

If you parse HL7s manually, I've learned that it's much easier to parse each segment by field and then by sub-component, rather than just going left to right through the segment. I.e. parse your first field, parse its first sub-component, parse its next sub-component, THEN parse your second field. I would recommend NOT trying to say "ok - characters 1-3 are the segment name, the first sub-component of the first field is x, the second sub-component of the first field is y, etc." in that order.
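The field-then-sub-component order can be sketched like this in Python (an illustration only, not SAS code). It assumes the default HL7 v2 delimiters (`|`, `^`, `&`), ignores repetitions and the special handling that MSH needs, and uses a made-up segment; `parse_segment` is a hypothetical name.

```python
def parse_segment(segment, field_sep="|", comp_sep="^", sub_sep="&"):
    """Split a segment into fields first, then split each field further."""
    fields = segment.split(field_sep)
    parsed = {"segment": fields[0], "fields": {}}
    for field_no, field in enumerate(fields[1:], start=1):
        components = []
        for comp in field.split(comp_sep):
            # Only after the field is isolated do we look inside it.
            components.append(comp.split(sub_sep))
        parsed["fields"][field_no] = components
    return parsed


# Made-up segment: field 3 has a component with three sub-components.
seg = "PID|1||12345^^^HOSP&1.2.3&ISO^MR||DOE^JANE^Q"
result = parse_segment(seg)
print(result["segment"])
print(result["fields"][3])
print(result["fields"][5])
```

Because each level is split on its own delimiter, an empty or missing component shifts nothing else, which is the main advantage over a positional left-to-right scan.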

 

Good luck!



noling
SAS Employee

Additionally, the XML should really be structured correctly since your source is likely the output of an EMR. 

 

You're receiving XMLs, but if you ever receive the data in the linear rectangular form, each segment should be on its own line of text. I think that's part of the HL7 standard, but don't quote me. I've seen lots of QA issues caused by many segments ending up on one line, with code then scanning for segment names followed by "|" to break up the lines. Ex: someone is named "Cupid", and that's the end of a field, so it's printed as "CUPID|<next field>". Because "CUPID|" contains "PID|", this could be incorrectly interpreted as the start of the PID segment, and cause issues as you would expect.
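Here is a small Python sketch of the "Cupid" problem (an illustration only, with a made-up message). It assumes segments are separated by a carriage return or newline and that a segment ID is three uppercase letters or digits at the very start of a line; `segment_ids` is a hypothetical helper.

```python
import re

# A segment ID only counts when it sits at the start of a line.
SEGMENT_START = re.compile(r"^[A-Z][A-Z0-9]{2}\|")


def segment_ids(message):
    """Return the segment IDs found at the start of each non-empty line."""
    ids = []
    for line in message.splitlines():  # splits on \r, \n and \r\n
        if SEGMENT_START.match(line):
            ids.append(line[:3])
    return ids


# Made-up message: the family name CUPID ends a field inside the PID segment.
message = "MSH|^~\\&|LAB|HOSP\rPID|1||12345||DOE^CUPID||19800101\rOBX|1|ST|NOTE||ok"

naive_hits = message.count("PID|")  # also counts the "PID|" inside "CUPID|"
anchored = segment_ids(message)

print(naive_hits)
print(anchored)
```

The naive substring count sees an extra "PID|", while matching only at line starts does not. If the segment breaks have already been lost upstream, anchoring like this is no longer possible and the file probably needs to be re-requested.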



Marty
Calcite | Level 5

The HL7 documentation in the Implementation Guide for CDA has a thorough description of templates of the types of info that may be in the files. Still looking into options for existing applications that will at least transform the XML into another format that can be used more directly, but I'm starting to work on design of code to parse the files to pick up data elements myself as a backup plan.
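As a rough idea of what such a backup parser might look like, here is a minimal Python sketch (not SAS, and not a full CDA reader). It assumes the standard CDA namespace `urn:hl7-org:v3` and the usual `recordTarget/patientRole/patient/name` path, uses a made-up document fragment, and treats a file that is not well-formed XML as something to set aside rather than repair; `read_patient_name` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

NS = {"cda": "urn:hl7-org:v3"}  # CDA elements live in the HL7 v3 namespace


def read_patient_name(xml_text):
    """Return given/family name, or None when the XML is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # incomplete or incorrect XML: flag the file for review
    patient = root.find("cda:recordTarget/cda:patientRole/cda:patient", NS)
    if patient is None:
        return {"given": None, "family": None}
    return {
        "given": patient.findtext("cda:name/cda:given", default=None, namespaces=NS),
        "family": patient.findtext("cda:name/cda:family", default=None, namespaces=NS),
    }


# Made-up fragment; real CDA documents carry far more structure.
good = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <recordTarget><patientRole><patient>
    <name><given>Jane</given><family>Doe</family></name>
  </patient></patientRole></recordTarget>
</ClinicalDocument>"""
broken = good.replace("</patient>", "")  # simulate a missing closing tag

print(read_patient_name(good))
print(read_patient_name(broken))
```

Separating "file is not well-formed" from "element is simply absent" makes it easier to report bad files back to the source while still loading the good ones.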

 

Thanks for the tips on parsing!
