BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
djbateman
Lapis Lazuli | Level 10

I have two data sets: one with lab test results and one with lab normal values (displayed below are very small subsets for one patient).

Lab Test Results:

PATNOLAB_IDTSTTYPCDORDSEQNOTSTCDCOLL_DTRESULT
37280007-001Chemistry1Sodium7-May-12139
37280007-001Chemistry1Sodium21-May-12139
37280007-001Chemistry1Sodium1-Jun-12137
37280007-001Chemistry1Potassium7-May-123.5
37280007-001Chemistry1Potassium21-May-123.0
37280007-001Chemistry1Potassium1-Jun-123.3

Lab Normal Values:

LAB_IDTSTTYPCDORDSEQNOTSTCDLABNAMELNDATLOW_LIMUP_LIMUNITS
0007-001Chemistry1SodiumUniversity Medical Center18-Feb-09135145mMol/L
0007-001Chemistry1SodiumUAZCC-North15-May-12130150mMol/L
0007-001Chemistry2PotassiumUniversity Medical Center18-Feb-093.55.5mMol/L
0007-001Chemistry2PotassiumUAZCC-North15-May-123.85.9mMol/L

I am merging on LAB_ID, TSTTYPCD, ORDSEQNO, and TSTCD (in that order).  However, the new data set needs to matches up the lab results with the lab normal values based on dates (COLL_DT for lab test results and LNDAT for lab normal values).  The lab normal values line up with a lab test based on the last LNDAT before COLL_DT.  I cannot simply add dates into the merging groups because they are not identical values.  I hope this makes sense.

The new table should look like this:

PATNOLAB_IDTSTTYPCDORDSEQNOTSTCDCOLL_DTRESULTLABNAMELNDATLOW_LIMUP_LIMUNITS
37280007-001Chemistry1Sodium7-May-12139University Medical Center18-Feb-09135145mMol/L
37280007-001Chemistry1Sodium21-May-12139UAZCC-North15-May-12130150mMol/L
37280007-001Chemistry1Sodium1-Jun-12137UAZCC-North15-May-12130150mMol/L
37280007-001Chemistry2Potassium7-May-123.5University Medical Center18-Feb-093.55.5mMol/L
37280007-001Chemistry2Potassium21-May-123.0UAZCC-North15-May-123.85.9mMol/L
37280007-001Chemistry2Potassium1-Jun-123.3UAZCC-North15-May-123.85.9mMol/L

Any suggestions on how to approach this?

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Someone more proficient with proc sql, than I am, will have to show you how this REALLY ought to be done.  However, I think that the following does what you want:

proc sql;

  CREATE TABLE first AS

    SELECT p.*, l.LABNAME, l.LNDAT, l.LOW_LIM,l.UP_LIM,l.UNITS

      FROM patients p inner join lab l

         on p.LAB_ID=l.LAB_ID and

               p.TSTTYPCD=l.TSTTYPCD and

               p.ORDSEQNO=l.ORDSEQNO and

               p.TSTCD=l.TSTCD and

            p.coll_dt gt l.lndat

  ;

  create table want as

    select *

      from first

        group by LAB_ID,TSTTYPCD,ORDSEQNO,TSTCD,COLL_DT

          having lndat=max(lndat)

  ;

quit;

View solution in original post

7 REPLIES 7
art297
Opal | Level 21

Since you are trying to accomplish a many-to-many merge, I would think that proc sql will be the easiest way.  What are your criteria for including records based on date (i.e., a range of how many days before and after)?

djbateman
Lapis Lazuli | Level 10

There is no number-of-days requirement.  I am just using the last lab normal values assessment before the lab test results assessment.  I will play around with SQL to see what I can come up with.

Just a side note, I believe I sat by you at the MWSUG in Minneapolis during the Monday night dinner.  We didn't get to speak though.  I meant to see if you spoke Russian practice my русский on you.

art297
Opal | Level 21

Yes we did but, no, I don't speak any Russian.  It was an excellent conference .. I presented four papers.

Back to your question, you might be able to get away with something as simple as:

proc sql;

  CREATE TABLE BOTH AS

    SELECT p.*, l.LABNAME, l.LNDAT, l.LOW_LIM,l.UP_LIM,l.UNITS

      FROM patients p, lab l

        WHERE  p.LAB_ID=l.LAB_ID and

               p.TSTTYPCD=l.TSTTYPCD and

               p.ORDSEQNO=l.ORDSEQNO and

               p.TSTCD=l.TSTCD

;

quit;

and, if you want to include a date range comparison, that is fairly simple and straight forward in sql.

djbateman
Lapis Lazuli | Level 10

This almost gets me there.  It double/triple counts everything.  If there is only one LNDAT for a given LAB_ID, then it works great.  But if there are two LNDAT's, then it double counts; if there are three LNDAT's, then it triple counts, etc.  I tried to simplify by deleting all records where LNDAT<COLL_DT.

That helped quite a bit, but for the example that I gave, LNDAT of 18-Feb-09 and 15-May-12 are both less than COLL_DT of 21-May-12 or 1-Jun-12.  Therefore, it captured both sets of lab normal values and double counted.

art297
Opal | Level 21

Someone more proficient with proc sql, than I am, will have to show you how this REALLY ought to be done.  However, I think that the following does what you want:

proc sql;

  CREATE TABLE first AS

    SELECT p.*, l.LABNAME, l.LNDAT, l.LOW_LIM,l.UP_LIM,l.UNITS

      FROM patients p inner join lab l

         on p.LAB_ID=l.LAB_ID and

               p.TSTTYPCD=l.TSTTYPCD and

               p.ORDSEQNO=l.ORDSEQNO and

               p.TSTCD=l.TSTCD and

            p.coll_dt gt l.lndat

  ;

  create table want as

    select *

      from first

        group by LAB_ID,TSTTYPCD,ORDSEQNO,TSTCD,COLL_DT

          having lndat=max(lndat)

  ;

quit;

Haikuo
Onyx | Level 15

Art, I didn't test it, but I see no reason you can't combine those two steps together?

proc sql;

  CREATE TABLE first AS

    SELECT p.*, l.LABNAME, l.LNDAT, l.LOW_LIM,l.UP_LIM,l.UNITS

      FROM patients p inner join lab l

         on p.LAB_ID=l.LAB_ID and

               p.TSTTYPCD=l.TSTTYPCD and

               p.ORDSEQNO=l.ORDSEQNO and

               p.TSTCD=l.TSTCD and

            p.coll_dt gt l.lndat

        group by p.LAB_ID,p.TSTTYPCD,p.ORDSEQNO,p.TSTCD,p.COLL_DT

          having lndat=max(lndat)

  ;

quit;

Haikuo

art297
Opal | Level 21

  Absolutely no reason and that is precisely why I suggested it.  Nice job!

  I changed your last 3 patient records to reflect an ORDSEQNO of 2 in order to achieve the results you were expecting.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1073 views
  • 3 likes
  • 3 in conversation