BookmarkSubscribeRSS Feed
Jinpa88
Calcite | Level 5

Can anyone recommend a way to get a dataset of claims and diagnosis, into rows with no more than 10 diagnosis per row? 

The number of diagnosis per claim varies and I never know what the max number of diags on a claim will be.

The code below will create a simple example of the raw dataset I'm working with.  I know it has to transposed (in some way) but I'm not sure about the best way to get it to transpose by groups of 10 diags, and be able to do that with various number of diags per claim.

Any help is appreciated, thanks.

data diags;

input claimid $12. diag $;

datalines;

097389172812     4500

097389172812     13

097389172812     4587

097389172812     458

097389172812     9678

097389172812     128

097389172812     986

854356214255     7895

854356214255     45

854356214255     9635

854356214255     574

854356214255     5913

854356214255     974

854356214255     2685

854356214255     857

854356214255     356

854356214255     851

854356214255     5214

854356214255     69

854356214255     111

854356214255     999

854356214255     5857

854356214255     674

854356214255     853

854356214255     276

854356214255     4975

854356214255     363

854356214255     661

854356214255     454

854356214255     751

854356214255     902

854356214255     024

854356214255     422

854356214255     190

006285475124     0052

006285475124     5264

006285475124     886

006285475124     55

006285475124     941

854356214255     504

854356214255     450

854356214255     5020

854356214255     7164

854356214255     251

854356214255     994

854356214255     574

854356214255     6637

854356214255     905

854356214255     843

854356214255     4674

854356214255     8012

854356214255     832

;

run;

3 REPLIES 3
Tom
Super User Tom
Super User

Your data is not sorted by CLAIMID.  You should either sort it or add the NOTSORTED option to the BY statement in the code below.

data want ;

  if 0 then set diags;

  seq+1;

  array d $5 diag1-diag10;

  do _n_=1 to 10 until (last.claimid);

    set diags;

    by claimid ;

    if first.claimid then seq=1;

    d(_n_)=diag;

  end;

  drop diag;

run;

Jinpa88
Calcite | Level 5

Tom - absolutely perfect!  I suspected a do-loop was involved, just don't have the experience with them.

Thanks again.

C. Sargent

Howles
Quartz | Level 8

Tom's solution is fine if you really need the specified structure (no more than 10  DIAGs per obs. and possibly multiple obs. per CLAIMID). It's good for a compact dump as generated by

proc print data=want ;

by claimid notsorted ;

id claimid ;

run ;

But for any kind of subsequent processing you are likely better served by either the given structure (one CLAIMID/DIAG pair per obs.) or a complete transpose (one obs. per CLAIMID). You can get SAS to discover the maximum number of DIAG values to be accommodated. Try

proc transpose data=diags out=wide prefix=diag ;

by claimid notsorted ;

var diag ;

run ;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

New Learning Events in April

 

Join us for two new fee-based courses: Administrative Healthcare Data and SAS via Live Web Monday-Thursday, April 24-27 from 1:00 to 4:30 PM ET each day. And Administrative Healthcare Data and SAS: Hands-On Programming Workshop via Live Web on Friday, April 28 from 9:00 AM to 5:00 PM ET.

LEARN MORE

Discussion stats
  • 3 replies
  • 1053 views
  • 0 likes
  • 3 in conversation