Splitting Rows

Reply
New Contributor
Posts: 2

Splitting Rows

Can anyone recommend a way to get a dataset of claims and diagnosis, into rows with no more than 10 diagnosis per row? 

The number of diagnosis per claim varies and I never know what the max number of diags on a claim will be.

The code below will create a simple example of the raw dataset I'm working with.  I know it has to transposed (in some way) but I'm not sure about the best way to get it to transpose by groups of 10 diags, and be able to do that with various number of diags per claim.

Any help is appreciated, thanks.

data diags;

input claimid $12. diag $;

datalines;

097389172812     4500

097389172812     13

097389172812     4587

097389172812     458

097389172812     9678

097389172812     128

097389172812     986

854356214255     7895

854356214255     45

854356214255     9635

854356214255     574

854356214255     5913

854356214255     974

854356214255     2685

854356214255     857

854356214255     356

854356214255     851

854356214255     5214

854356214255     69

854356214255     111

854356214255     999

854356214255     5857

854356214255     674

854356214255     853

854356214255     276

854356214255     4975

854356214255     363

854356214255     661

854356214255     454

854356214255     751

854356214255     902

854356214255     024

854356214255     422

854356214255     190

006285475124     0052

006285475124     5264

006285475124     886

006285475124     55

006285475124     941

854356214255     504

854356214255     450

854356214255     5020

854356214255     7164

854356214255     251

854356214255     994

854356214255     574

854356214255     6637

854356214255     905

854356214255     843

854356214255     4674

854356214255     8012

854356214255     832

;

run;

Super User
Super User
Posts: 6,394

Re: Splitting Rows

Your data is not sorted by CLAIMID.  You should either sort it or add the NOTSORTED option to the BY statement in the code below.

data want ;

  if 0 then set diags;

  seq+1;

  array d $5 diag1-diag10;

  do _n_=1 to 10 until (last.claimid);

    set diags;

    by claimid ;

    if first.claimid then seq=1;

    d(_n_)=diag;

  end;

  drop diag;

run;

New Contributor
Posts: 2

Splitting Rows

Tom - absolutely perfect!  I suspected a do-loop was involved, just don't have the experience with them.

Thanks again.

C. Sargent

Regular Contributor
Posts: 184

Splitting Rows

Tom's solution is fine if you really need the specified structure (no more than 10  DIAGs per obs. and possibly multiple obs. per CLAIMID). It's good for a compact dump as generated by

proc print data=want ;

by claimid notsorted ;

id claimid ;

run ;

But for any kind of subsequent processing you are likely better served by either the given structure (one CLAIMID/DIAG pair per obs.) or a complete transpose (one obs. per CLAIMID). You can get SAS to discover the maximum number of DIAG values to be accommodated. Try

proc transpose data=diags out=wide prefix=diag ;

by claimid notsorted ;

var diag ;

run ;

Ask a Question
Discussion stats
  • 3 replies
  • 377 views
  • 0 likes
  • 3 in conversation