Re: Renumber Occurrences

shellp55 · Posted 10-02-2012 04:37 PM

Hi

I am using Base SAS 9.3. I have a data file where each record is one patient visit. The diagnoses are labeled (and sequenced) as dx1 dxtyp1 dx2 dxtyp2, and so on. There are up to 25 occurrences of diagnosis codes (though not every record will have data in all 25), and a diagnosis type only exists if there is a diagnosis in the same occurrence.

What I want to do is search through the data and find the diagnosis that is the type of "M", make it the first occurrence and resequence all other diagnoses within the abstract after that.

data test_grp;
input    @1 AcctNo $4.
   @5 Dx1 $7.
   @12 DxTyp1 $1.
   @13 Dx2 $7.
   @20 DxTyp2 $1.
   @21 Dx3 $7.
   @28 DxTyp3 $1.
   @29 Dx4 $7.
   @36 DxTyp4 $1.;
/* data lines are aligned to the column positions in the INPUT statement */
cards;
0001T814   1K650   MY832   9B962   3
0002T810   MD62    1Y838   9O021   3
0004A047   1A047   2J189   MI350   3
0005A401   3I619   MJ9609  2Z515   1
0006Z548   MC61    3G809   3E669   3
;
run;

So in the case of acct# 0001, it should resequence to K650 M T814 1 Y832 9 B962 3. K650 will now be Dx1, M will be DxTyp1, T814 will be Dx2, DxTyp2 will be 1 and so on.

Is this possible? Thanks very much.

shellp55 · Posted 10-02-2012 04:39 PM

P.S.

How do I copy code into my postings so they are correctly formatted? If I just copy and paste then key words are deleted and the text becomes double spaced. Thanks.

Reeza · Posted 10-02-2012 05:06 PM

You should be able to copy and paste if you use Firefox or Chrome, just not IE.

My suggestion would be to transpose the data, order it the way you need, renumber it, then re-transpose.

Whether this is efficient depends on how big your data set is.
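
The transpose, reorder, renumber, re-transpose idea could be sketched roughly as below. This is only a sketch under assumptions: it uses three sample rows from the thread read with list input, it assumes each AcctNo appears on a single record, and the names VISITS, LONG, WIDE, Code, Typ, seq, priority and newseq are made up for illustration. It uses DATA steps for both transposes; PROC TRANSPOSE would be another option.

```sas
/* Three sample rows from the thread, read with list input for brevity. */
data visits;
   length AcctNo $4 Dx1-Dx4 $7 DxTyp1-DxTyp4 $1;
   input AcctNo Dx1 DxTyp1 Dx2 DxTyp2 Dx3 DxTyp3 Dx4 DxTyp4;
   datalines;
0001 T814 1 K650 M Y832 9 B962 3
0004 A047 1 A047 2 J189 M I350 3
0005 A401 3 I619 M J9609 2 Z515 1
;
run;

/* 1. Transpose wide to long: one row per diagnosis. */
data long;
   set visits;
   array d{4} Dx1-Dx4;
   array t{4} DxTyp1-DxTyp4;
   length Code $7 Typ $1;
   do seq = 1 to dim(d);
      if not missing(d{seq}) then do;
         Code     = d{seq};
         Typ      = t{seq};
         priority = (Typ ne 'M');   /* 0 for type M, 1 otherwise */
         output;
      end;
   end;
   keep AcctNo Code Typ seq priority;
run;

/* 2. Order: type M first, then the original sequence. */
proc sort data=long;
   by AcctNo priority seq;
run;

/* 3. Renumber and rebuild one row per account. */
data wide;
   set long;
   by AcctNo;
   array d{4} $7 Dx1-Dx4;
   array t{4} $1 DxTyp1-DxTyp4;
   retain Dx1-Dx4 DxTyp1-DxTyp4;
   if first.AcctNo then do;
      call missing(of d{*}, of t{*});
      newseq = 0;
   end;
   newseq + 1;
   d{newseq} = Code;
   t{newseq} = Typ;
   if last.AcctNo then output;
   keep AcctNo Dx1-Dx4 DxTyp1-DxTyp4;
run;
```

Note that a record with no diagnoses at all produces no rows in LONG and so would drop out of WIDE; with 25 occurrences the array dimensions would need to change accordingly.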

MikeZdeb · Posted 10-02-2012 06:15 PM

hi ...it might be easier than this, but it works ...

data x;
/* $7. matches the original layout, so J9609 is not truncated */
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1 K650 M Y832 9 B962 3
0002 T810 M D62 1 Y838 9 O021 3
0004 A047 1 A047 2 J189 M I350 3
0005 A401 3 I619 M J9609 2 Z515 1
0006 Z548 M C61 3 G809 3 E669 3
;

data y;
set x;
array dx(4) dx1-dx4;
array dt(4) dxtyp1-dxtyp4;
array tx(4) $7 _temporary_;
array tt(4) $1 _temporary_;
/* position of the first type M diagnosis (0 if there is none) */
loc = whichc('M', of dxtyp:);
/* no type M on this record: write it out unchanged */
if loc = 0 then return;
/* the type M pair goes first ... */
tx(1) = dx(loc);
tt(1) = dt(loc);
k=2;
/* ... then every other pair, in its original order */
do j=1 to 4;
if j eq loc then continue;
tx(k) = dx(j);
tt(k) = dt(j);
k = k + 1;
end;
/* copy the reordered values back */
do j=1 to 4;
dx(j) = tx(j);
dt(j) = tt(j);
end;
call missing(of tx(*), of tt(*));
keep acct: dx: ;
run;

dx1   dx2   dx3    dx4   dxtyp1 dxtyp2 dxtyp3 dxtyp4 acctno
K650  T814  Y832   B962  M      1      9      3      0001
T810  D62   Y838   O021  M      1      9      3      0002
J189  A047  A047   I350  M      1      2      3      0004
I619  A401  J9609  Z515  M      3      2      1      0005
Z548  C61   G809   E669  M      3      3      3      0006

also a suggestion ... I used LIST INPUT, you were using FORMATTED INPUT to read your data

when your data has a regular pattern of column locations, you can save some keystrokes as follows ...

data test_grp;
input
@01 acctno $4.
@05 (dx1-dx4) ($7. +1)
@12 (dxtyp1-dxtyp4) ($1. +7)
;

<more>

MikeZdeb · Posted 10-02-2012 07:03 PM

hi ... on further thought, fewer arrays ...

data x;
/* $7. matches the original layout, so J9609 is not truncated */
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1 K650 M Y832 9 B962 3
0002 T810 M D62 1 Y838 9 O021 3
0004 A047 1 A047 2 J189 M I350 3
0005 A401 3 I619 M J9609 2 Z515 1
0006 Z548 M C61 3 G809 3 E669 3
;

data y;
length ddxx ddtt $50;
set x;
array dx(4) dx1-dx4;
array dt(4) dxtyp1-dxtyp4;
/* position of the first type M diagnosis (0 if there is none) */
loc = whichc('M', of dxtyp:);
/* no type M on this record: write it out unchanged */
if loc = 0 then return;
/* build '/'-delimited lists with the type M pair first */
ddxx = catx('/', ddxx, dx(loc));
ddtt = catx('/', ddtt, dt(loc));
do j=1 to 4;
if j eq loc then continue;
ddxx = catx('/', ddxx, dx(j));
ddtt = catx('/', ddtt, dt(j));
end;
/* split the lists back into the original variables */
do j=1 to 4;
dx(j) = scan(ddxx,j,'/');
dt(j) = scan(ddtt,j,'/');
end;
keep acct: dx: ;
run;

shellp55 · Posted 10-02-2012 07:56 PM

The quick responses on this site amaze me, thanks Mike and Reeza!

I will try your latest one, Mike, and see how it goes. One more wrinkle: what if I have another type to sequence? For example, I want type 6 (of which there will only ever be one, just like M) in the second occurrence.

Thanks again!
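
One way the extra type could be handled is sketched below, as an assumption rather than anything posted in the thread: rank each diagnosis (M first, 6 second, everything else third) and copy the pairs out in rank order. The sample rows reuse codes from the thread, but the type 6 values are made up for illustration, and VISITS2, WANT2, tx, tt, pass, rank and k are hypothetical names.

```sas
/* Sample rows in the thread's layout; the type 6 values are invented. */
data visits2;
   length AcctNo $4 Dx1-Dx4 $7 DxTyp1-DxTyp4 $1;
   input AcctNo Dx1 DxTyp1 Dx2 DxTyp2 Dx3 DxTyp3 Dx4 DxTyp4;
   datalines;
0001 T814 1 K650 6 Y832 M B962 3
0002 T810 M D62 1 Y838 9 O021 6
;
run;

data want2;
   set visits2;
   array dx{4} Dx1-Dx4;
   array dt{4} DxTyp1-DxTyp4;
   array tx{4} $7 _temporary_;
   array tt{4} $1 _temporary_;
   k = 0;
   /* Pass 1 copies type M, pass 2 copies type 6, pass 3 copies the rest; */
   /* within each pass the original order is kept.                        */
   do pass = 1 to 3;
      do j = 1 to dim(dx);
         if missing(dx{j}) then continue;
         if dt{j} = 'M' then rank = 1;
         else if dt{j} = '6' then rank = 2;
         else rank = 3;
         if rank = pass then do;
            k = k + 1;
            tx{k} = dx{j};
            tt{k} = dt{j};
         end;
      end;
   end;
   /* Write the reordered pairs back and blank any unused trailing slots. */
   do j = 1 to dim(dx);
      if j <= k then do;
         dx{j} = tx{j};
         dt{j} = tt{j};
      end;
      else call missing(dx{j}, dt{j});
   end;
   drop j k pass rank;
run;
```

With these rows, account 0001 should come out as Y832 M, K650 6, T814 1, B962 3. If a record has a type 6 but no type M, the type 6 diagnosis lands in the first occurrence rather than the second, so that case would need its own rule.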

Ksharp · Posted 10-02-2012 11:22 PM

It looks like you are sorting something. The data is stolen from MikeZ.

data x;
/* $7. matches the original layout, so J9609 is not truncated */
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1  K650  M Y832  9  B962  3
0002 T810 M  D62   1 Y838  9  O021  3
0004 A047 1  A047  2 J189  M  I350  3
0005 A401 3  I619  M J9609 2  Z515  1
0006 Z548 M  C61   3 G809  3  E669  3
;
run;
data want;
 set x;
array dx{4} dx1-dx4;
array dt{4} dxtyp1-dxtyp4;
/* Bubble each 'M' toward the front by swapping adjacent pairs, */
/* so the other diagnoses keep their original relative order.   */
do i=1 to 3;
 do j=4 to i+1 by -1;
  if dt{j}='M' and dt{j-1} ne 'M' then do;
   _dx=dx{j};      _dt=dt{j};
   dx{j}=dx{j-1};  dt{j}=dt{j-1};
   dx{j-1}=_dx;    dt{j-1}=_dt;
  end;
 end;
end;
drop i j _: ;
run;

Ksharp

Haikuo · Posted 10-03-2012 09:11 AM

Since all of the variables involved are of the same type (character), one array can also do:

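data want;
retain acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
length _dx $7 _dt $1;
set x;
array d dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
/* the types sit in the even positions, so only those are tested */
do i=2 to dim(d) by 2;
if d(i)='M' then do;
_dx=d(i-1);_dt=d(i); call missing(d(i-1),d(i));leave;
end;
end;
/* catx skips the blanked-out pair, so the rest close up behind the M pair */
_cat=catx(' ',_dx,_dt,catx(' ', of d(*)));
do i=1 to dim(d);
d(i)=scan(_cat,i,' ');
end;
drop i _:;
run;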

Haikuo

Ksharp · Posted 10-03-2012 11:08 PM

Mike and Bian Hai Kuo,

Your code works if there is only one M. But what if there are sometimes two or more?

Check it.

0001 T814 1 K650 M Y832 M B962 3

Ksharp

Haikuo · Posted 10-03-2012 11:33 PM

Ksharp:

Quote from OP's first post:

"What I want to do is search through the data and find the diagnosis that is the type of "M", make it the first occurrence and resequence all other diagnoses within the abstract after that."

It looks like the OP only cares about the first 'M', and then resequencing whatever is left.

Haikuo
