shellp55
Quartz | Level 8

Hi

I am using Base SAS 9.3.  I have a data file where one row is one patient visit.  Diagnoses are labeled (and sequenced) as Dx1 DxTyp1 Dx2 DxTyp2 and so on.  Note that there are up to 25 occurrences of diagnosis codes (though not every record will have data in all 25), and a diagnosis type only exists if there is a diagnosis in the same occurrence.

What I want to do is search through the data and find the diagnosis that has type "M", make it the first occurrence, and resequence all other diagnoses within the abstract after it.

data test_grp;
input    @1 AcctNo  $4.
   @5 Dx1  $7.
   @12 DxTyp1  $1.
   @13 Dx2  $7.
   @20 DxTyp2  $1.
   @21 Dx3  $7.
   @28 DxTyp3  $1.
   @29 Dx4  $7.
   @36 DxTyp4  $1.;
cards;
0001T814   1 K650  M Y832  9 B962  3
0002T810   M D62   1 Y838  9 O021  3
0004A047   1 A047  2 J189  M I350  3
0005A401   3 I619  M J9609 2 Z515  1
0006Z548   M C61   3 G809  3 E669  3
;
run;

So in the case of acct# 0001, it should resequence to K650 M T814 1 Y832 9 B962 3.  K650 will now be Dx1, M will be DxTyp1, T814 will be Dx2, DxTyp2 will be 1 and so on.

Is this possible?  Thanks very much.

9 REPLIES
shellp55
Quartz | Level 8

P.S.

How do I copy code into my postings so they are correctly formatted?  If I just copy and paste then key words are deleted and the text becomes double spaced.  Thanks.

Reeza
Super User

You should be able to copy and paste if you use FireFox or Chrome, just not IE.

My suggestion would be to transpose the data, order it in the fashion you need, renumber it, then retranspose.

Whether or not this is efficient depends on how big your data set is.
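One way the transpose, order, renumber, retranspose idea could look is sketched below. It is only a sketch: it reads the test_grp data set from the question, and the names long, wide, visit, seq, prio, code, type and n are made up for illustration. A DATA step is used for the reshaping instead of PROC TRANSPOSE, because the diagnosis and its type have to travel together.

```sas
/* Sketch only: long, wide, visit, seq, prio, code, type, n are made-up names */

/* Step 1: one row per diagnosis, skipping empty slots */
data long;
   set test_grp;
   array dx(4) Dx1-Dx4;
   array dt(4) DxTyp1-DxTyp4;
   length code $7 type $1;
   visit = _n_;                      /* row id, in case AcctNo repeats */
   do seq = 1 to dim(dx);
      if not missing(dx(seq)) then do;
         code = dx(seq);
         type = dt(seq);
         prio = (type ne 'M');       /* 0 for type M, 1 for everything else */
         output;
      end;
   end;
   keep AcctNo visit seq prio code type;
run;

/* Step 2: type M first, the rest in their original order */
proc sort data=long;
   by visit prio seq;
run;

/* Step 3: rebuild one row per visit, renumbering as we go */
data wide;
   array dx(4) $7 Dx1-Dx4;
   array dt(4) $1 DxTyp1-DxTyp4;
   do n = 1 by 1 until (last.visit);
      set long;
      by visit;
      dx(n) = code;
      dt(n) = type;
   end;
   keep AcctNo Dx1-Dx4 DxTyp1-DxTyp4;
run;
```

For the real data the 4s would become 25. Note that a visit with no diagnoses at all never reaches the long data set, so it would be missing from wide and would need to be merged back in.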

MikeZdeb
Rhodochrosite | Level 12

hi ...it might be easier than this, but it works ...

data x;
/* $7. matches the width in the original post, so 5-character codes such as J9609 are not truncated */
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1  K650  M Y832  9  B962  3
0002 T810 M  D62   1 Y838  9  O021  3
0004 A047 1  A047  2 J189  M  I350  3
0005 A401 3  I619  M J9609 2  Z515  1
0006 Z548 M  C61   3 G809  3  E669  3
;

data y;
set x;
array dx(4) dx1-dx4;
array dt(4) dxtyp1-dxtyp4;
array tx(4) $7 _temporary_;
array tt(4) $1 _temporary_;
/* position of the type M diagnosis (0 if there is none) */
loc = whichc('M', of dxtyp:);
/* nothing to move when there is no M or it is already first */
if loc > 1 then do;
   tx(1) = dx(loc);
   tt(1) = dt(loc);
   k=2;
   /* copy the remaining diagnoses in their original order */
   do j=1 to 4;
      if j eq loc then continue;
      tx(k) = dx(j);
      tt(k) = dt(j);
      k = k + 1;
   end;
   do j=1 to 4;
      dx(j) = tx(j);
      dt(j) = tt(j);
   end;
end;
call missing(of tx(*), of tt(*));
keep acct: dx: ;
run;

dx1     dx2     dx3     dx4     dxtyp1    dxtyp2    dxtyp3    dxtyp4    acctno
K650    T814    Y832    B962      M         1         9         3        0001
T810    D62     Y838    O021      M         1         9         3        0002
J189    A047    A047    I350      M         1         2         3        0004
I619    A401    J9609   Z515      M         3         2         1        0005
Z548    C61     G809    E669      M         3         3         3        0006

also a suggestion ... I used LIST INPUT, you were using FORMATTED INPUT to read your data

when your data has a regular pattern of column locations, you can save some keystrokes as follows ...

data test_grp;
input
@01 acctno  $4.
@05 (dx1-dx4) ($7. +1)
@12 (dxtyp1-dxtyp4) ($1. +7)
;
<more>

MikeZdeb
Rhodochrosite | Level 12

hi ... on further thought, fewer arrays ...

data x;
/* $7. so that 5-character codes such as J9609 are not truncated */
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1  K650  M Y832  9  B962  3
0002 T810 M  D62   1 Y838  9  O021  3
0004 A047 1  A047  2 J189  M  I350  3
0005 A401 3  I619  M J9609 2  Z515  1
0006 Z548 M  C61   3 G809  3  E669  3
;

data y;
length ddxx ddtt $50;
set x;
array dx(4) dx1-dx4;
array dt(4) dxtyp1-dxtyp4;
loc = whichc('M', of dxtyp:);
/* put the type M diagnosis first (skipped when there is no M, loc=0) */
if loc > 0 then do;
   ddxx = catx('/', ddxx, dx(loc));
   ddtt = catx('/', ddtt, dt(loc));
end;
/* append the others in their original order */
do j=1 to 4;
   if j eq loc then continue;
   ddxx = catx('/', ddxx, dx(j));
   ddtt = catx('/', ddtt, dt(j));
end;
/* split the two lists back into the variables */
do j=1 to 4;
   dx(j) = scan(ddxx,j,'/');
   dt(j) = scan(ddtt,j,'/');
end;
keep acct: dx: ;
run;

shellp55
Quartz | Level 8

The quick responses on this site amaze me, thanks Mike and Reeza!

I will try your latest one, Mike, and see how it goes.  One more wrinkle:  what if I have another type to sequence?  For example, I want type 6 (of which there will only ever be one, just like M) in the second occurrence.

Thanks again!
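A possible way to handle the extra wrinkle is to make several passes over the arrays, one per priority group, and copy the matching diagnoses into temporary arrays in that order. The sketch below assumes the data set x from Mike's reply; the names want, pass, grp and k are made up for illustration, and it has not been run against the real data.

```sas
/* Sketch only: want, pass, grp and k are made-up names; x is Mike's test data */
data want;
   set x;
   array dx(4) dx1-dx4;
   array dt(4) dxtyp1-dxtyp4;
   array tx(4) $7 _temporary_;
   array tt(4) $1 _temporary_;
   k = 0;
   /* pass 1 takes type M, pass 2 takes type 6, pass 3 takes the rest */
   do pass = 1 to 3;
      do j = 1 to dim(dx);
         if missing(dx(j)) then continue;      /* skip empty slots */
         if dt(j) = 'M' then grp = 1;
         else if dt(j) = '6' then grp = 2;
         else grp = 3;
         if grp = pass then do;
            k = k + 1;
            tx(k) = dx(j);
            tt(k) = dt(j);
         end;
      end;
   end;
   /* write the reordered values back; clear any slots left over */
   do j = 1 to dim(dx);
      if j <= k then do;
         dx(j) = tx(j);
         dt(j) = tt(j);
      end;
      else call missing(dx(j), dt(j));
   end;
   drop j k pass grp;
run;
```

Within each group the original order is kept, so the other diagnoses stay in sequence. For the real data the 4s would become 25, and further priority types would mean another grp value and one more pass.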

Ksharp
Super User

It looks like you are sorting something. The data is stolen from MikeZ .

data x;
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1  K650  M Y832  9  B962  3
0002 T810 M  D62   1 Y838  9  O021  3
0004 A047 1  A047  2 J189  M  I350  3
0005 A401 3  I619  M J9609 2  Z515  1
0006 Z548 M  C61   3 G809  3  E669  3
;
run;
data want;
 set x;
 array dx{4} dx1-dx4;
 array dt{4} dxtyp1-dxtyp4;
 /* swap any later type M pair into slot i; note that the other diagnoses may change relative order */
 do i=1 to 4;
  do j=i+1 to 4;
   if dt{j}='M' then do;
    _dx=dx{j}; _dt=dt{j};
    dx{j}=dx{i}; dt{j}=dt{i};
    dx{i}=_dx; dt{i}=_dt;
   end;
  end;
 end;
 drop i j _: ;
run;



Ksharp

Haikuo
Onyx | Level 15

Since all of the variables involved are of the same type (character), one array can also do:

data want;
  retain acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
  set x;
  array d dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
  do i=1 to dim(d);
     if d(i)='M' then do;
        _dx=d(i-1); _dt=d(i);
        call missing(d(i-1),d(i));
        leave;
     end;
  end;
  _cat=catx(' ',_dx,_dt,catx(' ', of d(*)));
  do i=1 to dim(d);
     d(i)=scan(_cat,i);
  end;
  drop i _:;
run;

Haikuo

Ksharp
Super User

Mike and Bian Hai Kuo ,

Your code works if there is only one M. But what if there are two or more M's?

Check it with this record:

0001 T814 1  K650  M Y832  M B962  3

Ksharp

Haikuo
Onyx | Level 15

Ksharp:

Quote from OP's first post:

"What I want to do is search through the data and find the diagnosis that is the type of "M", make it the first occurrence and resequence all other diagnoses within the abstract after that."

It looks like the OP only cares about the first 'M'; the rest are then resequenced after it in whatever order they appear.

Haikuo
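Whether a record can really carry more than one M is easy to check before picking an approach. A minimal sketch, again assuming the data set x from Mike's reply (check_m and n_m are made-up names):

```sas
/* Sketch only: check_m and n_m are made-up names */
data check_m;
   set x;
   array dt(4) dxtyp1-dxtyp4;
   n_m = 0;
   do j = 1 to dim(dt);
      n_m = n_m + (dt(j) = 'M');   /* the comparison returns 1 or 0 */
   end;
   if n_m ne 1;                    /* keep records with no M or more than one M */
   drop j;
run;
```

If check_m comes back empty, every record has exactly one type M diagnosis and any of the solutions above that move the first M will do.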
