Hi
I am using Base SAS 9.3. I have a data file where one row is one patient visit. The diagnoses are labeled (and sequenced) as dx1 dxtype1 dx2 dxtype2 and so on. Note that there are up to 25 occurrences of diagnosis codes (though not every record will have data in all 25), and a diagnosis type only exists if there is a diagnosis in the same occurrence.
What I want to do is search through the data and find the diagnosis that is the type of "M", make it the first occurrence and resequence all other diagnoses within the abstract after that.
data test_grp;
input @1  AcctNo $4.
      @5  Dx1    $7.
      @12 DxTyp1 $1.
      @13 Dx2    $7.
      @20 DxTyp2 $1.
      @21 Dx3    $7.
      @28 DxTyp3 $1.
      @29 Dx4    $7.
      @36 DxTyp4 $1.;
cards;
0001T814   1K650   MY832   9B962   3
0002T810   MD62    1Y838   9O021   3
0004A047   1A047   2J189   MI350   3
0005A401   3I619   MJ9609  2Z515   1
0006Z548   MC61    3G809   3E669   3
;
run;
So in the case of acct# 0001, it should resequence to K650 M T814 1 Y832 9 B962 3. K650 will now be Dx1, M will be DxTyp1, T814 will be Dx2, DxTyp2 will be 1 and so on.
Is this possible? Thanks very much.
P.S.
How do I copy code into my postings so they are correctly formatted? If I just copy and paste then key words are deleted and the text becomes double spaced. Thanks.
You should be able to copy and paste if you use Firefox or Chrome, just not IE.
My suggestion would be to transpose the data, order it in the fashion you need, renumber it, then re-transpose.
Whether or not that is efficient depends on how big your data set is.
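A rough sketch of that idea, using two DATA steps and a sort in place of PROC TRANSPOSE. It assumes the test_grp data set from the question, with one row per AcctNo and four diagnosis pairs (change the 4s to 25 for the real file). The names long, want, seq, code, type, and prio are made up for illustration.

```sas
/* Go long: one row per diagnosis, with a sort priority. */
data long;
  set test_grp;
  array dx(4) dx1-dx4;
  array dt(4) dxtyp1-dxtyp4;
  do seq = 1 to 4;
    code = dx(seq);
    type = dt(seq);
    prio = (type ne 'M');          /* 0 for type M, 1 for everything else */
    if not missing(code) then output;
  end;
  keep acctno seq code type prio;
run;

/* Type M first, everything else in its original sequence. */
proc sort data=long;
  by acctno prio seq;
run;

/* Go wide again: refill the pairs in sorted order. */
data want;
  array dx(4) $7 dx1-dx4;
  array dt(4) $1 dxtyp1-dxtyp4;
  do k = 1 by 1 until (last.acctno);
    set long;
    by acctno;
    dx(k) = code;
    dt(k) = type;
  end;
  keep acctno dx1-dx4 dxtyp1-dxtyp4;
run;
```

If AcctNo can repeat across visits, a visit identifier would have to be added to the BY variables.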
hi ...it might be easier than this, but it works ...
data x;
/* $7. so that five-character codes such as J9609 are not truncated */
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1 K650 M Y832 9 B962 3
0002 T810 M D62 1 Y838 9 O021 3
0004 A047 1 A047 2 J189 M I350 3
0005 A401 3 I619 M J9609 2 Z515 1
0006 Z548 M C61 3 G809 3 E669 3
;
data y;
set x;
array dx(4) dx1-dx4;
array dt(4) dxtyp1-dxtyp4;
/* holding areas for the reordered codes and types */
array tx(4) $7 _temporary_;
array tt(4) $1 _temporary_;
/* position of the first type M diagnosis, 0 if there is none */
loc = whichc('M', of dxtyp:);
if loc > 0 then do;
  /* the type M pair goes first */
  tx(1) = dx(loc);
  tt(1) = dt(loc);
  /* the remaining pairs follow in their original order */
  k = 2;
  do j = 1 to 4;
    if j eq loc then continue;
    tx(k) = dx(j);
    tt(k) = dt(j);
    k = k + 1;
  end;
  /* copy the new sequence back to the real variables */
  do j = 1 to 4;
    dx(j) = tx(j);
    dt(j) = tt(j);
  end;
  /* temporary arrays are retained, so clear them for the next row */
  call missing(of tx(*), of tt(*));
end;
keep acct: dx: ;
run;
dx1    dx2    dx3    dx4    dxtyp1  dxtyp2  dxtyp3  dxtyp4  acctno
K650   T814   Y832   B962   M       1       9       3       0001
T810   D62    Y838   O021   M       1       9       3       0002
J189   A047   A047   I350   M       1       2       3       0004
I619   A401   J9609  Z515   M       3       2       1       0005
Z548   C61    G809   E669   M       3       3       3       0006
also a suggestion ... I used LIST INPUT, you were using FORMATTED INPUT to read your data
when your data has a regular pattern of column locations, you can save some keystrokes as follows ...
data test_grp;
input
@01 acctno $4.
@05 (dx1-dx4) ($7. +1)
@12 (dxtyp1-dxtyp4) ($1. +7)
;
<more>
hi ... on further thought, fewer arrays ...
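data x;
/* $7. so that five-character codes such as J9609 are not truncated */
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1 K650 M Y832 9 B962 3
0002 T810 M D62 1 Y838 9 O021 3
0004 A047 1 A047 2 J189 M I350 3
0005 A401 3 I619 M J9609 2 Z515 1
0006 Z548 M C61 3 G809 3 E669 3
;
data y;
/* long enough for 4 codes plus separators, lengthen for 25 diagnoses */
length ddxx ddtt $50;
set x;
array dx(4) dx1-dx4;
array dt(4) dxtyp1-dxtyp4;
/* position of the first type M diagnosis, 0 if there is none */
loc = whichc('M', of dxtyp:);
if loc > 0 then do;
  /* start each slash-delimited list with the type M pair */
  ddxx = catx('/', ddxx, dx(loc));
  ddtt = catx('/', ddtt, dt(loc));
  /* append the remaining pairs in their original order */
  do j = 1 to 4;
    if j eq loc then continue;
    ddxx = catx('/', ddxx, dx(j));
    ddtt = catx('/', ddtt, dt(j));
  end;
  /* read the lists back into the variables in the new order */
  do j = 1 to 4;
    dx(j) = scan(ddxx, j, '/');
    dt(j) = scan(ddtt, j, '/');
  end;
end;
keep acct: dx: ;
run;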
The quick responses on this site amaze me, thanks Mike and Reeza!
I will try your latest one, Mike, and see how it goes. One more wrinkle: what if I have another type to sequence? For example, I want type 6 (of which there will only ever be one, just like M) in the second occurrence.
Thanks again!
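One possible way to handle a second priority type, sketched as a variation on the temporary-array approach: make several passes over the pairs, taking type M on the first pass, type 6 on the second, and everything else on the third. This is only a sketch and assumes the data set x from the earlier reply with four diagnosis pairs. The names want, pass, take, and k are made up for illustration.

```sas
data want;
  set x;
  array dx(4) dx1-dx4;
  array dt(4) dxtyp1-dxtyp4;
  array tx(4) $7 _temporary_;   /* reordered codes */
  array tt(4) $1 _temporary_;   /* reordered types */
  k = 0;
  do pass = 1 to 3;
    do j = 1 to 4;
      /* decide whether this pass picks up pair j */
      if pass = 1 then take = (dt(j) = 'M');
      else if pass = 2 then take = (dt(j) = '6');
      else take = (dt(j) not in ('M', '6'));
      if take then do;
        k = k + 1;
        tx(k) = dx(j);
        tt(k) = dt(j);
      end;
    end;
  end;
  /* every pair is picked up exactly once, so all 4 slots are refilled */
  do j = 1 to 4;
    dx(j) = tx(j);
    dt(j) = tt(j);
  end;
  drop k pass j take;
run;
```

Pairs within the same pass keep their original relative order, and a row with no M or no 6 simply passes through with the others unchanged in sequence.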
It looks like you are sorting something. The data is stolen from MikeZ.
data x;
/* $7. so that five-character codes such as J9609 are not truncated */
informat dx1-dx4 $7. dxtyp1-dxtyp4 $1.;
input acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
format acct: z4.;
datalines;
0001 T814 1 K650 M Y832 9 B962 3
0002 T810 M D62 1 Y838 9 O021 3
0004 A047 1 A047 2 J189 M I350 3
0005 A401 3 I619 M J9609 2 Z515 1
0006 Z548 M C61 3 G809 3 E669 3
;
run;
data want;
set x;
array dx{4} dx1-dx4;
array dt{4} dxtyp1-dxtyp4;
do i=1 to 4;
  do j=i+1 to 4;
    if dt{j}='M' then do;
      /* swap pair j with pair i; this can change the relative order of the other pairs */
      _dx=dx{j}; _dt=dt{j};
      dx{j}=dx{i}; dt{j}=dt{i};
      dx{i}=_dx; dt{i}=_dt;
    end;
  end;
end;
drop i j _: ;
run;
Ksharp
Since all of the variables involved are of the same type (character), one array can also do:
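data want;
/* RETAIN before SET only fixes the variable order in the output */
retain acctno dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
set x;
array d dx1 dxtyp1 dx2 dxtyp2 dx3 dxtyp3 dx4 dxtyp4;
/* find the first type M, save that pair, and blank it out in place */
do i=1 to dim(d);
  if d(i)='M' then do;
    _dx=d(i-1);
    _dt=d(i);
    call missing(d(i-1),d(i));
    leave;
  end;
end;
/* saved pair first, then whatever is left, space-delimited */
_cat=catx(' ',_dx,_dt,catx(' ', of d(*)));
do i=1 to dim(d);
  d(i)=scan(_cat,i);
end;
drop i _:;
run;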
Haikuo
Mike and Bian Hai Kuo,
Your code works if there is only one M. But what if there are sometimes two or more M's?
Check it with this record:
0001 T814 1 K650 M Y832 M B962 3
Ksharp
Ksharp:
Quote from OP's first post:
"What I want to do is search through the data and find the diagnosis that is the type of "M", make it the first occurrence and resequence all other diagnoses within the abstract after that."
It looks like the OP only cares about the first 'M', and then resequencing whatever is left.
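For what it's worth, WHICHC returns the position of the first match, so code that relies on it moves only the first 'M' and leaves any later one in sequence with the rest. A quick check, using the types from the two-M record above as literal arguments:

```sas
data _null_;
  /* types from the record with two M's: 1, M, M, 3 */
  loc = whichc('M', '1', 'M', 'M', '3');
  put loc=;   /* writes loc=2 to the log */
run;
```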
Haikuo