I am a new SAS user transitioning from Stata and need some help with arrays. Help would be greatly appreciated!
There are six columns of diagnosis codes for 20 patients (all fake data).
For each variable below, use the algorithm to write the logic to select and code the IDs correctly, then list the IDs that will be in each new variable.
HBP
Here is the data:
ID | dx1 | dx2 | dx3 | dx4 | dx5 | dx6 |
1 | 401 | 38 | 789.00 | 274.9 | 462 | 536.8 |
2 | 255.5 | 346.90 | 402 | 308.0 | 305.00 | 707.81 |
3 | 403 | 033 | 023 | 580 | 38 | 995.3 |
4 | 455.6 | 401 | 726.32 | 845.00 | 388.30 | 463 |
5 | 1 | 274.0 | 403 | 244.9 | 785.5 | 405 |
6 | 302.71 | 095 | 584 | 708.0 | 787.3 | 487.1 |
7 | 401 | 845.00 | 627.2 | 38 | 723.1 | 311.0 |
8 | 001 | 786.2 | 458.0 | 724.2 | 020.0 | 523.0 |
9 | 473.9 | 401 | 729.1 | 005.9 | 250.0 | 625.4 |
10 | 841.9 | 790.7 | 617.0 | 038 | 780.79 | 715.00 |
11 | 785.5 | 625.3 | 034 | 724.3 | 053.9 | 842.01 |
12 | 309.1 | 564.00 | 402 | 304.00 | 405 | 729.5 |
13 | 460 | 555.9 | 401 | 723.1 | 719.42 | 117.9 |
14 | 625.9 | 304.00 | 787.02 | 401 | 844.9 | 112.0 |
15 | 402 | 038 | 786.2 | 460 | 112.81 | 780.51 |
16 | 078.1 | 847.2 | 790.7 | 354.0 | 402 | 346.1 |
17 | 780.8 | 402 | 785.1 | 458.0 | 278.0 | 783.2 |
18 | 351.0 | 274.0 | 095 | 783.2 | 286.2 | 719.47 |
19 | 724.4 | 020.00 | 729.5 | 562.11 | 787.91 | 460 |
20 | 95 | 580 | 286.2 | 564.00 | 117.9 | 518.85 |
Here's an example for your first one. You should be able to extrapolate from it.
data want;
  set have;
  array dx(6) dx1-dx6;
  hbp=0;
  do i=1 to dim(dx);
    if dx(i) in (401, 402, 403, 404, 405) then hbp=1;
  end;
run;
Thank you so much!
Do I have to run each diagnosis separately (array ... to end, run)? Or is there a way to do it all at once?
EXAMPLE.
data want;
set have;
array dx(6) dx1-dx6;
hbp=0;
do i=1 to dim(dx);
if dx(i) in (401, 402, 403, 404, 405) then hbp=1;
mspesis=0;
do i=1 to dim(dx);
if dx(i) in (038,020.0,790.7,117.9,112.81) then mspesis=1;
etc.
end;
run;
You can put all your IF statements in the same DO loop.
Make sure to initialize the variables to 0 before the loop.
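As a minimal sketch of that advice, the step below sets two flags in one pass over the array. It assumes the dataset is named have and that dx1-dx6 are numeric, as in the question. The code lists are copied from the question's example, and the flag name msepsis is only illustrative.

```sas
data want;
  set have;                       /* assumes numeric dx1-dx6, as in the question */
  array dx(6) dx1-dx6;
  hbp = 0;                        /* initialize every flag before the loop */
  msepsis = 0;
  do i = 1 to dim(dx);            /* one pass over the six diagnosis columns */
    if dx(i) in (401, 402, 403, 404, 405) then hbp = 1;
    if dx(i) in (038, 020.0, 790.7, 117.9, 112.81) then msepsis = 1;
  end;
  drop i;                         /* the loop counter is not needed in the output */
run;
```

Each additional diagnosis group needs only one more initialization line and one more IF statement inside the same loop.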
It is a little easier if you store your DX codes as character. Most usage I have seen stores them without the periods.
data have ;
length id 8 dx1-dx6 $6 ;
input id dx1-dx6 ;
cards;
1 401 038 78900 2749 462 5368
2 2555 34690 402 3080 30500 70781
3 403 033 023 580 038 9953
4 4556 401 72632 84500 38830 463
5 001 2740 403 2449 7855 405
6 30271 095 584 7080 7873 4871
7 401 84500 6272 038 7231 3110
8 001 7862 4580 7242 0200 5230
9 4739 401 7291 0059 2500 6254
10 8419 7907 6170 038 78079 71500
11 7855 6253 034 7243 0539 84201
12 3091 56400 402 30400 405 7295
13 460 5559 401 7231 71942 1179
14 6259 30400 78702 401 8449 1120
15 402 038 7862 460 11281 78051
16 0781 8472 7907 3540 402 3461
17 7808 402 7851 4580 2780 7832
18 3510 2740 095 7832 2862 71947
19 7244 02000 7295 56211 78791 460
20 095 580 2862 56400 1179 51885
;;;;
data want ;
set have ;
array codes (4) $200 _temporary_
('401 402 403 404 405'
,'038 200 7907 1179 11281'
,'001 023 033 034 095'
,'51881 51882 51885 458 7855 7855 584 580 570 572 2862 2866'
) ;
array flags (4) hbp msepsis asepsis organ_dys ;
array dx (6);
do i=1 to dim(flags) ;
do j=1 to dim(dx) until (flags(i)=1);
flags(i)=0^=indexw(codes(i),dx(j));
end;
end;
drop i j ;
run;
data expected;
input id hbp msepsis asepsis organ_dys ;
cards;
1 1 1 0 0
2 1 0 0 0
3 1 1 1 1
4 1 0 0 0
5 1 0 1 1
6 0 0 1 1
7 1 1 0 0
8 0 0 1 0
9 1 0 0 0
10 0 1 0 0
11 0 0 1 1
12 1 0 0 0
13 1 1 0 0
14 1 0 0 0
15 1 1 0 0
16 1 1 0 0
17 1 0 0 0
18 0 0 1 1
19 0 0 0 0
20 0 1 1 1
run;
proc compare data=expected compare=want ;
id id;
run;
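The assignment inside the inner loop above is terse, so here is a small sketch of what it relies on. INDEXW returns the position of a whole word in a string and 0 when the word is absent, so comparing the result with 0 yields a 1/0 flag. The variable names and sample values below are made up for illustration.

```sas
data _null_;
  codes = '401 402 403 404 405';
  pos_hit  = indexw(codes, '402');   /* nonzero: '402' appears as a whole word */
  pos_miss = indexw(codes, '40');    /* 0: '40' is only part of a word */
  flag_hit  = (0 ^= pos_hit);        /* comparison evaluates to 1 (true) */
  flag_miss = (0 ^= pos_miss);       /* comparison evaluates to 0 (false) */
  put pos_hit= pos_miss= flag_hit= flag_miss=;   /* writes the values to the log */
run;
```

Because INDEXW matches whole words only, a short code cannot accidentally match the start of a longer one in the same list.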