BookmarkSubscribeRSS Feed
CathyVI
Lapis Lazuli | Level 10

Hello,

I want to use array to find unique icd code in a dataset. The first code is an icd diagnoses code and the second is a procedure code. My dataset has millions of record. The diagnosis code is from 1-12 and the procedure code is from 1-25. For the purposes of the example , i used 1-2 for diagnosis and 1-3 for procedure code.  I would like to find any records with either one or multiple codes in the encounter so I can subset to only icd codes of interest. My code below is not doing what I want. Thanks in advance.

 /* Sample data lines */
   data have;
      input patient_id $ 
          proc1 $ 
          proc2 $ 
          diag1 $ 
          diag2 $;
    datalines;
001 A0H7XJZ B4JZZZZ F1020 Z864
002 H0ZXPZZ K4MZZZZ E669 Z733
003 C02DXZZ D0Y0KZZ J45909 F419
;
data medical_records;
set have;
    array proc_code[2] $7 proc1-proc2;    
    array diag_code[3] $10 diag1-diag2;   
    do i = 1 to 2;
 if proc_code [i] in ("H0ZXPZZ", "D0Y0KZZ") then valid_proc=1; else valid_proc=0;
end;
do i = 1 to 3;
if diag_code [i] in ("F1020") then valid_diag=1; else valid_diag =0;
end;
    drop i;
run;



Expected outcome

Patient_id

proc1

proc2

diag1

diag2

diag3

valid_proc

valid_diag

1

A0H7XJZ

B4JZZZZ

F1020

Z864

Z125

0

1

2

H0ZXPZZ

K4MZZZZ

E669

Z733

J492

1

0

3

C02DXZZ

D0Y0KZZ

J45909

F419

F1020

1

1

 

 
 
2 REPLIES 2
Tom
Super User Tom
Super User

At a minimum remove the ELSE clause.  With that the result is always just going to reflect the last variable in the array.  Set the flag(s) to FALSE before the loop(s). 

 

Or just set the flag to the result of the test and stop checking once you find a match.

data medical_records;
  set have;
  array proc_code proc1-proc2;    
  array diag_code diag1-diag2;  
  do i = 1 to dim(proc_code) until(valid_proc);
    valid_proc = proc_code[i] in ("H0ZXPZZ" "D0Y0KZZ");
  end;
  do i = 1 to dim(diag_code) until(valid_diag);
    valid_diag = diag_code[i] in ("F1020") ;
  end;
  drop i;
run;

 

Ksharp
Super User

It is just a querying issue. You could get this by PROC FORMAT or Hash Table.

  data have;
      input patient_id $ 
          proc1 $ 
          proc2 $ 
          diag1 $ 
          diag2 $;
    datalines;
001 A0H7XJZ B4JZZZZ F1020 Z864
002 H0ZXPZZ K4MZZZZ E669 Z733
003 C02DXZZ D0Y0KZZ J45909 F419
;








data _proc;
input _proc $;
cards;
H0ZXPZZ 
D0Y0KZZ
;
data _diag;
input _diag $;
cards;
F1020
;
run;
data want;
if _n_=1 then do;
 if 0 then set _proc _diag;
 declare hash p(dataset:'_proc');
 p.definekey('_proc');
 p.definedone();
 declare hash d(dataset:'_diag');
 d.definekey('_diag');
 d.definedone();
end;
set have;
array _p{*} proc:;
array _d{*} diag:;
valid_proc=0;
valid_diag =0;
do i=1 to dim(_p);
 if p.check(key:_p{i})=0 then valid_proc=1;
end;
do j=1 to dim(_d);
 if d.check(key:_d{j})=0 then valid_diag=1;
end;

drop _: i j;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 704 views
  • 2 likes
  • 3 in conversation