Hi! I have 10 diagnosis variables (DX1-DX10), and we are creating one new variable (CMDX). We want to scan the 10 DX variables and, as soon as SAS reads one of 5 specific DX codes (DOD0301, DOD0302, DOD0303, etc.), copy that DX code into the new variable (CMDX).
How might we accomplish this?
Thanks!
What if you have multiple matches, which you most likely will?
You can use a DO loop with an array, the WHICHC function, or a combination of the two.
data want;
    set have;
    array find_dx(5) $ _temporary_ ('E234.34', 'E234.24', 'E456.45', 'B345.23', 'I249.25');
    array dx(10) dx1-dx10;
    do i=1 to 5;
        if whichc(find_dx(i), of dx(*)) then do;
            ...INSERT YOUR SAS code here;
        end;
    end;
run;
@Reeza we are only interested in copying the very first match to the new variable.
Then look at changing the loop condition so that it ends once the value is populated. Either way, add your code inside the DO/END block and it should work fine.
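One way to stop at the first hit is to put the exit condition on the DO statement itself. The following is only a sketch, assuming DX1-DX10 are character variables and that an 8-character CMDX is wide enough; `have` and `want` are the placeholder data set names from the earlier reply. Note that it walks the code list in order, so "first match" here means the first listed code that appears anywhere in DX1-DX10, not necessarily the code in the lowest-numbered DX variable.

```sas
data want;
    set have;
    length cmdx $8;  /* assumed width; declared before MISSING() so CMDX is character */
    array find_dx(4) $8 _temporary_ ('DOD0301', 'DOD0302', 'DOD0303', 'Z0289');
    array dx(10) $ dx1-dx10;
    /* the WHILE clause ends the loop once CMDX has been filled */
    do i = 1 to dim(find_dx) while (missing(cmdx));
        if whichc(find_dx(i), of dx(*)) > 0 then cmdx = find_dx(i);
    end;
    drop i;
run;
```

If priority should instead follow the DX position, the loop would need to run over the DX variables rather than over the code list.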
@Reeza Thank you for your help! I am still not sure what to write in the "Insert your SAS code here" part. I need it to copy the identified DX value into the new variable "CMDX". Here is what I have so far...
data palms.FM01_DX;
    set palms.fm01_ela;
    array find_dx(4) $ _temporary_ ('DOD0301', 'DOD0302', 'DOD0303', 'Z0289');
    array dx(10) dx1-dx10;
    do i=1 to 4;
        if whichc(find_dx(i), of dx(*)) then do;
            ...INSERT YOUR SAS code here;
        end;
    end;
run;
data palms.FM01_DXCPT;
    set palms.FM01_dx;
    array find_cpt(5) $ _temporary_ ('G9002', 'G9005', 'G9009', 'G90010', 'G90011');
    array cpt(10) cpt_1-cpt_10;
    do i=1 to 5;
        if whichc(find_cpt(i), of cpt(*)) then do;
            ...INSERT YOUR SAS code here;
        end;
    end;
run;
Why not give it a try? I'm happy to help debug and problem solve.
@Reeza I used this code, and it partially works, but not quite. It is only populating CMDX when it reads the first DX, 'DOD0301'. All the observations should have the CMDX variable populated with one of the four DXs, because all observations contain at least one of these diagnoses... Any help debugging?
data palms.FM01_DX;
    set palms.fm01_ela;
    array find_dx(4) $ _temporary_ ('DOD0301', 'DOD0302', 'DOD0303', 'Z0289');
    array dx(10) dx1-dx10;
    do i=1 to 4;
        if whichc (find_dx(i), of dx(*)) then do;
            CMDX = dx(i);
        end;
    end;
run;
Try the following modification: it only fills in CMDX the first time a condition is met. As I mentioned, it's likely that you will have multiple matches. Beyond that, it looks correct. If it doesn't work, you would need to provide sample data and expected output. Also, judging from your previous post, you don't need two data steps for these processes. You can declare multiple arrays and loops per data step, so you should be able to merge your two data steps into one.
EDIT: Small modification to the do loop condition as well.
data palms.FM01_DX;
    set palms.fm01_ela;
    array find_dx(4) $ _temporary_ ('DOD0301', 'DOD0302', 'DOD0303', 'Z0289');
    array dx(10) dx1-dx10;
    do i=1 to 4;
        if whichc (find_dx(i), of dx(*))>0 then do;
            if missing(CMDX) then CMDX = dx(i);
        end;
    end;
run;
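Two points from this reply can be sketched together. First, in the loop above `i` indexes `find_dx`, so `dx(i)` reads the i-th diagnosis variable rather than the one that matched; the position returned by WHICHC avoids that. Second, both lookups fit in one DATA step. This is a hedged sketch, not tested against the poster's data: `CMCPT`, `pos`, and the `$8` lengths are assumptions introduced for illustration, and the DX and CPT variables are assumed to be character.

```sas
data palms.FM01_DXCPT;
    set palms.fm01_ela;
    length CMDX CMCPT $8;  /* CMCPT is a hypothetical name for the CPT result */
    array find_dx(4) $8 _temporary_ ('DOD0301', 'DOD0302', 'DOD0303', 'Z0289');
    array find_cpt(5) $8 _temporary_ ('G9002', 'G9005', 'G9009', 'G90010', 'G90011');
    array dx(10) $ dx1-dx10;
    array cpt(10) $ cpt_1-cpt_10;
    /* UNTIL is checked at the bottom of each pass, so the loop stops after the first hit */
    do i = 1 to dim(find_dx) until (pos > 0);
        pos = whichc(find_dx(i), of dx(*));   /* 0 when the code is absent */
        if pos > 0 then CMDX = dx(pos);
    end;
    do i = 1 to dim(find_cpt) until (pos > 0);
        pos = whichc(find_cpt(i), of cpt(*));
        if pos > 0 then CMCPT = cpt(pos);
    end;
    drop i pos;
run;
```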
@Reeza it seems to almost work, but all the values are now populated with "."
I checked variable format, and it is informat Best12. Is this the problem?
DX array should be declared with a $ sign because it holds character values.
array dx(10) $ dx1-dx10;
@Reeza I think I'm missing one more $ somewhere? I am still getting the "." and CMDX is still BEST12.
data palms.FM01_DX;
    set palms.fm01_ela;
    array find_dx(4) $ _temporary_ ('DOD0301', 'DOD0302', 'DOD0303', 'Z0289');
    array dx(10) $ dx1-dx10;
    do i=1 to 4;
        if whichc (find_dx(i), of dx(*))>0 then do;
            if missing(CMDX) then CMDX = dx(i);
        end;
    end;
run;
Do you already have a variable CMDX on the dataset before this step? Is it numeric or character?
You can try to explicitly declare CMDX as a character variable, but you shouldn't have to.
data palms.FM01_DX;
    set palms.fm01_ela;
    length CMDX $8.;
    format CMDX $8.;
    array find_dx(4) $ _temporary_ ('DOD0301', 'DOD0302', 'DOD0303', 'Z0289');
    array dx(10) $ dx1-dx10;
    do i=1 to 4;
        if whichc (find_dx(i), of dx(*))>0 then do;
            if missing(CMDX) then CMDX = dx(i);
        end;
    end;
run;
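A likely explanation for the "." values and the BEST12. format, offered as an assumption since the full log is not shown: SAS fixes a new variable's type at its first appearance in the step, and in the earlier versions CMDX first appears inside `missing(CMDX)`, where an undeclared variable defaults to numeric. The later character assignment then cannot be stored as text. Declaring the length before that first reference, as in the step above, avoids this. The toy step below (the variable names `a` and `b` are made up for illustration) contrasts the two cases:

```sas
data _null_;
    /* a is first seen inside MISSING(), so it is compiled as numeric;
       the character value assigned afterwards cannot be stored as text */
    if missing(a) then a = 'DOD0301';

    /* b is declared as character before its first use */
    length b $8;
    if missing(b) then b = 'DOD0301';

    put a= b=;  /* writes both values to the log for comparison */
run;
```

The log for a step like this should also carry a note about character-to-numeric conversion, which is a quick way to spot the problem.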
@Reeza , I just removed the whichc function and it seemed to work fine! Thank you for your help!
data palms.FM01_DX;
    set palms.fm01_ela;
    length CMDX $8.;
    format CMDX $8.;
    array dx(*) DX1-DX10;
    do i= 1 to dim(dx);
        if dx(i) in: ('DOD0301', 'DOD0302', 'DOD0303','Z0289') then do;
            if missing(CMDX) then CMDX = dx(i);
            Output;
            return;
        end;
    end;
run;
I re-ran this syntax on the same data, and everything seems to work fine . . . except that records that do not have the specific ICD 10 codes are deleted. So, before running the syntax, I received this output about the dataset:
NOTE: There were 5267 observations read from the data set PALMS.CMFM0103.
NOTE: The data set PALMS.DATE has 5267 observations and 33 variables.
Next I ran this syntax:
data palms.DX;
    set palms.date;
    length CMDX $8.;
    format CMDX $8.;
    array dx(*) DX1-DX10;
    do i= 1 to dim(dx);
        if dx(i) in: ('DOD0301', 'DOD0302', 'DOD0303','Z0289') then do;
            if missing(CMDX) then CMDX = dx(i);
            Output;
            return;
        end;
    end;
run;
And here's the log info:
NOTE: There were 5267 observations read from the data set PALMS.DATE.
NOTE: The data set PALMS.DX has 5179 observations and 35 variables.
So the data went from 5,267 records down to only 5,179 records.
What changes can I make to the code so that the value of the new variable CMDX = missing for those records that do not include any of the 4 specific ICD 10 codes listed in the syntax?
Any suggestions will be most appreciated!
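One possible fix, sketched under the assumption that every input record should be kept: an explicit OUTPUT statement turns off the automatic output at the end of the DATA step, so only records that reach `Output;` are written, which would account for the drop from 5,267 to 5,179. Replacing `Output; return;` with `leave;` exits the loop at the first match and lets the implicit output write every record. CMDX, being a new variable, is reset to missing at the start of each iteration and stays missing when no code matches. The `drop i;` line is an optional addition, not part of the original syntax.

```sas
data palms.DX;
    set palms.date;
    length CMDX $8;
    format CMDX $8.;
    array dx(*) DX1-DX10;
    do i = 1 to dim(dx);
        if dx(i) in: ('DOD0301', 'DOD0302', 'DOD0303', 'Z0289') then do;
            CMDX = dx(i);
            leave;  /* stop scanning at the first match; no explicit OUTPUT */
        end;
    end;
    drop i;  /* optional: keep the loop counter out of the result */
run;
```

With this change the log should report the same number of observations read and written, and records without any of the four codes keep a blank CMDX.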