BookmarkSubscribeRSS Feed
jessho
Calcite | Level 5

Hi,

I am trying to label (as PTSD) and then look at how often the ICD codes 'F43.10' and 'F43.11' are used for seven diagnosis variables (diagnosis_code_1 - diagnosis_code_7).  I was able to do this with array statement for all the other codes except these two -- as now when I run proc freq for my newly created PTSD variable, it comes up as empty (all missing) even though I know that the above codes exist in the data.  Can you please help to troubleshoot?  I cannot figure out why it is working for all the other variables except these?  I have bolded the array that does not seem to be working correctly judging by the 0 frequency when I run proc freq of PTSD

 


data WORK.testunder1;
set work.testunder1;

 

ARRAY stress_array(7) $ diagnosis_code_1-diagnosis_code_7;
DO i= 1 to 7;
IF substr(stress_array(i),1,5) in ('F4310', 'F4312') then PTSD=1;
end;

 

ARRAY anxiety_array(7) $ diagnosis_code_1-diagnosis_code_7;
DO i= 1 to 7;
IF substr(anxiety_array(i),1,3) in ('F41', 'F40') or substr(anxiety_array(i),1,5) in ('F4322') or substr(anxiety_array(i),1,4) in ('F064') then anxiety=1;
end;

 

ARRAY depression_array(7) $ diagnosis_code_1-diagnosis_code_7;
DO i= 1 to 7;
IF substr(depression_array(i),1,3) in ('F32', 'F33') or substr(depression_array(i),1,5) in ('F4321', 'F4323') or substr(depression_array(i),1,4)
in ('F341')
then depression=1;
end;

 

ARRAY bipolar_array(7) $ diagnosis_code_1-diagnosis_code_7;
DO i= 1 to 7;
IF substr(bipolar_array(i),1,3) in ('F31') then bipolar=1;
end;

run;

 

Thank you!

6 REPLIES 6
PaigeMiller
Diamond | Level 26

You didn't provide any sample data, which would help a lot. Nevertheless, your paragraph of explanation does not match your SAS code.

 

In your paragraph of explanation, you are looking for code that are text strings that contain the dot character (as well as other characters). In your SAS code, the dot is missing.

--
Paige Miller
Astounding
PROC Star

Also note, by using the colon operator, you can simplify the code and make it more readable.  Here's one section that you have now:

 

ARRAY depression_array(7) $ diagnosis_code_1-diagnosis_code_7;
DO i= 1 to 7;
IF substr(depression_array(i),1,3) in ('F32', 'F33') or substr(depression_array(i),1,5) in ('F4321', 'F4323') or substr(depression_array(i),1,4)
in ('F341')
then depression=1;
end;

Compare that with:

ARRAY depression_array(7) $ diagnosis_code_1-diagnosis_code_7;
DO i= 1 to 7;
IF depression_array(i) in : ('F32', 'F33', 'F4321', 'F4323', 'F341')
then depression=1;
end;

 

Note that even this set of comparisons may be failing to find some of the observations.  The F4323 and F4321 may need to become F43.23 and F43.21 instead.

ballardw
Super User

I strongly suggest not removing any dots from ICD coding. The dots tell you the difference between (hypothetical) F4.310 and F43.10.

I don't know if the F series have the single digit codes but they do exist in other places and removing the dot can seriously effect.

 

Another way to look for specific values out of a large number is to use a custom format that assigns a key word like "Valid" "Match" or similar for the ones of interest and "Invalid" "No Match" for the remainder. This is very easy with the OTHER keyword proc format:

 

Proc format ;

value $icd_pst

'F43.10', 'F43.11' = 'PTSD'

other = 'No PTSD'

;

run;

 

Then in your code you test using

 

if put(variable,&icd_pst.) = 'Match' then <do what you want when found>;

 

This can be extended to search for large numbers of blocks of values if the data is nice, meaning ranges are clearly defined. Since the parts after the . in ICD coding are all digits you can use ranges such as

'F40.0000' - 'F40.9999' = 'Anxiety'

'F41.0000' - 'F41.9999' = 'Anxiety'

 

There reasons involving character comparison rules not to attempt using 'F40.0000' - 'F41.9999' because 'F403.0' is likely to fall in the range.

jessho
Calcite | Level 5

Thanks for your help!

 

I am trying out what you wrote below for when diagnosis_code_1 = F43.10 or F43.11, but still getting some errors, which I presume is do to my poor syntax in the second block(?).  Also, how would I alter this code to capture any time the ICD10 codes F43.10 or F43.11 appear for any of the diagnostic variables (diagnosis_code_1-diagnosis_code_7)?

 

Proc format;
value $icd_pst
'F43.10', 'F43.11' = 'PTSD'
other='No PTSD';
run;

 

data test;
set WORK.testunder1;
If put(diagnosis_code_1,&icd_pst.) = 'Match' then PTSD=1;
run;

 

Thank you so much!

ballardw
Super User

@jessho wrote:

Thanks for your help!

 

I am trying out what you wrote below for when diagnosis_code_1 = F43.10 or F43.11, but still getting some errors, which I presume is do to my poor syntax in the second block(?).  Also, how would I alter this code to capture any time the ICD10 codes F43.10 or F43.11 appear for any of the diagnostic variables (diagnosis_code_1-diagnosis_code_7)?

 

Proc format;
value $icd_pst
'F43.10', 'F43.11' = 'PTSD'
other='No PTSD';
run;

 

data test;
set WORK.testunder1;
If put(diagnosis_code_1,&icd_pst.) = 'PTSD' then PTSD=1;
run;

 

Thank you so much!


Wrong character, $ for formats, & for macro variable.

And make sure the value compared is the value used in  the Format.

You can even modify the code to be

 

PTSD = (Put(diagnosis_code_1,$icd_pst.) = 'PTSD');

Which would assign 0 when diagnosis_code_1 is anything except the two values, including blank or missing. SAS logic comparison will return numeric 1 for 'True' and 0 for 'False'.

jessho
Calcite | Level 5

Thank you!  That fixed it for when I am labeling the diagnosis_code_1.  If I wanted to modify the code to capture all the PTSD codes listed in all seven diagnosis codes (diagnosis_code_1 - diagnosis_code_7), how would you amend the code?  I tried to - versus or statements with the other diagnosis code variables but continue to get errors. Thank you!

 

Proc format;
value $icd_pst
'F43.10', 'F43.11' = 'PTSD'
other='No PTSD';
run;

 

data test;
set WORK.testunder1;
If Put(diagnosis_code_1,$icd_pst.) = 'PTSD' then PTSD=1; run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1278 views
  • 0 likes
  • 4 in conversation