jessho
Calcite | Level 5

I know how to label each diagnosis_code variable one at a time.  Is there a quick way to do this in one step for all my diagnosis code variables (diagnosis_code1, diagnosis_code2, etc.)?

 

In other words, if I wanted to modify the code below to capture all the PTSD diagnosis codes in any of the seven diagnosis code variables (diagnosis_code_1 - diagnosis_code_7), how would you amend the code?  I tried adding the other diagnosis code variables with a "-" range and with OR statements, but I keep getting errors. Thank you!

 

Proc format;
value $icd_pst
'F43.10', 'F43.11' = 'PTSD'
other='No PTSD';
run;

 

data test;
set WORK.testunder1;
If Put(diagnosis_code_1,$icd_pst.) = 'PTSD' then PTSD=1; run;

Jagadishkatam
Amethyst | Level 16

Could you try:

 

data test;
set WORK.testunder1;
If strip(Put(diagnosis_code_1,$icd_pst.)) = 'PTSD' then PTSD=1; 
run;
Thanks,
Jag
Astounding
PROC Star

Use arrays to process many variables in the same way:

 

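data test;
set WORK.testunder1;
/* the question lists seven variables: diagnosis_code_1 - diagnosis_code_7 */
array diags {7} diagnosis_code_1 - diagnosis_code_7;
/* stop looping as soon as one PTSD code is found */
do k = 1 to dim(diags) until (ptsd=1);
   If Put(diags{k}, $icd_pst.) = 'PTSD' then PTSD=1;
end;
drop k;
run;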
ballardw
Super User

If you want a variable that indicates whether at least one of a group of variables has a given value, here is one way:

 

Proc format library=work;
value $icd_pst
'F43.10', 'F43.11' = 'PTSD'
other='No PTSD';
run;

data example;
   infile datalines truncover;
   informat d1 - d7 $8.;
   input d1 -d7;
   array dx d1-d7;
   array temp{7} $ 10 _temporary_ ;
   call missing(of temp(*));
   do i= 1 to dim(dx);
      temp[i]= put(dx[i],$icd_pst.);
   end;
   PSTD = ( whichc('PTSD', of temp[*])>0 );
   drop i;
datalines;
F43.2 F42.1 F15.4
F43.2 F42.1 F15.4 F43.10 F41.4
F43.2
F43.11
;
run;

The array temp is a _temporary_ array with seven elements that hold the formatted values of the diagnosis codes. Because it is temporary, no TEMP1 to TEMP7 variables are written to the output data set. (I'm too lazy to type a variable name as long as diagnosis_code in an example, so I just used D1 to D7.)

CALL MISSING resets every element of the array to blank. This matters because _temporary_ arrays retain their values from one record to the next. The DO loop then populates the array with the formatted values.

The function WHICHC searches for the value of the first parameter, in this case the literal 'PTSD', in the arguments that follow. The "of temp[*]" indicates that all elements of the array temp are to be searched. The function returns the position of the first match (often useful in itself), or 0 if there is none. Here the result is only compared to 0: the comparison sets the flag (PSTD in the code above) to 1 when at least one match is found and to 0 otherwise.
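
As a small illustration of the WHICHC behavior described above, here is a minimal sketch. The variable names a, b, c, first_match, no_match and flag are made up for this example and are not part of the original code.

```sas
data _null_;
   length a b c $ 10;
   a = 'No PTSD';
   b = 'PTSD';
   c = 'PTSD';
   /* position of the first argument after the search value that matches */
   first_match = whichc('PTSD', a, b, c);
   /* WHICHC returns 0 when none of the arguments match */
   no_match    = whichc('Other', a, b, c);
   /* a comparison evaluates to 1 (true) or 0 (false) */
   flag        = (first_match > 0);
   /* write the three values to the log */
   put first_match= no_match= flag=;
run;
```

Running this writes the three values to the log, which is a quick way to check what WHICHC returns before using it on real data.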

 

To search for the other codes from your other post, you would repeat this block of code

   call missing(of temp(*));
   do i= 1 to dim(dx);
      temp[i]= put(dx[i],$icd_pst.);
   end;
   PSTD = ( whichc('PTSD', of temp[*])>0 );
   drop i;

for each of the other codes, replacing 1) the format $icd_pst with the other format, 2) the variable PSTD with the other flag variable, and 3) the value 'PTSD' with the other formatted value.

 

If you have a largish number of these codes to search for you could build separate arrays to handle the 1,2, and 3 elements above and wrap the repeated code in a do loop that uses the size of the arrays holding those three things and replace them with array references.

An additional change would be to replace put(dx[i],$icd_pst.) with PUTC(dx[i],formatarray[j]). PUTC allows a variable to hold the name of the format, whereas a simple PUT requires the literal name of the format.

You could use temporary arrays to hold the format names and the search strings, but the array of flag variables could not be temporary, because those variables have to be written to the output data set.  This is left as an exercise for the interested reader.
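
For readers who want a starting point, the generalization described above might look roughly like the sketch below. It is only an outline under assumptions: the second format $icd_anx, its codes and label, and the flag variable anx are invented for illustration, and the input data set and variable names are taken from the question. The $icd_pst format from earlier in the thread is assumed to exist already.

```sas
/* Hypothetical second format, invented for this sketch */
proc format library=work;
   value $icd_anx
      'F41.1', 'F41.9' = 'ANX'
      other            = 'No ANX';
run;

data flags;
   set work.testunder1;
   array dx   {*} diagnosis_code_1 - diagnosis_code_7;
   /* 1) format names and 3) search strings can live in temporary arrays */
   array fmts {2} $ 32 _temporary_ ('$icd_pst' '$icd_anx');
   array vals {2} $ 10 _temporary_ ('PTSD' 'ANX');
   /* 2) the flags are real variables so they reach the output data set */
   array flag {2} ptsd anx;
   do j = 1 to dim(flag);
      flag[j] = 0;
      /* stop scanning the codes once this flag has been set */
      do i = 1 to dim(dx) until (flag[j] = 1);
         if putc(dx[i], fmts[j]) = vals[j] then flag[j] = 1;
      end;
   end;
   drop i j;
run;
```

Adding another diagnosis group then means adding one format, one entry in each temporary array, and one flag variable, rather than copying the whole block again.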

jessho
Calcite | Level 5

Awesome -- thank you.

 

When I run the code above, I get this output:

 

diagnosis_code1 diagnosis_code2 diagnosis_code3 diagnosis_code4 diagnosis_code5 diagnosis_code6 diagnosis_code7 PSTD
F43.10 F43.12 F43.13 F43.9 F43.0 F43.8 1

 

However, I cannot seem to count the number of times the diagnosis of PTSD was made during each visit (PTSD response for variable diagnosis_code1, diagnosis_code2, etc) for each patient in the dataset from this.  Proc freq does not recognize PTSD?  How would I do that?

 

Also, just to get clarity on exactly what $icd_pst pulls up so that I understand the logic, could you explain further?

 

Thank you!

ballardw
Super User

If you are using this code, it only sets the value to 1, so there is no "count":

data test;
set WORK.testunder1;
/* seven diagnosis variables, as in the question */
array diags {7} diagnosis_code_1 - diagnosis_code_7;
do k = 1 to dim(diags) until (ptsd=1);
   If Put(diags{k}, $icd_pst.) = 'PTSD' then PTSD=1;
end;
drop k;
run;

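Likely not the best approach, but this counts the matches:

data test;
set WORK.testunder1;
array diags {7} diagnosis_code_1 - diagnosis_code_7;
/* no UNTIL clause here: every variable must be checked to get a count */
do k = 1 to dim(diags);
   /* the comparison is 1 or 0; SUM ignores the initial missing value */
   Ptsd= sum(ptsd,Put(diags{k}, $icd_pst.) = 'PTSD');
end;
drop k;
run;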

Personally I would transpose the data to a long format with a diagnosis id and separate value and use a single format to count all the diagnosis code groups at one time similar to:

proc format library=work;
value $example
'A','B'='Group1'
'C','D','E' = 'Group 2'
other = 'Everything else'
;
run;

data example;
  input pid vid did dv $;
   label pid='Patient'
         vid='Visit number'
         dv= 'Diagnosis group'
   ;
datalines;
1  1  1  A
1  1  2  B
1  1  3  F
1  2  1  F
1  2  2  F
1  2  3  C
1  2  4  B
2  1  1  A
2  1  2  B
2  2  3  F
2  2  1  F
2  3  2  F
2  3  3  C
2  3  4  B
;
run;



proc tabulate data=example;
   class pid vid dv;
   format dv $example.;
   table pid*vid*dv,
         n='Count'
   ;
run;
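
To connect the long-format example above back to the original wide data, a transpose step could be sketched as follows. This is an assumption-laden outline, not tested against the real data: patient_id and visit_id are hypothetical names for whatever identifiers the data set actually contains, and the length of 8 for dv is a guess at the width of the diagnosis codes.

```sas
data long;
   set work.testunder1;
   array diags {*} diagnosis_code_1 - diagnosis_code_7;
   length dv $ 8;                      /* assumed width of a diagnosis code */
   do did = 1 to dim(diags);
      /* write one row per non-missing diagnosis code */
      if not missing(diags[did]) then do;
         dv = diags[did];
         output;
      end;
   end;
   keep patient_id visit_id did dv;    /* hypothetical identifier names */
run;

proc tabulate data=long;
   class patient_id visit_id dv;
   format dv $icd_pst.;                /* groups codes into PTSD / No PTSD */
   table patient_id*visit_id*dv,
         n='Count'
   ;
run;
```

With the data in this shape, the format does the grouping and the procedure does the counting, so no flag variables are needed.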
