LuisMijares
Calcite | Level 5

Hello, I have a quick question. If I have an array defined like array_1 below, and a second array constructed from 25 variables, can I use the IN operator to search for elements in the manner below? Thank you.

 

if (DGNS[i]) in array_1 then indicator_variable =1 

 

array_1 = ("abc", "cfr", "ser"); 

DGNS[25] DGNS_1_CD-DGNS_25_CD; 

ballardw
Super User

Not quite that way. SAS has set up the IN operator to want values not variables.

 

However, there are two functions, WHICHN and WHICHC (N for numeric and C for character values), that will search for one value in a list of values, and that list can be an array.

 

This is not an array: array_1 = ("abc", "cfr", "ser"); 

That is a syntax error.

To assign values to an array like that would look like:

array array_1 (3) $ 3 ('abc', 'cfr','ser');

This says: make an array with 3 elements, whose values are character with a length of 3, and then the list in parentheses initializes the values in the array. If you don't need the variables that will be created (array_11, array_12 and array_13) after the data step ends, you can make the array _temporary_.

 

Then to search for an element of the array DGNS the syntax would be

if Whichc(DGNS[i],of array_1(*)) > 0 then <do whatever when found>;

The Which functions return the position of a value found in a list of values, the array_1 in this case, so a value greater than 0 indicates it is found,  otherwise the function returns 0.

 

Note that these are comparisons of equality, so if the value of DGNS[1] is 'abcd' it will not match 'abc'.
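
To make the WHICHC approach concrete, here is a minimal sketch of the loop over the diagnosis array. The three DGNS variables and their sample values are made up for illustration (the real data has 25), and the lookup array is given length $8 here, rather than $3, only so that both arrays share the same length.

```sas
data _null_;
  /* Lookup list. $8 is an assumption so both arrays have equal length. */
  array array_1 [3] $ 8 ('abc', 'cfr', 'ser');

  /* Hypothetical stand-in for the 25 diagnosis variables, with sample values. */
  array DGNS [3] $ 8 DGNS_1_CD DGNS_2_CD DGNS_3_CD ('xyz', 'abcd', 'ser');

  indicator_variable = 0;
  do i = 1 to dim(DGNS);
    /* WHICHC returns the position of the match in array_1, or 0 if none. */
    if whichc(DGNS[i], of array_1(*)) > 0 then indicator_variable = 1;
  end;

  /* 'abcd' does not match 'abc', but 'ser' matches, so this prints 1. */
  put indicator_variable=;
run;
```

Setting the flag to 0 before the loop means it stays 0 when no element matches.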

LuisMijares
Calcite | Level 5

Hello, one more question. To elaborate, I have this problem: I want to find instances of ICD-10 medical codes in the variables DGNS_1_CD-DGNS_25_CD.

 

I made the following arrays (not exhaustive). I want to accomplish three things: 1. create an indicator variable for each disease (a depression_indicator, a nonalzhimers_indicator, etc.); 2. make sure the codes for the other diseases are assigned 0, so in the depression_indicator the codes for non-Alzheimer's, Alzheimer's and pneumonia will be 0; 3. assign -1 to codes that I don't know, since there might be codes for diseases that are not listed in the arrays. I was trying to use the following code. Any ideas would be helpful.

 

array DGNS $ DGNS_1_CD -- DGNS_25_CD;
DEPRESSION_MEDPAR = .;
NONALZH_DEMEN_MEDPAR = .;
ALZH_MEDPAR = .;
PNEUMO_MEDPAR = .;
hf_medpar = .;

array depression_codes[50] $8 _temporary_ 

array nonalzhimers_codes[84] $8 _temporary_ 

array alzhimers_codes[4] $8 _temporary_ 

array pneumonia_codes[93] $8 _temporary_( 

do i = 1 to dim(DGNS);
if strip(DGNS[i]) in depression_codes then DEPRESSION_MEDPAR=1;
* this would find the other codes and set equal to 0;

if (strip(DGNS[i]) in nonalzhimers_codes or strip(DGNS[i]) in pneumonia_codes or strip(DGNS[i]) in hf_codes
or strip(DGNS[i]) in prknsn_codes or strip(DGNS[i]) in stroke_codes
or strip(DGNS[i]) in stroke_exclusion_codes or strip(DGNS[i]) in anxiety_codes
or strip(DGNS[i]) in bipolar_codes or strip(DGNS[i]) in TBI_codes
or strip(DGNS[i]) in DRUG_USE_CODES or strip(DGNS[i]) in PSYCH_CODES
or strip(DGNS[i]) in OUD_CODES) then DEPRESSION_MEDPAR = 0;*/



Tom
Super User

Your code cannot work as is because you did not code the ARRAY statements properly, but I don't think that is your actual question.

 

Can you explain (an example will help) what you want to do with your new indicator variables?

I suspect what you want is to set say DEPRESSION_MEDPAR  to 1 when there is an ICD code in the DGNS array that indicates depression.

 

So first set them all to FALSE and then loop over your DX array setting them to TRUE when the code matches.

So let's assume your codes exist in variables named DX1 to DX50. Here is an example of how to create DEPRESSION and STROKE binary indicator variables.

data want;
* Read in data ;
  set have;
* Create array pointing to variables with patients ICD codes ;
  array dx dx1-dx50;
* Create temporary arrays with the list of codes for each condition ;
  array _depression[2] $8 _temporary_ ('code1' 'code2' );
  array _stroke[3] $8 _temporary_ ('code3' 'code4' 'code5');
* Set condition flags to false;
  depression=0;
  stroke=0;
* Loop over all DX codes. Set flags true when current dx code is in the list for that condition ;
  do i=1 to dim(dx);
    if not missing(dx[i]) then do;
      if dx[i] in _depression then depression=1;
      if dx[i] in _stroke then stroke=1;
    end;
  end;
  drop i;
run;

 

 

 

 

Your ARRAY definitions should look like this:

array depression_codes[50] $8 _temporary_ ('code1' 'code2' .... );

Where you have at most 50 quoted strings inside of the ( )'s.

 

LuisMijares
Calcite | Level 5

Yes, for example the indicator variable Depression_indicator should equal 1 when any of the codes in DGNS_1_CD-DGNS_25_CD are equal to the codes in the depression array. I also want it to equal 0 when there are codes for the other illnesses, so if there are codes for anxiety, Alzheimer's, or pneumonia then Depression_indicator should be equal to 0. If there are codes that I don't know about, that is, codes not in the Alzheimer's, pneumonia, or depression arrays, then they should be -1.

Tom
Super User

@LuisMijares wrote:

Yes, for example the indicator variable Depression_indicator should equal 1 when any of the codes in DGNS_1_CD-DGNS_25_CD are equal to the codes in the depression array. I also want it to equal 0 when there are codes for the other illnesses, so if there are codes for anxiety, Alzheimer's, or pneumonia then Depression_indicator should be equal to 0. If there are codes that I don't know about, that is, codes not in the Alzheimer's, pneumonia, or depression arrays, then they should be -1.


That still does not make any sense. Why would having ANXIETY mean they do not also have DEPRESSION, for example? The comorbidity rates of those two conditions should actually be relatively high.

 

Provide a few example subjects and explain what you want to happen.  Just use a few observations and few collected DX codes and a few codes for each condition.

LuisMijares
Calcite | Level 5

Well, there would be two variables, a DEPRESSION_indicator and an Anxiety_indicator. The Depression_indicator will be 1 if there is a depression code in the DGNS_1_CD-DGNS_25_CD variables, and the Anxiety_indicator will be 1 if there are anxiety codes. A patient with anxiety and depression would just have the variables Depression_indicator = 1 and Anxiety_indicator = -1.

 

I hope this makes sense 

 

Tom
Super User

@LuisMijares wrote:

Well, there would be two variables, a DEPRESSION_indicator and an Anxiety_indicator. The Depression_indicator will be 1 if there is a depression code in the DGNS_1_CD-DGNS_25_CD variables, and the Anxiety_indicator will be 1 if there are anxiety codes. A patient with anxiety and depression would just have the variables Depression_indicator = 1 and Anxiety_indicator = -1.

 

I hope this makes sense 

 


So it does not make any medical sense, but if you want to implement that then do it AFTER you have created the flags.

if depression=1 and anxiety=1 then anxiety=-1;
LuisMijares
Calcite | Level 5

The depression and anxiety codes are mutually exclusive.

array depression_codes[50] $8 _temporary_ ('F0631', 'F0632', 'F310', 'F3110', 'F3111', 'F3112', 'F3113',) 

array anxiety_codes[49] $8 _temporary_ ('F064', 'F4000', 'F4001', 'F4002',) 

 

If the code "F0631" is in the variables DGNS_1_CD-DGNS_25_CD then Depression_indicator should be 1; if the code is "F064" then Depression_indicator should be 0.

Tom
Super User

@LuisMijares wrote:

The depression and anxiety codes are mutually exclusive.

array depression_codes[50] $8 _temporary_ ('F0631', 'F0632', 'F310', 'F3110', 'F3111', 'F3112', 'F3113',) 

array anxiety_codes[49] $8 _temporary_ ('F064', 'F4000', 'F4001', 'F4002',) 

 

If the code "F0631" is in the variables DGNS_1_CD-DGNS_25_CD then Depression_indicator should be 1; if the code is "F064" then Depression_indicator should be 0.


Of course the groups of codes that indicate a condition are different.

But you indicated that each observation had up to 50 different DX codes. So some of them could indicate depression and others indicate anxiety. Example:

data have;
   length id dx1-dx5 $8 ;
   input id dx1-dx5 ;
cards;
1 F0631 F064 . . .
;

How do you want your new variables set in that case?

 

 

 

LuisMijares
Calcite | Level 5
ID  dx1    dx2    Depression_indicator  Anxiety_indicator
1   F0631  F064   1                     0
2   F0631  F064   1                     0
3   F064   F067   0                     -1
4   f064   F0631  1                     0

 

This is what I had in mind. For example, the code "F0671" is not a known code, so the anxiety indicator should be -1.

Tom
Super User

Still makes no sense.  

If the goal is to treat having depression as more important than having anxiety, then make those decisions after you have checked all of the DX codes and generated your depression and anxiety indicator variables.
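
A rough sketch of that two-stage approach, under some assumptions: the HAVE data set and its dx1-dx5 variables are invented sample data, the code lists are shortened to two entries each, and the UNKNOWN flag is a hypothetical addition for codes that appear in neither list. The final IF is the precedence rule from earlier in the thread and can be swapped for whatever rule is actually intended.

```sas
/* Sample data for illustration only */
data have;
  length id dx1-dx5 $8;
  input id dx1-dx5;
cards;
1 F0631 F064 . . .
2 F064 F067 . . .
;

data want;
  set have;
  array dx dx1-dx5;
  /* Shortened code lists, taken from the codes quoted in this thread */
  array _depression[2] $8 _temporary_ ('F0631' 'F0632');
  array _anxiety[2]    $8 _temporary_ ('F064' 'F4000');

  /* Stage 1: set all flags to false, then scan every DX code */
  depression = 0;
  anxiety    = 0;
  unknown    = 0;   /* hypothetical flag: a code in neither list was seen */
  do i = 1 to dim(dx);
    if not missing(dx[i]) then do;
      if dx[i] in _depression then depression = 1;
      else if dx[i] in _anxiety then anxiety = 1;
      else unknown = 1;
    end;
  end;

  /* Stage 2: apply any precedence rule only after all codes are checked */
  if depression = 1 and anxiety = 1 then anxiety = -1;
  drop i;
run;
```

Keeping the scan and the precedence rule separate means the result does not depend on the order in which codes appear in dx1-dx5.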

LuisMijares
Calcite | Level 5
ID  dx1    dx2    Depression_indicator  Anxiety_indicator
1   F0631  F064   1                     0
2   F0631  F064   1                     0
3   F064   F067   0                     -1
4   f064   F0631  0                     1

 

I just want an indicator variable. This is just an example: code "F0631" indicates depression and code "F064" indicates anxiety. I want to set each indicator variable to 1 if the code belongs to its respective illness (1 if depression, for example), 0 if it is another known code, and -1 if the code is unknown.

 

 

LuisMijares
Calcite | Level 5

This is what I was thinking of doing. I was just wondering if it was possible to use the IN operator on an array called depression_codes.

data_null__
Jade | Level 19

There is an IN ARRAY syntax.

 

data _null_;
   array array_1 [3] $ 3 ('abc', 'cfr','ser');
   diag = 'cfr';
   if diag in array_1 then do;
      found = 1;
      put _all_;
      end;
   run;

array_11=abc array_12=cfr array_13=ser diag=cfr found=1 _ERROR_=0 _N_=1
