epidemiologystu
Calcite | Level 5

Hello,

 

This piece of code is not correctly assigning all breast cancer cases. Could someone identify the error? 

 

*create sample dataset;

 

data sample ;

  input curr_topog_desc $30. ;

  cards ;

Rectum NOS

Overlapping lesion of breast

Head of pancreas

Blood

Breast NOS

" " *meant to be blank;

;

run; 

 

Data sample;

Set sample;

IF index(curr_topog_desc, "breast") ne 0 then do;

breastcancer = 1;

othercancer = 0;

end;

 

*if blank space, the person was not linked to cancer registry;

IF curr_topog_desc = " " THEN do;

breastcancer = 0;

othercancer = 0;

end;

 

*meant to capture all other cancers;

ELSE do;

breastcancer = 0;

othercancer = 1;

end;

 

run;

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @epidemiologystu,

 

There are two issues:

  1. The INDEX function is case-sensitive, so "breast" isn't found in "Breast NOS". Use the FIND function with the 'i' modifier instead:
    find(curr_topog_desc, "breast", 'i')
    or apply the LOWCASE function to the first argument:
    index(lowcase(curr_topog_desc), "breast")
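  2. Without an ELSE before the second IF, the two IF statements are evaluated independently. For a breast record, curr_topog_desc is not blank, so the final ELSE DO/END block runs as well and overwrites the values assigned in the first THEN branch (breastcancer is reset to 0 and othercancer to 1).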
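
Putting both fixes together, the second DATA step could look roughly like the sketch below. It is untested against your real data; the output data set name sample_flagged is only an illustrative choice (so that the input is not overwritten), and everything else is taken from your code.

```sas
data sample_flagged;   /* hypothetical name, chosen for illustration */
  set sample;

  /* case-insensitive search, so "Breast NOS" is matched too */
  if find(curr_topog_desc, "breast", 'i') > 0 then do;
    breastcancer = 1;
    othercancer  = 0;
  end;

  /* blank description: the person was not linked to the cancer registry */
  else if curr_topog_desc = " " then do;
    breastcancer = 0;
    othercancer  = 0;
  end;

  /* all remaining non-blank sites are other cancers */
  else do;
    breastcancer = 0;
    othercancer  = 1;
  end;
run;
```

Because the three branches now form a single IF / ELSE IF / ELSE chain, exactly one of them runs per observation, so a later branch can no longer undo an earlier assignment.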

 


@epidemiologystu wrote:

data sample ;

  input curr_topog_desc $30. ;

  cards ;

Rectum NOS

Overlapping lesion of breast

Head of pancreas

Blood

Breast NOS

" " *meant to be blank;


Use a single period (.) to enter a missing value in data lines, not a blank in double quotes.
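
For example, the sample data set could be written as in this sketch, assuming the same variable and informat as in your post. The last data line contains only a period, which the $30. informat should store as a blank (missing) value.

```sas
data sample;
  input curr_topog_desc $30.;
  cards;
Rectum NOS
Overlapping lesion of breast
Head of pancreas
Blood
Breast NOS
.
;
run;
```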
