Hi SAS friends,
Am trying to use the index function to locate a string variable "MrgMT"
in a longer string variable "Merge_term "
, but it is not working as expected.
Have checked the source datasets for hidden characters and "purged" them with a compress function (used the "KAS" option to keep only alphanumeric characters and some symbols), and viewed them in note pad looking for bad characters, but still no luck. I've uploaded a sample SAS file with these thee variables.
Here is the code
data lcl.Index_Prob ; set lcl.&FFN._TMP ;
Where MrgMT in ("ALBUTEROL", "ACAMPROSATE")
and
Merge_term in ("4MG OF ALBUTEROL", "ALBUTEROL", "ACAMPROSATE", "ACAMPROSATE (CAMPRAL)");
Merge_No = index(Merge_term, MrgMT) ;
keep MrgMT Merge_term Merge_No ;
run ;
Here is an example of the output, with comments:
Merge_term | MrgMT | Merge_No | Should be |
4MG OF ALBUTEROL | ALBUTEROL | 0 | 8 |
ACAMPROSATE | ACAMPROSATE | 1 | OK ! |
ACAMPROSATE | ALBUTEROL | 0 | OK ! |
ACAMPROSATE (CAMPRAL) | ACAMPROSATE | 0 | 1 |
ALBUTEROL | ALBUTEROL | 1 | OK! |
Any suggestions on why INDEX function is not working?
Thank you !
Nothing wrong with the INDEX function. Be sure to use the TRIM function as in:
Merge_No = index(Merge_term, trim(MrgMT)) ;
Here's an illustration:
30 data have;
31 text='aaa bbb ccc';
32 length longsrch $4;
33 length shrtsrch $3;
34 longsrch='ccc';
35 shrtsrch='ccc';
36 ixlong=index(text,longsrch);
37 ixshrt=index(text,shrtsrch);
38 ixtrim=index(text,trim(longsrch));
39 put (_all_) (=);
40 run;
text=aaa bbb ccc longsrch=ccc shrtsrch=ccc ixlong=0 ixshrt=9 ixtrim=9
NOTE: The data set WORK.HAVE has 1 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
Nothing wrong with the INDEX function. Be sure to use the TRIM function as in:
Merge_No = index(Merge_term, trim(MrgMT)) ;
Here's an illustration:
30 data have;
31 text='aaa bbb ccc';
32 length longsrch $4;
33 length shrtsrch $3;
34 longsrch='ccc';
35 shrtsrch='ccc';
36 ixlong=index(text,longsrch);
37 ixshrt=index(text,shrtsrch);
38 ixtrim=index(text,trim(longsrch));
39 put (_all_) (=);
40 run;
text=aaa bbb ccc longsrch=ccc shrtsrch=ccc ixlong=0 ixshrt=9 ixtrim=9
NOTE: The data set WORK.HAVE has 1 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
Thank you very much , and especially appreciate the explanation !
Hello.
The documentation say:
INDEX(source, excerpt)
Both leading and trailing spaces are considered part of the excerpt argument. To remove trailing spaces, include the TRIM function with the excerpt variable inside the INDEX function.
Maybe you can try:
Merge_No = index(Merge_term, trim(MrgMT));
Thank you AVO, much appreciated !
SAS uses fixed length character variables. So short values are padded with spaces to fill out the variable to its complete length.
Normal equality comparisons (like in your WHERE statement) will ignore the trailing spaces. But the INDEX function does not. If you ask it to look for 'FRED ' in will not match strings that don't have all four spaces following the D.
Use TRIM() to remove the trailing spaces from the string you are asking INDEX to look for.
data lcl.Index_Prob;
set lcl.&FFN._TMP;
where MrgMT in ("ALBUTEROL", "ACAMPROSATE")
and Merge_term in ("4MG OF ALBUTEROL", "ALBUTEROL", "ACAMPROSATE", "ACAMPROSATE (CAMPRAL)")
;
Merge_No = index(Merge_term, trim(MrgMT));
keep MrgMT Merge_term Merge_No ;
run ;
Thank you Tom, and the explanation is very helpful, much appreciated !
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.