- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi SAS friends,
Am trying to use the index function to locate a string variable "MrgMT"
in a longer string variable "Merge_term "
, but it is not working as expected.
Have checked the source datasets for hidden characters and "purged" them with a compress function (used the "KAS" option to keep only alphanumeric characters and some symbols), and viewed them in note pad looking for bad characters, but still no luck. I've uploaded a sample SAS file with these thee variables.
Here is the code
data lcl.Index_Prob ; set lcl.&FFN._TMP ;
Where MrgMT in ("ALBUTEROL", "ACAMPROSATE")
and
Merge_term in ("4MG OF ALBUTEROL", "ALBUTEROL", "ACAMPROSATE", "ACAMPROSATE (CAMPRAL)");
Merge_No = index(Merge_term, MrgMT) ;
keep MrgMT Merge_term Merge_No ;
run ;
Here is an example of the output, with comments:
Merge_term | MrgMT | Merge_No | Should be |
4MG OF ALBUTEROL | ALBUTEROL | 0 | 8 |
ACAMPROSATE | ACAMPROSATE | 1 | OK ! |
ACAMPROSATE | ALBUTEROL | 0 | OK ! |
ACAMPROSATE (CAMPRAL) | ACAMPROSATE | 0 | 1 |
ALBUTEROL | ALBUTEROL | 1 | OK! |
Any suggestions on why INDEX function is not working?
Thank you !
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Nothing wrong with the INDEX function. Be sure to use the TRIM function as in:
Merge_No = index(Merge_term, trim(MrgMT)) ;
Here's an illustration:
30 data have;
31 text='aaa bbb ccc';
32 length longsrch $4;
33 length shrtsrch $3;
34 longsrch='ccc';
35 shrtsrch='ccc';
36 ixlong=index(text,longsrch);
37 ixshrt=index(text,shrtsrch);
38 ixtrim=index(text,trim(longsrch));
39 put (_all_) (=);
40 run;
text=aaa bbb ccc longsrch=ccc shrtsrch=ccc ixlong=0 ixshrt=9 ixtrim=9
NOTE: The data set WORK.HAVE has 1 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Nothing wrong with the INDEX function. Be sure to use the TRIM function as in:
Merge_No = index(Merge_term, trim(MrgMT)) ;
Here's an illustration:
30 data have;
31 text='aaa bbb ccc';
32 length longsrch $4;
33 length shrtsrch $3;
34 longsrch='ccc';
35 shrtsrch='ccc';
36 ixlong=index(text,longsrch);
37 ixshrt=index(text,shrtsrch);
38 ixtrim=index(text,trim(longsrch));
39 put (_all_) (=);
40 run;
text=aaa bbb ccc longsrch=ccc shrtsrch=ccc ixlong=0 ixshrt=9 ixtrim=9
NOTE: The data set WORK.HAVE has 1 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you very much , and especially appreciate the explanation !
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello.
The documentation say:
INDEX(source, excerpt)
Both leading and trailing spaces are considered part of the excerpt argument. To remove trailing spaces, include the TRIM function with the excerpt variable inside the INDEX function.
Maybe you can try:
Merge_No = index(Merge_term, trim(MrgMT));
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you AVO, much appreciated !
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SAS uses fixed length character variables. So short values are padded with spaces to fill out the variable to its complete length.
Normal equality comparisons (like in your WHERE statement) will ignore the trailing spaces. But the INDEX function does not. If you ask it to look for 'FRED ' in will not match strings that don't have all four spaces following the D.
Use TRIM() to remove the trailing spaces from the string you are asking INDEX to look for.
data lcl.Index_Prob;
set lcl.&FFN._TMP;
where MrgMT in ("ALBUTEROL", "ACAMPROSATE")
and Merge_term in ("4MG OF ALBUTEROL", "ALBUTEROL", "ACAMPROSATE", "ACAMPROSATE (CAMPRAL)")
;
Merge_No = index(Merge_term, trim(MrgMT));
keep MrgMT Merge_term Merge_No ;
run ;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you Tom, and the explanation is very helpful, much appreciated !