Help using Base SAS procedures

having trouble with index function using 2 variables.

Accepted Solution Solved
Reply
Contributor
Posts: 38
Accepted Solution

having trouble with index function using 2 variables.

Hi everyone, I'm having trouble with a very basic thing (I'm new to SAS) and I'm hoping there is an easy answer.

Here is how the dataset looks:

var1           var2
CA09|CA10      CA09
CA06|CA09      CA09
CA06           CA09

I am attempting to create a variable called "ismatch" which I intended to help identify rows where var2 is found within var1.

I use proc sql:

proc sql;
create table newtable as
select *, index(var1, var2) as ismatch;
quit;

What this produces, based on the above table is:


var1           var2        ismatch
CA09|CA10      CA09           0
CA06|CA09      CA09           6
CA06           CA09           0

...

2nd and 3rd row are OK. But the first row is coming up 0 despite CA09 being at the beginning of the string. I've tested it more, and it seems that the index function will only identify CA09 when it is the rightmost part of var1 (i.e., it would not find CA09 within CA06|CA09|CA11).

Yet, if I hardcode "CA09" in place of var2, the code works perfectly and finds CA09 anywhere in the string (as intended). I've also tried it in a data step and it seems to behave the same way. All examples of the INDEX function I've found use a hardcoded string, but is it possible to use a variable there?

Or is there a better way to find one string in another when both are variables?

Thank you!


Accepted Solutions
Solution
‎10-24-2012 02:11 PM
Super User
Super User
Posts: 6,497

Re: having trouble with index function using 2 variables.

Try trimming the blanks off the end of VAR2 when using the index function.

891  data have ;

892    length var1 $20 var2 $10 ;

893    input var1 var2;

894    loc1 = index(var1,var2);

895    loc2 = index(var1,trim(var2));

896    put (loc1 loc2 var1 var2) (=) ;

897  cards;

loc1=0 loc2=1 var1=CA09|CA10 var2=CA09

loc1=6 loc2=6 var1=CA06|CA09 var2=CA09

loc1=0 loc2=0 var1=CA06 var2=CA09

NOTE: The data set WORK.HAVE has 3 observations and 4 variables.

NOTE: DATA statement used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

View solution in original post


All Replies
Solution
‎10-24-2012 02:11 PM
Super User
Super User
Posts: 6,497

Re: having trouble with index function using 2 variables.

Try trimming the blanks off the end of VAR2 when using the index function.

891  data have ;

892    length var1 $20 var2 $10 ;

893    input var1 var2;

894    loc1 = index(var1,var2);

895    loc2 = index(var1,trim(var2));

896    put (loc1 loc2 var1 var2) (=) ;

897  cards;

loc1=0 loc2=1 var1=CA09|CA10 var2=CA09

loc1=6 loc2=6 var1=CA06|CA09 var2=CA09

loc1=0 loc2=0 var1=CA06 var2=CA09

NOTE: The data set WORK.HAVE has 3 observations and 4 variables.

NOTE: DATA statement used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

Contributor
Posts: 38

Re: having trouble with index function using 2 variables.

Thank you Tom!!

Contributor
Posts: 38

Re: having trouble with index function using 2 variables.

Tom, if you wouldn't mind, I'm trying to understand this. Clearly it works. But when I create another variable in proc sql,

LENGTH(var2) as lenvar

It returns 4 for each of the VAR2 values (all of them look like CA09) which makes sense.

But if TRIM is removing something, what can possibly be removed if the LENGTH of CA09 was already 4?

Thanks,

Dave

Super User
Super User
Posts: 6,497

Re: having trouble with index function using 2 variables.

All character variables in SAS are fixed length fields.  The LENGTH() function is showing you the length of the trimmed value, not the defined length of the variable.  To see the defined length of the variable run PROC CONTENTS on your dataset  (or use the DESCRIBE TABLE statement in SQL).  You seem to be more familiar with SQL syntax so to define the length of variable created by a SQL select statement you can add the LENGTH keyword.

create table new as

   select 'A'  as newvar length=5

        , *

   from sashelp.class

;

For most operations SAS will automatically trim the trailing spaces.  INDEX() is an exception where the spaces are preserved.

For example:

929  data _null_;

930    if 'A   ' = 'A' then put 'yes';

931  run;

yes



Contributor
Posts: 38

Re: having trouble with index function using 2 variables.

Thank you again sir!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 201 views
  • 0 likes
  • 2 in conversation