DATA Step, Macro, Functions and more

Lookback and Look ahead in character variable

Reply
Contributor
Posts: 44

Lookback and Look ahead in character variable

Hi All,

I have a character variable in my dataset and i need to create the macro which will pass some value as parameter then in that variable, i need to lookback to find first occurance of keyword called "This" and then go and find the last occurance of keyword "mortgage" whatever is between these two key variables , i need to output in another dataset.

Input

data have;

column $30.;

datalines;

welcome

to

my

system

this

is

new

variable

and

its called

delq_status

in

mortgage

and

we

are assigning

it to zero

;

run;

What I want in output is

Want

Output (column name)

This

is

new

variable

and

its called

delq_status

in

mortgage

Start point and end point (This and Mortgage ) is always common but we need to be case insensitive and there has to be exact match.

I am trying really hard and think that regular expressions can be used for such scenarios but somehow not getting it right

Please please please suggest ideas

Contributor dkb
Contributor
Posts: 53

Re: Lookback and Look ahead in character variable

Posted in reply to yashpande

Do not use a macro.

data want;

length column $ 30;

keep column;

retain write 'N';

set have;

if upcase(column) = 'THIS' then write = 'Y';

if write = 'Y' then output;

if upcase(column) = 'MORTGAGE' then write = 'N';

run;

Contributor
Posts: 44

Re: Lookback and Look ahead in character variable

Hi DKB and slchen,

If the keyword "this" appears in between then the output is not getting generated correctly. 

Lets say below input , where this is replaced by " and this ". Its getting empty results.

I have tried using contains  but its not working

data have;

input column $30.;

datalines;

welcome

to

my

system

and this

is

new

variable

and

its called

delq_status

in

mortgage

and

we

are assigning

it to zero

;

run;

Super User
Super User
Posts: 7,977

Re: Lookback and Look ahead in character variable

Posted in reply to yashpande

You could change it to index(column,"this")>0

That would find it even if there is other text.  However what is it your trying to do, whay have you split a comment field into sometimes seperate words, sometimes not?  What happens for tHis, THIs, this mortgage is this etc.

If you used one variable:

data have;

  column="welcome to my system and this is new variable and its called delq_status in mortgage and we are assigning it to zero";

  extract=substr(column,index(column,"this"),(index(column,"mortgage") - index(column,"this"))+8);

run;

Super Contributor
Posts: 275

Re: Lookback and Look ahead in character variable

Posted in reply to yashpande

My code still work if 'this' replace with 'and this'.

Super Contributor
Posts: 275

Re: Lookback and Look ahead in character variable

Posted in reply to yashpande

data temp(keep=_string);

     set have end=last;

     length string $200.;

     retain string;

    string=catx(' ',string,column);

     if last then do;

     _string=prxchange('s/.*(this .* mortgage).*/$1/i',-1,string);

     output;

end;

run;

data want;

      set temp;

      num=countw(_string);

      do i=1 to num;

          col=scan(_string,i);

          output;

     end;

     keep col;

run;

Super User
Posts: 10,041

Re: Lookback and Look ahead in character variable

Posted in reply to yashpande

Code: Program



data have;
input column $30.;
datalines;
welcome
to
my
system
this
is
new
variable
and
its called
delq_status
in
mortgage
and
we
are assigning
it to zero
;
run;
data temp;
set have;
retain a;
n+1;
if column='this' then a=1;
run;
proc sort data=temp;by descending n;run;
data want;
set temp;
retain b;
if column='mortgage' then b=1;
if a and b;
drop a b;
run;
proc sort data=want;by n;run;

Xia Keshan

Ask a Question
Discussion stats
  • 6 replies
  • 427 views
  • 0 likes
  • 5 in conversation