Finding first and second word of the last line in a text

SIgnificatif · Posted 05-20-2019 09:32 AM

Hi All

I would like to ask how to find the first and second word (string) in a row of text, and how to apply a rule when the first word is the only one in the row (there is no other text after that word).

I have found this code, but it only returns the first and last words. Thank you.

options pageno=1 nodate ls=80 ps=64;

data firstlast;
   input String $60.;
   First_Word = scan(string, 1);
   Last_Word = scan(string, -1);
   datalines4;
Jack and Jill
& Bob & Carol & Ted & Alice &
Leonardo
! $ % & ( ) * + , - . / ;
;;;;

proc print data=firstlast;
run;

PeterClemmensen · Posted 05-20-2019 09:34 AM

So what is your desired outcome for rows 2 and 4 here?


SIgnificatif · Posted 05-20-2019 10:02 AM

Hi, Thank you for your message,
this code was just an example.

I would like to check for this text:

text text text text end of line of this text
some other text and here we are

If the text 'some other text' is the last text at the end of the last row, I want to assign a value to a variable. Otherwise, if the text 'and here we are' is present, I want to assign another value to the variable (without using regular expressions).

ballardw · Posted 05-20-2019 04:15 PM

@SIgnificatif wrote:

Hi, Thank you for your message,
this code was just an example.

I would like to check for this text:
text text text text end of line of this text
some other text and here we are
If the text 'some other text' is the last text at the end of the last row, I want to assign a value to a variable. Otherwise, if the text 'and here we are' is present, I want to assign another value to the variable (without using regular expressions).

When you arbitrarily limit possible solutions, as with your "without using regular expressions", you really need to explain why that is not a valid solution.

It is similar to saying "I want to subtract one number from another without using the - operator".

SIgnificatif · Posted 05-21-2019 06:59 AM

Hi, thank you for the question. The only answer I have is that I found regular expressions take more time than SAS's own search functions. I agree that they may seem (or really are) more intuitive. If there is no other way, then maybe I'll use regular expressions.

ballardw · Posted 05-21-2019 04:11 PM

@SIgnificatif wrote:
Hi, thank you for the question. The only answer I have is that I found regular expressions take more time than SAS's own search functions. I agree that they may seem (or really are) more intuitive. If there is no other way, then maybe I'll use regular expressions.

We seem to also have a moving target as to what you want. You started with first and second "word" but have wandered quite a ways from that.

Please read the documentation for the SCAN function to see what a "word" is.
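
As a minimal sketch (not from the original posts), assuming blanks are the only word delimiter: SCAN returns a blank value when the requested word does not exist, and COUNTW counts the words, so either one can flag a line that holds a single word. The data set and variable names (words, line, only_one) are made up for illustration.

```sas
/* Sketch: first and second word of each line, plus a one-word flag. */
data words;
   infile datalines truncover;
   input line $char60.;
   length first_word second_word $60;
   first_word  = scan(line, 1, ' ');   /* blank is the only delimiter here   */
   second_word = scan(line, 2, ' ');   /* blank when there is no second word */
   n_words     = countw(line, ' ');    /* number of blank-delimited words    */
   only_one    = (n_words = 1);        /* 1 when the first word stands alone */
datalines;
Jack and Jill
Leonardo
some other text and here we are
;
run;

proc print data=words;
run;
```

With these three lines, only_one should be 1 only for "Leonardo". If punctuation should also separate words, drop the third argument of SCAN and COUNTW to fall back on the default delimiter list.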

If 'some other text' refers to a phrase, then SCAN would be inappropriate unless you have some specific delimiter between chunks of text.

You can use several functions to search for strings inside other text. INDEX and FIND come to mind. If the INDEX function returns a value greater than zero, the text was found. Since you are now working with something other than a single word, you could use the length of the "word" you are searching for along with the length of the string to see whether it is the last thing on a line of text.
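
A small sketch of the "greater than zero" test, using one of the sample lines from earlier in the thread. The data set and variable names are only illustrative.

```sas
/* Sketch: INDEX and FIND return the start position of a match, or 0. */
data found;
   length text $40;
   text      = 'some other text and here we are';
   pos_index = index(text, 'and here');   /* 17: position where the match starts */
   pos_find  = find(text, 'and here');    /* same position, case-sensitive       */
   has_it    = (pos_index > 0);           /* 1 because the text was found        */
   pos_none  = index(text, 'xyz');        /* 0 because the text is not there     */
run;
```

FIND differs from INDEX mainly in accepting a start position and modifiers such as 'i'.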

You may need to consider what you mean by a match as far as capitalization or presence of punctuation though.

Consider

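data example;
   infile datalines truncover;
   input phrase $ 1-50;
   target = 'a phrase';
   /* 1 when target ends at the last non-blank character (case-sensitive). */
   /* Caveat: FIND returns 0 on no match, so a 7-character phrase          */
   /* without the target would also give 1.                                */
   last  = ( (find(phrase, target, -50) + length(target) - 1) = length(phrase) );
   /* Same test, but the 'i' modifier ignores case. */
   last2 = ( (find(phrase, target, -50, 'i') + length(target) - 1) = length(phrase) );
datalines;
This is a phrase
This is a different phrase
a phrase at the beginning
THIS IS A PHRASE
;
run;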

A value of 1 for last or last2 means that the phrase was found at the "end" of the string. The two differ only on the "A PHRASE" line, where the FIND function modifier 'i' ignores case. You may also want to look at the T modifier.

The -50 does two things: 50 is the length of the phrase variable, used as the start position, and the negative sign tells FIND to search backwards (right to left) from that position.
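
On the T modifier: a hedged sketch of the case where it matters, assuming the search text lives in a variable that is longer than the text itself. The names (tmod, no_trim, with_t, with_trim) are made up for illustration.

```sas
/* Sketch: a padded search variable and the 't' modifier of FIND. */
data tmod;
   length phrase $50 target $20;
   phrase = 'a phrase at the beginning';
   target = 'a phrase';                      /* stored padded with blanks to 20 characters */
   no_trim   = find(phrase, target);         /* 0: the padded target is not in phrase      */
   with_t    = find(phrase, target, 't');    /* 1: trailing blanks are ignored             */
   with_trim = find(phrase, trim(target));   /* 1: same idea, trimming by hand             */
run;
```

In the example above, target gets its length from the literal 'a phrase', so no padding is involved there.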

If your search string includes leading or trailing blanks, the FINDW function has some additional options to consider.
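
Putting the pieces together for the rule described earlier in the thread, here is one possible sketch without regular expressions. It assumes the text is first read into a data set, one line per observation, and that only the last line matters. The result values 'value A' and 'value B' and all data set and variable names are placeholders.

```sas
/* Step 1: read the text, one line per observation. */
data lines;
   infile datalines truncover;
   input line $char80.;
datalines;
text text text text end of line of this text
some other text and here we are
;
run;

/* Step 2: keep only the last line and classify it. */
data last_line;
   set lines end=eof;
   if eof;                                   /* subsetting IF: last observation only */
   length result $12;
   pos = find(line, 'some other text', -80); /* search backwards from position 80    */
   /* Found, and the match ends at the last non-blank character of the line. */
   ends_with = (pos > 0 and (pos + length('some other text') - 1) = length(line));
   if ends_with then result = 'value A';
   else if find(line, 'and here we are') > 0 then result = 'value B';
run;
```

For the two sample lines, the last line contains 'some other text' but does not end with it, so the second branch applies. The pos > 0 check guards against a line that does not contain the phrase at all.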
