BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Behrouz
Calcite | Level 5

Hello everyone;

 

Does anybody know how to find for example only "ture" in the following string"

"screen shots or pictures of errors with ture and ..."

I tried many available functions, but non of them worked as they need location of the word. But since I'm dealing with over 100 milion observation and the location of the word in variable (column) is random, I can't get any useful results. For example, if I want to use index or Find or Contains, then I get pictures and ture, because pictures also contains ture. If I want to use SUBSTR instead then I need to know the location of the word, which I don't because the word can be found anywhere in the column.

Any help would be appreciated very much.

 

Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

data want;
 str="screen shots or pictures of errors with ture and ...";
 position_of_the_word=findw(str,'ture',' ','e');
 extract_the_word=scan(str,findw(str,'ture',' ','e'));
run;
str position_of_the_word extract_the_word
screen shots or pictures of errors with ture and ... 8 ture

View solution in original post

14 REPLIES 14
Astounding
PROC Star

Instead of FIND, use FINDW.

novinosrin
Tourmaline | Level 20

data want;
 str="screen shots or pictures of errors with ture and ...";
 position_of_the_word=findw(str,'ture',' ','e');
 extract_the_word=scan(str,findw(str,'ture',' ','e'));
run;
str position_of_the_word extract_the_word
screen shots or pictures of errors with ture and ... 8 ture
novinosrin
Tourmaline | Level 20


data want;
 str="screen shots or pictures of errors with ture and ...";
 want=prxchange('s/.*(\bture\b).*/$1/i', -1, str);
run;
Behrouz
Calcite | Level 5

Hi;
Thanks for prompt answer. How can I tailor those codes for variable name that contains the text instead of single string? The variable name is Text and again I'm looking for "ture".

novinosrin
Tourmaline | Level 20

Hi @Behrouz  The following example should help you follow

 

/*Let's assume HAVE is your source dataset and STR is the variable name*/
data have;
 str="screen shots or pictures of errors with ture and ...";
run;
/*Read the source dataset HAVE and create the new variable*/
data want;
 set have;
 position_of_the_word=findw(str,'ture',' ','e');/*position of the word in the string*/
 extract_the_word=scan(str,findw(str,'ture',' ','e'));/*extract the word in the string*/
run;
Behrouz
Calcite | Level 5

Hi,
As I mentioned, the variable (column) name is "Text". Where do you put the variable name (Text) in your codes? and the column has over 100 million records and "screen shots or pictures of errors with ture and ..." is just one single example out of 100 million observations. I just gave this example to show you what am I trying to do. Thanks!

novinosrin
Tourmaline | Level 20

Hello Mam/Sir @ShelleySessoms / @ChrisHemedinger Something I clicked accidentally by mistake made my reply overwrite OP's prev message. I don't know how to undo. My sincere apologies for the bother. Kindly help.

ChrisHemedinger
Community Manager

@novinosrin Restored. Careful with those Super User powers!

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.
novinosrin
Tourmaline | Level 20

Thank you Sir. I will be -first and last mistake. My apologies again!

novinosrin
Tourmaline | Level 20

@Behrouz  The HAVE is the sample dataset I created to test.

 

WANT is the dataset that is our required output.

 

Extract_the_word is the new variable that we create in the WANT dataset by reading the HAVE(your input)

 

The STR variable is the TEXT variable that has the string variables, so please replace the STR with whatever name your TEXT variable contains.

novinosrin
Tourmaline | Level 20

Hello @Behrouz   I get the feeling, you are confused with the know-how of how a datastep read/write process happens.  I have edited my previous code replacing str as text.. See if you can copy paste to your real stuff properly

 

/*Let's assume HAVE is your source dataset and STR is the variable name*/
data have;
 text="screen shots or pictures of errors with ture and ...";
run;
/*Read the source dataset HAVE and create the new variable*/
data want;
 set have;
 position_of_the_word=findw(text,'ture',' ','e');/*position of the word in the string*/
 extract_the_word=scan(str,findw(text,'ture',' ','e'));/*extract the word in the string*/
run;

 

 

Behrouz
Calcite | Level 5

Hi;

I'm quite familiar with data steps. I guess my problem was with the first part of your codes:

data have;
 text="screen shots or pictures of errors with ture and ...";
run;

If you run it, you get a dataset with only one variable and one observation. So, I think this part wasn't needed. I also became confused with str as variable name. Actually, I thought it is a function. Anyway, I got what I wanted and I like to thank you with helping me out with this.  

novinosrin
Tourmaline | Level 20

Thank you @Behrouz   We're all glad. Sorry, I ran into lot of confusion as I failed to explain comprehensively. More so than techniques/concepts I am learning communication skill on the forum. 

 

Also, the HAVE part is basically to create a sample for us to test. The original poster (OP) of the question like you may or may not post a sample and the expected output clearly. This leads us to create one to test.

 

Okay, going forward, here is an approach to frame a good question-

 

1. Post a sample of your input dataset (HAVE)

2.  Post a sample of your expected output dataset (WANT)

3. Briefly explain the business/transformation logic in simple sentences and we are good to go.

 

Makes sense? 🙂

Behrouz
Calcite | Level 5

No worries. We are all learning everyday from each other. I've been using SAS for a long time, but I didn't even know there is a findw function! I felt kind of naive...But, I guess that's the beauty of life, being surprised 🙂

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1548 views
  • 0 likes
  • 4 in conversation