BookmarkSubscribeRSS Feed
Cho8
Calcite | Level 5

Hi all,

 

need to extract 'ABCD' from the below sample data

 

text1text2text3text4text5
MARIANE BAFILEACF ASHLEY NOZIK NYS ABCD  MARIANE BAFILE
MARIANE BAFILEACF AMANDA NOZIK NYS ABCD  MARIANE BAFILE
MARTHA REITMAN ACFLAURA REITMAN U/CA/ABCD  #2  LAURA REITMAN
LORNA KENNEDY - PLOWRIGHTABCD KAVOY LINDSAY PLOWRIGHT   
RANDALL CRAIG EXCUSTODIAN FORSHERIDAN C EXUNDER ABCD PMB321RANDALL CRAIG EX
SANDRA J PUTMAN   SANDRA J PUTMAN
AARON GUTMAN   AARON GUTMAN
16 REPLIES 16
Tom
Super User Tom
Super User

What does "extract" mean?
Are you asking to check for observations that have the string "ABCD" in any of this variables you show in your printout?

What about observations that have "ABCDE"? Should those be included?

data want;
  set have;
  where findw("ABCD",cats(' ',text1,text2,text3,text4,text5));
run;
Cho8
Calcite | Level 5

NO, I need 'ABCD' , it may occur after / ,# or space.

Reeza
Super User
If the first table is your expected input data set, show the expected output data set you want after extraction.
Cho8
Calcite | Level 5

expected output:

 

 

textatextbtextctextdtexte
 ABCD   
 ABCD   
 ABCD   
 ABCD   
   ABCD 
Tom
Super User Tom
Super User

You need to explain in more detail what you are trying to do.  You cannot write code to extract a random word from a string (well you could but the result would be just a random word).

 

What is the rule that explains why the text ABCD should be pulled from those input strings.  Why not any of the other words in the string.  Does the ABCD change? Or are you literally looking for ABCD?

Cho8
Calcite | Level 5

its pattern,i need only that ,it may occur with delimiters like  /,space,#.

 

it may occur in middle of the name with space delimiter..but idont need that string if its part of name..for eg.,

 

text1:forabcd.. idont require to extract..

textt1: forage abcd felix .. i have to extract abcd pattern

Reeza
Super User
If the posted solution does not work, show how with examples and it can be modified.

I modified the FIND to FINDW to search for words in the string.
Tom
Super User Tom
Super User

@Cho8 wrote:

its pattern,i need only that ,it may occur with delimiters like  /,space,#.

 

it may occur in middle of the name with space delimiter..but idont need that string if its part of name..for eg.,

 

text1:forabcd.. idont require to extract..

textt1: forage abcd felix .. i have to extract abcd pattern


Why are you using the word extract?  What is it that you want to extract? 

abcd is NOT a pattern.  It is just a string of four letters.  A pattern would be something like four letters.  Then strings like ABCD or WXYZ would match that pattern.

 

If you just want to detect if the string ABCD appears in a variable you can use the FINDW() or INDEXW() function.

 

If you actually do have a pattern then perhaps you want to use regular expressions to search for them.  But before you can write the regular expression you need say what pattern you are looking for.

 

 

Cho8
Calcite | Level 5
data test_abcd;
set abcd;
test1=scan(acct_1,1," ""/","b");
test2=scan(acct_2,1," ""/","b");
test3=scan(acct_3,1," ""/","b");
test4=scan(acct_4,1," ""/","b");
test5=scan(acct_5,1," ""/","b");

run;


proc sql;
create table test2_abcd as select *,
(case when test1 like '%ABCD%' or test2 like '%ABCD%'
or test3 like '%ABCD%' or test4 like '%ABCD%'
or test5 like '%ABCD%' then 'Y' else 'N' end)as IND
from WORK.TEST_ABCD ;
quit;
Cho8
Calcite | Level 5
in the above code, Iam not able to scan the whole string for the pattern, its only scanning last word.
Quentin
Super User
Did you try @Reeza's array solution? If so, what did you not like about that approach?
Reeza
Super User
data want;
set have;

array word_search(5) text1-text5;
array word_find(5) texta textb textc textd texte;

do i=1 to dim(word_search);
if findw(word_search(i), 'ABCD', 'it')>0 then word_find(i) = 'ABCD';
end;

run;

Suspect this will not generalize but based on what you've posted this would meet the requirements.

 


@Cho8 wrote:

Hi all,

 

need to extract 'ABCD' from the below sample data

 

text1 text2 text3 text4 text5
MARIANE BAFILE ACF ASHLEY NOZIK NYS ABCD     MARIANE BAFILE
MARIANE BAFILE ACF AMANDA NOZIK NYS ABCD     MARIANE BAFILE
MARTHA REITMAN ACF LAURA REITMAN U/CA/ABCD  #2     LAURA REITMAN
LORNA KENNEDY - PLOWRIGHT ABCD KAVOY LINDSAY PLOWRIGHT      
RANDALL CRAIG EX CUSTODIAN FOR SHERIDAN C EX UNDER ABCD PMB321 RANDALL CRAIG EX
SANDRA J PUTMAN       SANDRA J PUTMAN
AARON GUTMAN       AARON GUTMAN

 

Cho8
Calcite | Level 5
its throwing syntax error @findw.
Quentin
Super User
Please show the log with the code and the error message.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 16 replies
  • 3204 views
  • 1 like
  • 4 in conversation