Hi all,
need to extract 'ABCD' from the below sample data
text1 | text2 | text3 | text4 | text5 |
MARIANE BAFILE | ACF ASHLEY NOZIK NYS ABCD | MARIANE BAFILE | ||
MARIANE BAFILE | ACF AMANDA NOZIK NYS ABCD | MARIANE BAFILE | ||
MARTHA REITMAN ACF | LAURA REITMAN U/CA/ABCD #2 | LAURA REITMAN | ||
LORNA KENNEDY - PLOWRIGHT | ABCD KAVOY LINDSAY PLOWRIGHT | |||
RANDALL CRAIG EX | CUSTODIAN FOR | SHERIDAN C EX | UNDER ABCD PMB321 | RANDALL CRAIG EX |
SANDRA J PUTMAN | SANDRA J PUTMAN | |||
AARON GUTMAN | AARON GUTMAN |
What does "extract" mean?
Are you asking to check for observations that have the string "ABCD" in any of this variables you show in your printout?
What about observations that have "ABCDE"? Should those be included?
data want;
set have;
where findw("ABCD",cats(' ',text1,text2,text3,text4,text5));
run;
NO, I need 'ABCD' , it may occur after / ,# or space.
expected output:
texta | textb | textc | textd | texte |
ABCD | ||||
ABCD | ||||
ABCD | ||||
ABCD | ||||
ABCD |
You need to explain in more detail what you are trying to do. You cannot write code to extract a random word from a string (well you could but the result would be just a random word).
What is the rule that explains why the text ABCD should be pulled from those input strings. Why not any of the other words in the string. Does the ABCD change? Or are you literally looking for ABCD?
its pattern,i need only that ,it may occur with delimiters like /,space,#.
it may occur in middle of the name with space delimiter..but idont need that string if its part of name..for eg.,
text1:forabcd.. idont require to extract..
textt1: forage abcd felix .. i have to extract abcd pattern
@Cho8 wrote:
its pattern,i need only that ,it may occur with delimiters like /,space,#.
it may occur in middle of the name with space delimiter..but idont need that string if its part of name..for eg.,
text1:forabcd.. idont require to extract..
textt1: forage abcd felix .. i have to extract abcd pattern
Why are you using the word extract? What is it that you want to extract?
abcd is NOT a pattern. It is just a string of four letters. A pattern would be something like four letters. Then strings like ABCD or WXYZ would match that pattern.
If you just want to detect if the string ABCD appears in a variable you can use the FINDW() or INDEXW() function.
If you actually do have a pattern then perhaps you want to use regular expressions to search for them. But before you can write the regular expression you need say what pattern you are looking for.
data want;
set have;
array word_search(5) text1-text5;
array word_find(5) texta textb textc textd texte;
do i=1 to dim(word_search);
if findw(word_search(i), 'ABCD', 'it')>0 then word_find(i) = 'ABCD';
end;
run;
Suspect this will not generalize but based on what you've posted this would meet the requirements.
@Cho8 wrote:
Hi all,
need to extract 'ABCD' from the below sample data
text1 text2 text3 text4 text5 MARIANE BAFILE ACF ASHLEY NOZIK NYS ABCD MARIANE BAFILE MARIANE BAFILE ACF AMANDA NOZIK NYS ABCD MARIANE BAFILE MARTHA REITMAN ACF LAURA REITMAN U/CA/ABCD #2 LAURA REITMAN LORNA KENNEDY - PLOWRIGHT ABCD KAVOY LINDSAY PLOWRIGHT RANDALL CRAIG EX CUSTODIAN FOR SHERIDAN C EX UNDER ABCD PMB321 RANDALL CRAIG EX SANDRA J PUTMAN SANDRA J PUTMAN AARON GUTMAN AARON GUTMAN
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.