- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
I am wondering if this is possible - and if so how? :
I have a dataset called test - it contains values like:
COL1 COL2 COL3
ABB Test why OQSJ
AIW This is ex QSDJ
AIQ Why Test KAJL
Questions:
1) Is there any way in sas to search COL2 for the word "Test" and then output a new dataset that only contains the rows that had the word "Test" somewhere in COL2.
2) Is there any way in sas for me to search specific sentences in COL2? Example - I want sas to search for "This is ex" and "Why Test" and output a new dataset that only contains rows where these two are true.
Please post sample code if possible. Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi.
Please see below code.
*read in your data;
data test;
input col1 $1-4 col2 $ 5-14 col3 $ 15-18;
cards;
ABB Test why OQSJ
AIW This is ex QSDJ
AIQ Why Test KAJL
;
run;
*only keep the records that have in COL2 the word "test";
data want_1;
set test (where= (prxmatch("m/test/oi",col2) > 0));
run;
*only keep the records that have "why test' or "this is ex";
data want_2;
set test (where= (prxmatch("m/Why test|this is ex/oi",col2) > 0));
run;
I hope it helps.
You can also run the data once;
data want_1 want_2;
set test;
if (prxmatch("m/test/oi",col2) > 0)) then output want_1;
if (prxmatch("m/Why test|this is ex/oi",col2) > 0)) then output want_2; the two data
run;
Good luck.
Anca.
I collapsed the two data steps into one.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi.
Please see below code.
*read in your data;
data test;
input col1 $1-4 col2 $ 5-14 col3 $ 15-18;
cards;
ABB Test why OQSJ
AIW This is ex QSDJ
AIQ Why Test KAJL
;
run;
*only keep the records that have in COL2 the word "test";
data want_1;
set test (where= (prxmatch("m/test/oi",col2) > 0));
run;
*only keep the records that have "why test' or "this is ex";
data want_2;
set test (where= (prxmatch("m/Why test|this is ex/oi",col2) > 0));
run;
I hope it helps.
You can also run the data once;
data want_1 want_2;
set test;
if (prxmatch("m/test/oi",col2) > 0)) then output want_1;
if (prxmatch("m/Why test|this is ex/oi",col2) > 0)) then output want_2; the two data
run;
Good luck.
Anca.
I collapsed the two data steps into one.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@anca thanks for that, I will try it.
Could you please explain this part? - m/test/oi - what does the m and the oi do?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Vomer,
The 'm' tag at the beginning of the search string tells PRXMATCH that it is doing a matching operation, this is the default. So you could delete it.
The 'o' tag at the end tells SAS to compile the parse string once. This is also the default because the parse string is a constant. So you can take this one off, too.
The 'i' tag at the end forces a case insensitive match so that "THIS" is equal to "this" for the purpose of matching. If you want case sensitive you use "-i"
See more on PRXMATCH here:
38719 - Use PRXMATCH in place of multiple INDEX functions
Good luck,
Anca.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks! that is really interesting. I will read up on it as well.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You can also look up the FIND and INDEX function in SAS.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sql; create table one as (select col1,col2,col3 from hello where col2 like '%Test%' ); create table two as (select col1,col2,col3 from hello where ((col2 like 'This is ex%') or (col2 like 'Why Test%')) ); run;