Hi all,
I am wondering if this is possible - and if so how? :
I have a dataset called test - it contains values like:
COL1 COL2 COL3
ABB Test why OQSJ
AIW This is ex QSDJ
AIQ Why Test KAJL
Questions:
1) Is there any way in sas to search COL2 for the word "Test" and then output a new dataset that only contains the rows that had the word "Test" somewhere in COL2.
2) Is there any way in sas for me to search specific sentences in COL2? Example - I want sas to search for "This is ex" and "Why Test" and output a new dataset that only contains rows where these two are true.
Please post sample code if possible. Thanks!
Hi.
Please see below code.
*read in your data;
data test;
input col1 $1-4 col2 $ 5-14 col3 $ 15-18;
cards;
ABB Test why OQSJ
AIW This is ex QSDJ
AIQ Why Test KAJL
;
run;
*only keep the records that have in COL2 the word "test";
data want_1;
set test (where= (prxmatch("m/test/oi",col2) > 0));
run;
*only keep the records that have "why test' or "this is ex";
data want_2;
set test (where= (prxmatch("m/Why test|this is ex/oi",col2) > 0));
run;
I hope it helps.
You can also run the data once;
data want_1 want_2;
set test;
if (prxmatch("m/test/oi",col2) > 0)) then output want_1;
if (prxmatch("m/Why test|this is ex/oi",col2) > 0)) then output want_2; the two data
run;
Good luck.
Anca.
I collapsed the two data steps into one.
Hi.
Please see below code.
*read in your data;
data test;
input col1 $1-4 col2 $ 5-14 col3 $ 15-18;
cards;
ABB Test why OQSJ
AIW This is ex QSDJ
AIQ Why Test KAJL
;
run;
*only keep the records that have in COL2 the word "test";
data want_1;
set test (where= (prxmatch("m/test/oi",col2) > 0));
run;
*only keep the records that have "why test' or "this is ex";
data want_2;
set test (where= (prxmatch("m/Why test|this is ex/oi",col2) > 0));
run;
I hope it helps.
You can also run the data once;
data want_1 want_2;
set test;
if (prxmatch("m/test/oi",col2) > 0)) then output want_1;
if (prxmatch("m/Why test|this is ex/oi",col2) > 0)) then output want_2; the two data
run;
Good luck.
Anca.
I collapsed the two data steps into one.
@anca thanks for that, I will try it.
Could you please explain this part? - m/test/oi - what does the m and the oi do?
Vomer,
The 'm' tag at the beginning of the search string tells PRXMATCH that it is doing a matching operation, this is the default. So you could delete it.
The 'o' tag at the end tells SAS to compile the parse string once. This is also the default because the parse string is a constant. So you can take this one off, too.
The 'i' tag at the end forces a case insensitive match so that "THIS" is equal to "this" for the purpose of matching. If you want case sensitive you use "-i"
See more on PRXMATCH here:
38719 - Use PRXMATCH in place of multiple INDEX functions
Good luck,
Anca.
Thanks! that is really interesting. I will read up on it as well.
You can also look up the FIND and INDEX function in SAS.
proc sql; create table one as (select col1,col2,col3 from hello where col2 like '%Test%' ); create table two as (select col1,col2,col3 from hello where ((col2 like 'This is ex%') or (col2 like 'Why Test%')) ); run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.