BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tellmeaboutityo
Obsidian | Level 7

Hi,


I'm trying to sort out certain instances of a character variable.  The data in the variable is in the form either: 

 

data_1

dataI_1

 

The difference being the roman numeral "I' after data.  The numbers "1" can vary.  I want only data_, and to drop instances of dataI_.  Thus:

 

proc sql; create table info

as select *

from data

where var like 'data_%'

;quit;

 

Surprisingly, this does not drop instances of dataI_

 

Can anyone recommend a fix?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Underscore is a special character to the LIKE operator used to match any single character. So your query will match any string of five or more characters that start with 'date'.

 

You will need to escape it if you want LIKE to treat it as a literal underscore.

var like 'data^_%' escape '^'

 

View solution in original post

5 REPLIES 5
HB
Barite | Level 11 HB
Barite | Level 11

I suspect this won't work with your real data.

 

data stuff;
   input myvar $20.;
datalines;
data_1
dataI_1
data_2
dataI_2
data)_1
;
run;


proc sql; 
	create table only_no_Is
	as select *
	from stuff
	where substr(myvar, 5, 1) not in ("I");
quit;

Yeilds

 

myvar
data_1
data_2
data)_1

prxmatch would probably be better. I would use it.

 

data no_Is;
	set stuff;
	if not prxmatch("m/I/",myvar);
run;
Tom
Super User Tom
Super User

Underscore is a special character to the LIKE operator used to match any single character. So your query will match any string of five or more characters that start with 'date'.

 

You will need to escape it if you want LIKE to treat it as a literal underscore.

var like 'data^_%' escape '^'

 

Astounding
PROC Star

If you're allowed to use a DATA step, the problem is trivial:

 

data want;

set have;

if var = : 'datal_' then delete;

run;

mkeintz
PROC Star

IN the data step there is the '=:' operator, which first shortens the longer operand to lenght of the shorter, then compare for equality:

 

data want;
   set have;
   where   var =: 'data_';
run;

 

PROC SQL does not support '=:' but it does have EQT (and GET and LET) "truncated string comparison operators":

 

data have;
  var='dataI_1';output;
  var='data_1'; output;
run;

proc sql;
  create table info2
  as select *
  from have
  where var EQT 'data_';
quit;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
RexDeus9
Quartz | Level 8

Hi,

 

I don't have an answer for the "like" issue, but the following works fine:

 

data info;

    Set data;

    if LowCase(var) =: "data_";

run;

 

 

 

🙂

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1848 views
  • 5 likes
  • 6 in conversation