From my understanding these three statements should be equivalent
CATX('', COL1, COL2, COL3)
TRIM(COL1) || TRIM(COL2) || TRIM(COL3)
PRXCHANGE('s/\s+//', -1, COL1) || PRXCHANGE('s/\s+//', -1, COL2) || PRXCHANGE('s/\s+//', -1, COL3)
COL1, COL2, and COL3 are characters.
I check for whitespace with:
PRXCHANGE('s/\s/X/', -1, CONCATENATED_COLUMN)
1 and 3 produce equivalent columns but 2 does not. All three produce trailing whitespace. I would like to check this column for characters like:
CASE WHEN PRXMATCH('/^0+$/', CONCATENATED_COLUMN) THEN 1 ELSE 0 END AS FOO
But can't because of the whitespace that still appears. Must I always account for whitespace whenever I do any operation like this? Something like this:
CASE WHEN PRXMATCH('/^0+(?:\s*)$/', CONCATENATED_COLUMN) THEN 1 ELSE 0 END AS FOO
Or:
JOIN ON PRXCHANGE('s/\s+//', -1, TABLE_A.ID) = PRXCHANGE('s/\s+//', -1, TABLE_B.ID)
Because I can never be sure CATX or TRIM did their job?
Replace TRIM with STRIP and what does that do?
Same result as TRIM.
Hi: I don't understand why you are even using CATX. That function assumes you want to introduce some character between every argument. Personally, if you want there NOT to be ANY whitespace or blank between the arguments, I would use CATT instead of CATX.
Cynthia
CATT eliminates only trailing whitespace according to:
http://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=n16y1vyi397p84n19dpvzw0pemsa.htm&...
CATX eliminates trailing and leading whitespace according to:
http://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=n0p7wxtk0hvn83n1pveisbcp2ae9.htm&...
That was my rationale for choosing CATX with a blank delimiter.
My tentative solution to this is:
PRXCHANGE('s/\s+//', -1, (COL1 || COL2 || COL3))
You told CATX() to insert a space between the values. If you just place two quotes next to each other then you are telling SAS you want a blank string of length one. You can generate an empty string using the TRIMN() function.
Don't use a function, catx(), designed to insert something between the values when you don't actually want something inserted. Use CATT() or CATS().
Also what does \s mean in regular expressions? Is it only spaces? or does it include other "white space"?
A 0-length character does not exist in SAS? '' actually means ' '?
According to the PERL regular expression manual:
\s
means the five characters[ \f\n\r\t]
https://perldoc.perl.org/perlre.html
For normal operations what would you do with a zero length character string? If you assign it to an actual variable it will result in at least one space anyway. SAS stores character variables as fixed length and fills (pads) with spaces. It ignores trailing spaces when comparing strings.
If you want to eliminate trailing/leading characters other than spaces then you will need to use regex.
Or convert them to spaces and then use strip().
data _null_;
length var1 var2 $10 ;
var1='XYZ';
var2=cat(' ','09'x,'XYZ','09'x,' ','A0'x);
put (2*var1 2*var2) (=$quote. / =$hex./);
if var1 ne var2 then put 'NOT EQUAL';
if var1 = strip(translate(var2,' ','090A0D00A0'x)) then put 'EQUAL';
run;
The first 6 look like normal control characters. Not sure what the others are used for.
The set of characters that are deemed whitespace are those that Unicode calls "Pattern White Space", namely:
- U+0009 CHARACTER TABULATION
- U+000A LINE FEED
- U+000B LINE TABULATION
- U+000C FORM FEED
- U+000D CARRIAGE RETURN
- U+0020 SPACE
- U+0085 NEXT LINE
- U+200E LEFT-TO-RIGHT MARK
- U+200F RIGHT-TO-LEFT MARK
- U+2028 LINE SEPARATOR
- U+2029 PARAGRAPH SEPARATOR
It never hurts to provide a concrete example involving actual values and what the result is supposed to be at the end.
Showing what does not work does not really tell us what you want to accomplish.
@tomcmacdonald wrote:
From my understanding these three statements should be equivalent
CATX('', COL1, COL2, COL3)
TRIM(COL1) || TRIM(COL2) || TRIM(COL3)
PRXCHANGE('s/\s+//', -1, COL1) || PRXCHANGE('s/\s+//', -1, COL2) || PRXCHANGE('s/\s+//', -1, COL3)
What makes you think they should be equivalent?
For one thing you can get different default lengths.
From the documentation in CATX:
In a DATA step, if the CATX function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes. If the concatenation operator (||) returns a value to a variable that has not previously been assigned a length, then that variable is given a length that is the sum of the lengths of the values that are being concatenated.
Throw Prxchange in with || and I am not going to guess what the overall default length for the result would be.
Strip or Trim(left(var)) would be much closer to cats.
Trim removes trailing blanks and returns one blank if the string is missing. So not the equivalent of using Cats or catx in either form.
data example; a='a '; b=' '; c=' c'; abc = trim(a)||trim(b)||trim(c); hij = strip(a)||strip(b)||strip(c); xyz = cats(a,b,c); pdq = catx('',a,b,c); run;
Well, if you are worried about leading blanks, then use CATS...or, just do what we used to do before the CAT family of functions -- the original way is shown below -- using TRIM(LEFT(varname)) with the || operator or just using LEFT with CATT:
Cynthia
Any two of the three expressions are not equivalent to each other.
The CATX function is equivalent to strip(COL1)||" "||strip(COL2)||" "||strip(COL3), provided that a single blank character was defined to be the delimiter in the original example.
There are no functions equivalent to the second expression, although the closest one whould be CATT function, which is actually equivalent to trimn(COL1)||trimn(COL2)||trimn(COL3).
The third expression is equivalent to compress(COL1)||compress(COL2)||compress(COL3), resulting that all the space characters are removed, not just leading and trailing blanks.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.