The character variabel CHARVAR have values mostly like:
0178 Adam
0245 Barbara
But there are some values with some leading blanks like:
^^0791 Cathy
^^0437 David
I want to get rid of the blanks.
STRIP(CHARVAR) and LEFT(CHARVAR) doesn´t work.
Is there any special problem with these functions, when the first character after the blanks is a zero?
The data are imported from Excel. I am using SAS Enterprise Guide.6.1.
No, there is no such problem. Are you sure you're dealing with blanks or could it be that these are other non-printable white-space characters?
If so then the most efficient way I can think of to get rid of any leading white-space characters would be using a regular expression as done in below code for "want2".
data sample;
length have $20;
have='09'x||'0791 Cathy';
want1=strip(have);
want2=prxchange('s/^\s*//o',1,have);
put have= $hex.;
put want1= $hex.;
put want2= $hex.;
run;
Message was edited by: Patrick Matter fixed the hex format
Thanks for your answer Patrick.
I have some problems interpreting 's/^\s*//o'. Can you elaborate on that, please.
And shouldn't the "1" in prxchange('s/^\s*//o',1,have) be "-1" if the number of non-printable characters are unknown?
I have no idea of what the possible non-printable characters could be. I used "^" above to indicate blanks.
Use "put <your variable>= $hex.;" in your code. This will show you in the log the hex codes so you can find out what these white-space characters really are.
As for the RegEx ^\s*
^ at the beginning of the string
\s any white-space character
* zero, one or many
And because this RegEx can match only once at the beginning of the string, a '1' should be sufficient.
Message was edited by: Patrick Matter fixed the hex format
You wrote: "Use "put <your variable>= hex.;" in your code.".
This gives:
^^0791 Cathy .
^^0437 David .
in the log, and here my ^^ stands for blanks.
This should indicate that the white-space characters really are blanks. It is very strange then, that the strip function doesn't work.
Where the dots come from I don't know.
The DOT come from you leaving off the period after the format HEX in your PUT statement. It should be $HEX. and you may want to include W as in $HEX8.;
attjooo wrote:
Where the dots come from I don't know.
You should have values like this is you really used the $HEX. format. Only the '2020' are spaces. The others are other binary codes that might look like spaces when printed.
94 data _null_;
95 x=' 0791 Cathy';
96 put x= $hex. ;
97 x='0000'x || '0791 Cathy';
98 put x= $hex.;
99 x='A020'x || '0791 Cathy';
100 put x= $hex.;
101 x='0909'x || '0791 Cathy';
102 put x= $hex.;
103
104 run;
x=202030373931204361746879
x=000030373931204361746879
x=A02030373931204361746879
x=090930373931204361746879
My data above was fake data.
Using $hex. the problematic variable values all starts with
A0203036
and then 34 more digits.
How should I interprete this?
¨'
'A0'x is character that some (particularly Microsoft products) use as a "non-breaking" space. Convert them to a real space and then STRIP() or LEFT() should work for you.
CHARVAR = left(translate(CHARVAR,' ','A0'x));
Thank you Tom. My problem is now solved.
Then please mark the most suitable answer as "correct" and answers which were helpful as "helpful".
I was looking for a button marked "Correct" to press, but I couldn't find any.
If the legitimate characters always begin with a "0", then you shouldn't have to figure out what the extra character is. You could try:
length first $ 1;
first = charvar;
if first ne '0' then charvar = comrpess(charvar, first);
If the first legitimate character might be a nonzero, the problem gets more complex but COMPRESS can probably handle it. There is a third parameter for COMPRESS, for example, that would let you keep all letters and digits and discard everything else.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.