attjooo
Calcite | Level 5

The character variable CHARVAR has values mostly like:

0178 Adam

0245 Barbara

But there are some values with some leading blanks like:

^^0791 Cathy

^^0437 David

I want to get rid of the blanks.

STRIP(CHARVAR) and LEFT(CHARVAR) don't work.

Is there any special problem with these functions, when the first character after the blanks is a zero?

The data are imported from Excel. I am using SAS Enterprise Guide 6.1.

Patrick
Opal | Level 21

No, there is no such problem. Are you sure you're dealing with blanks or could it be that these are other non-printable white-space characters?

If so, then the most efficient way I can think of to get rid of any leading white-space characters would be a regular expression, as done in the code below for "want2".

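data sample;
  length have $20;
  /* a leading tab ('09'x) instead of a real blank */
  have='09'x||'0791 Cathy';
  /* STRIP removes blanks only, so the tab stays */
  want1=strip(have);
  /* the regular expression removes any leading white-space */
  want2=prxchange('s/^\s*//o',1,have);
  /* write the hex codes to the log for comparison */
  put have= $hex.;
  put want1= $hex.;
  put want2= $hex.;
run;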

Message was edited by: Patrick Matter fixed the hex format

attjooo
Calcite | Level 5

Thanks for your answer Patrick.

I have some problems interpreting 's/^\s*//o'. Can you elaborate on that, please.

And shouldn't the "1" in prxchange('s/^\s*//o',1,have) be "-1" if the number of non-printable characters is unknown?

I have no idea of what the possible non-printable characters could be. I used "^" above to indicate blanks.

Patrick
Opal | Level 21

Use "put <your variable>= $hex.;" in your code. This will show you in the log the hex codes so you can find out what these white-space characters really are.

As for the RegEx ^\s*

^     at the beginning of the string

\s     any white-space character

*     zero, one or many

And because this RegEx can match only once at the beginning of the string, a '1' should be sufficient.
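
To illustrate the point about the third argument, here is a minimal sketch. It assumes a made-up sample value with two leading tab characters ('0909'x); the variable names want_once and want_all are hypothetical. Because the pattern is anchored with ^, it can only match at the start of the string, so a count of 1 and a count of -1 (replace all matches) should give the same result.

```sas
data _null_;
  length have $20;
  /* hypothetical sample: two tabs in front of the real value */
  have = '0909'x || '0791 Cathy';
  /* replace the first match only */
  want_once = prxchange('s/^\s*//o', 1, have);
  /* replace every match */
  want_all  = prxchange('s/^\s*//o', -1, have);
  /* report in the log whether the two results agree */
  if want_once = want_all then put 'Same result with 1 and -1: ' want_once=;
  else put 'Results differ: ' want_once= want_all=;
run;
```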

Message was edited by: Patrick Matter fixed the hex format

attjooo
Calcite | Level 5

You wrote: "Use "put <your variable>= hex.;" in your code.".

This gives:

^^0791 Cathy .

^^0437 David .

in the log, and here my ^^ stands for blanks.

This should indicate that the white-space characters really are blanks. It is very strange then, that the strip function doesn't work.

Where the dots come from I don't know.

data_null__
Jade | Level 19

The DOT comes from you leaving off the period after the format HEX in your PUT statement.  It should be $HEX. and you may want to include W as in $HEX8.;

attjooo wrote:

Where the dots come from I don't know.

Tom
Super User

You should have values like this if you really used the $HEX. format.  Only the '2020' are spaces. The others are other binary codes that might look like spaces when printed.

data _null_;
  x='  0791 Cathy';
  put x= $hex. ;
  x='0000'x || '0791 Cathy';
  put x= $hex.;
  x='A020'x || '0791 Cathy';
  put x= $hex.;
  x='0909'x || '0791 Cathy';
  put x= $hex.;
run;

x=202030373931204361746879

x=000030373931204361746879

x=A02030373931204361746879

x=090930373931204361746879

attjooo
Calcite | Level 5

My data above was fake data.

Using $hex., the problematic variable values all start with

A0203036

and then 34 more digits.

How should I interpret this?

Tom
Super User

'A0'x is a character that some products (particularly Microsoft's) use as a "non-breaking" space.  Convert these characters to real spaces and then STRIP() or LEFT() should work for you.

CHARVAR = left(translate(CHARVAR,' ','A0'x));
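
As a quick check of this fix, the following sketch applies the same statement to a made-up value built to start with the A020 prefix reported above, and writes the hex codes to the log before and after. The sample value and the $hex40. width are assumptions chosen for a 20-character variable.

```sas
data _null_;
  length charvar $20;
  /* hypothetical value: non-breaking space + blank + real text */
  charvar = 'A020'x || '0791 Cathy';
  put 'before: ' charvar= $hex40.;
  /* turn every 'A0'x into a real blank, then left-align */
  charvar = left(translate(charvar, ' ', 'A0'x));
  put 'after:  ' charvar= $hex40.;
run;
```

This assumes a single-byte session encoding, where the non-breaking space is the single byte 'A0'x, which is consistent with the A020 prefix seen in the log. In a UTF-8 session the same character is stored as the two bytes 'C2A0'x, so a one-byte TRANSLATE would not be the right tool there.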

attjooo
Calcite | Level 5

Thank you Tom. My problem is now solved.

Patrick
Opal | Level 21

Then please mark the most suitable answer as "correct" and answers which were helpful as "helpful".

attjooo
Calcite | Level 5

I was looking for a button marked "Correct" to press, but I couldn't find any.

Astounding
PROC Star

If the legitimate characters always begin with a "0", then you shouldn't have to figure out what the extra character is.  You could try:

length first $ 1;

first = charvar;

if first ne '0' then charvar = compress(charvar, first);

If the first legitimate character might be a nonzero, the problem gets more complex but COMPRESS can probably handle it.  There is a third parameter for COMPRESS, for example, that would let you keep all letters and digits and discard everything else.
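
A minimal sketch of that third-parameter idea, assuming the legitimate values contain only letters, digits and ordinary blanks (the sample value and the variable name cleaned are made up for illustration). The modifier 'k' keeps the listed characters instead of removing them, 'a' adds alphabetic characters to the list and 'd' adds digits; the blank in the second argument keeps the space between the number and the name.

```sas
data _null_;
  length charvar $20;
  /* hypothetical value: non-breaking space + blank + real text */
  charvar = 'A020'x || '0791 Cathy';
  /* keep letters, digits and blanks; drop everything else, then left-align */
  cleaned = left(compress(charvar, ' ', 'kad'));
  put cleaned= $hex40.;
run;
```

Note that COMPRESS works on the whole string, not just the leading characters, so any other character outside the kept set (punctuation, for example) would be removed as well.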
