BookmarkSubscribeRSS Feed
attjooo
Calcite | Level 5

The character variabel CHARVAR have values mostly like:

0178 Adam

0245 Barbara

But there are some values with some leading blanks like:

^^0791 Cathy

^^0437 David

I want to get rid of the blanks.

STRIP(CHARVAR) and LEFT(CHARVAR) doesn´t work.

Is there any special problem with these functions, when the first character after the blanks is a zero?

The data are imported from Excel. I am using SAS Enterprise Guide.6.1.

12 REPLIES 12
Patrick
Opal | Level 21

No, there is no such problem. Are you sure you're dealing with blanks or could it be that these are other non-printable white-space characters?

If so then the most efficient way I can think of to get rid of any leading white-space characters would be using a regular expression as done in below code for "want2".

data sample;

  length have $20;

  have='09'x||'0791 Cathy';

  want1=strip(have);

  want2=prxchange('s/^\s*//o',1,have);

  put have= $hex.;

  put want1= $hex.;

  put want2= $hex.;

run;

Message was edited by: Patrick Matter fixed the hex format

attjooo
Calcite | Level 5

Thanks for your answer Patrick.

I have some problems interpreting 's/^\s*//o'. Can you elaborate on that, please.

And shouldn't the "1" in prxchange('s/^\s*//o',1,have) be "-1" if the number of non-printable characters are unknown?

I have no idea of what the possible non-printable characters could be. I used "^" above to indicate blanks.

Patrick
Opal | Level 21

Use "put <your variable>= $hex.;" in your code. This will show you in the log the hex codes so you can find out what these white-space characters really are.

As for the RegEx ^\s*

^     at the beginning of the string

\s     any white-space character

*     zero, one or many

And because this RegEx can match only once at the beginning of the string, a '1' should be sufficient.

Message was edited by: Patrick Matter fixed the hex format

attjooo
Calcite | Level 5

You wrote: "Use "put <your variable>= hex.;" in your code.".

This gives:

^^0791 Cathy .

^^0437 David .

in the log, and here my ^^ stands for blanks.

This should indicate that the white-space characters really are blanks. It is very strange then, that the strip function doesn't work.

Where the dots come from I don't know.

data_null__
Jade | Level 19

The DOT come from you leaving off the period after the format HEX in your PUT statement.  It should be $HEX. and you may want to include W as in $HEX8.;

attjooo wrote:

Where the dots come from I don't know.

Tom
Super User Tom
Super User

You should have values like this is you really used the $HEX. format.  Only the '2020' are spaces. The others are other binary codes that might look like spaces when printed.

94   data _null_;

95    x='  0791 Cathy';

96    put x= $hex. ;

97    x='0000'x || '0791 Cathy';

98    put x= $hex.;

99    x='A020'x || '0791 Cathy';

100   put x= $hex.;

101   x='0909'x || '0791 Cathy';

102   put x= $hex.;

103

104  run;

x=202030373931204361746879

x=000030373931204361746879

x=A02030373931204361746879

x=090930373931204361746879

attjooo
Calcite | Level 5

My data above was fake data.

Using $hex. the problematic variable values all starts with

A0203036

and then 34 more digits.

How should I interprete this?

¨'

Tom
Super User Tom
Super User

'A0'x is character that some (particularly Microsoft products) use as a "non-breaking" space.  Convert them to a real space and then STRIP() or LEFT() should work for you.

CHARVAR = left(translate(CHARVAR,' ','A0'x));

attjooo
Calcite | Level 5

Thank you Tom. My problem is now solved.

Patrick
Opal | Level 21

Then please mark the most suitable answer as "correct" and answers which were helpful as "helpful".

attjooo
Calcite | Level 5

I was looking for a button marked "Correct" to press, but I couldn't find any.

Astounding
PROC Star

If the legitimate characters always begin with a "0", then you shouldn't have to figure out what the extra character is.  You could try:

length first $ 1;

first = charvar;

if first ne '0' then charvar = comrpess(charvar, first);

If the first legitimate character might be a nonzero, the problem gets more complex but COMPRESS can probably handle it.  There is a third parameter for COMPRESS, for example, that would let you keep all letters and digits and discard everything else.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 4561 views
  • 0 likes
  • 5 in conversation