Solved: Re: Remove blank at then end

hhchenfx · Posted 11-19-2017 09:40 PM

Hi Everyone,

I try both Compress and strip and still fail to remove a blank at the end of a string.

That's weird since compress should remove them all.

The 2nd Timothy as a space at the end.

Can anyone have a look at my problem?

Thank you.

HC

data have;
input name $20.;
datalines;
Od, Timothy
Od, Timothy   
;run;

data want; set have;
new=compress(name);
new1=strip(name);
run;

Tom · Posted 11-20-2017 01:42 AM

@hhchenfx wrote:

The problem will be when I merge 1 file have the name of "od, timothy " and other has "od, timothy".

I try to remove all space but it doesn't work and therefore the merge failed.

Look like in this above file, the space is a "character", not something SAS auto filled blank.

If you go to SAS file and highlight that cell, you will see the "blue" cover 01 space after "timothy", which show the space.

In other row, if you highlight the celll, the "blue" cover only character.

Thank you

HC

Did you actually try to merge those two values? SAS knows how to deal with spaces appended to the end of the variables. So in SAS if you compare 'TIM' to 'TIM ' they will match. If your values are NOT matching then you do NOT have a space after the 'y' in one of the variables. Instead you have some other character code that does not display any "ink" on the page.

Look at the hex codes for the character(s) that you are having trouble with. You can use the $HEX format to display the codes as hexadecimal values. Some common codes that you could have might be a tab '09'x, carriage return '0d'x, what Microsoft calls a non-breaking space 'A0'x, a null character '00'x.

Once you find out what character it is you can use the COMPRESS() function to remove it. Or use the TRANSLATE() or TRANWRD() functions to replace it with something else.

name = compress(name,'00090A0DA0'x);

View solution in original post

Astounding · Posted 11-19-2017 09:50 PM

The functions really work. However, they require defining a new variable, with a length for the new variable. When COMPRESS ceates a new variable, it assigns the variable a length of $200 (going by memory here). When STRIP defines a new variable, it assigns the variable the same length as the old variable (going by what makes sense here). So by storing the value returned by the function(s), the software has to add characters back on.

Each variable has only one length. You could code something like this:

data want;

length name $ 19;

set have;

run;

That reduces the length of NAME from $20 to $19. But that affects all observations, and may truncate a nonblank from the end of NAME.

hhchenfx · Posted 11-19-2017 10:00 PM

I have long list of name with variable length so I can't just "cut" the length.

Thus the blank at the end cause me trouble

If I understand what you meant, this code below still not work.


data have;
input name $40.;
datalines;
Od, Timothy
Od, Timothy  

;run;

data want; set have;
length new $ 200;
new=compress(name);
run;

Astounding · Posted 11-19-2017 10:04 PM

Every variable is defined with one, and only one, length. If you run a PROC CONTENTS, you can see that length. You might be able to change the length, but there is still only one length allowed for a variable. The length does not change from one observation to the next.

What do you want to do with the variable, where the trailing blank gets in the way?

hhchenfx · Posted 11-19-2017 10:15 PM

What I meant is the list of name could be

steve, Isber

Kelly, Chen

Jobby, Timothy

So the length of the string is not the same from row to row.

Therefore, the length of name variable is set at $200.

So if the space appear on "steve, Isber ", changing length down to $199 will not work.

This space at the end is so random, I don't know where are they so basically, cutting the length might cut the character.

Astounding · Posted 11-19-2017 10:20 PM

If the length is $200, that is the number of characters used on EVERY observation. You do not have a single blank at the end of "steve, Isber". Instead, you have 188 blanks. It is not random, it is that EVERY observation is using 200 characters.

If you spell out some details about what you are trying to accomplish, there may be a way for the program to use 12 characters instead of 13 or 200 characters. You you will need to show what you are trying to accomplish. For my part, I won't be able to add more for about another 8 hours.

hhchenfx · Posted 11-19-2017 10:34 PM

So I make the length of exact 19 as the longest string and I have no way to remove the space after the Timothy in this below data.

in the cut1 with format cut1 $18., stevenson, Isberger loss the last "r"

I try it all.


data have;
input name $19.;
datalines;
Od, Timothy  
stevenson, Isberger
Swan, Ng  

;run;

proc contents data=have; run;

data want; set have;
new="@"||compress(name," " )||"@";
new1="@"||strip(name)||"@";
new2="@"||trim(name)||"@";

format cut1 $18.;    
cut1=name;
run;

Reeza · Posted 11-19-2017 10:42 PM

What issue is this causing?

SAS creates variables with a fixed length and pads the remaining with spaces. There is no way to change that.

So you need to work around it, and how you do that depends on the issue you're facing.

hhchenfx · Posted 11-19-2017 11:33 PM

The problem will be when I merge 1 file have the name of "od, timothy " and other has "od, timothy".

I try to remove all space but it doesn't work and therefore the merge failed.

Look like in this above file, the space is a "character", not something SAS auto filled blank.

If you go to SAS file and highlight that cell, you will see the "blue" cover 01 space after "timothy", which show the space.

In other row, if you highlight the celll, the "blue" cover only character.

Thank you

HC

Tom · Posted 11-20-2017 01:42 AM

@hhchenfx wrote:

The problem will be when I merge 1 file have the name of "od, timothy " and other has "od, timothy".

I try to remove all space but it doesn't work and therefore the merge failed.

Look like in this above file, the space is a "character", not something SAS auto filled blank.

If you go to SAS file and highlight that cell, you will see the "blue" cover 01 space after "timothy", which show the space.

In other row, if you highlight the celll, the "blue" cover only character.

Thank you

HC

Did you actually try to merge those two values? SAS knows how to deal with spaces appended to the end of the variables. So in SAS if you compare 'TIM' to 'TIM ' they will match. If your values are NOT matching then you do NOT have a space after the 'y' in one of the variables. Instead you have some other character code that does not display any "ink" on the page.

Look at the hex codes for the character(s) that you are having trouble with. You can use the $HEX format to display the codes as hexadecimal values. Some common codes that you could have might be a tab '09'x, carriage return '0d'x, what Microsoft calls a non-breaking space 'A0'x, a null character '00'x.

Once you find out what character it is you can use the COMPRESS() function to remove it. Or use the TRANSLATE() or TRANWRD() functions to replace it with something else.

name = compress(name,'00090A0DA0'x);

Kurt_Bremser · Posted 11-20-2017 03:02 AM

If you have a character variable with questionable content, do this:

- define a new variable with double the length

- use newvar = put(oldvar,$hexN.); where N = length of new variable.

Now you can inspect the hex codes for "funny" characters which you then can filter using @Tom's example.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Ksharp · Posted 11-20-2017 07:55 AM

Maybe you have some special blank at the end of variable.

want=prxchange('s/\s+$//',-1,name);

hhchenfx · Posted 11-21-2017 05:21 PM

Thank you for your help.

I try to upload the SAS datafile here so you can see but there is error with the upload.

I will comeback to this issue later but I really appreciate your help.

HC

Classroom Training Available!