- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello:
I am trying to remove space inside strings, e.g. remove the space before "1" in the string "XXX-1234. 1". I have many values in the same format i.e. "XXX-####.#". Many of them have space after the "." and before the last digit. I would like to remove any space in the strings. Some of them can be removed successfully by using compress(). But some of them, the space cannot be removed by compress().
Not sure if the space was from some special char or something else than typing. How can I identify and remove them.
Any help?
Thanks,
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Next code tells compress function which characters to keep,
a=alpha, d=digits, p=punctuation, k=keep:
new_var = compress(var, ,'kadp');
try it and adapt to your needs.
More characters in the link posted by @Reeza
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Spaces are not necessarily blanks, so there's an option within the COMPRESS functions to handle multiple different types of spaces, it's the third example in the documentation. That adding in that modifier (third parameter) and see if it works for you.
data one;
x='1 2 3 4 5';
y=compress(x, , 's');
put y=;
run;
@sasecn wrote:
Hello:
I am trying to remove space inside strings, e.g. remove the space before "1" in the string "XXX-1234. 1". I have many values in the same format i.e. "XXX-####.#". Many of them have space after the "." and before the last digit. I would like to remove any space in the strings. Some of them can be removed successfully by using compress(). But some of them, the space cannot be removed by compress().
Not sure if the space was from some special char or something else than typing. How can I identify and remove them.
Any help?
Thanks,
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for your reply.
I tried this and that doesn't work. What confused me is that why the compress() works on some values, but not all of them. My data was imported from excel file. I am thinking if there is anything wired in the format or the space was caused by special char ...
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Next code tells compress function which characters to keep,
a=alpha, d=digits, p=punctuation, k=keep:
new_var = compress(var, ,'kadp');
try it and adapt to your needs.
More characters in the link posted by @Reeza
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, the options of keep chars needed worked in my situation. I also found another way to delete the non-breaking space from excel by using compress(var, 'A0'x). But your solution is best to keep chars needed.
Thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sasecn wrote:
Thanks for your reply.
I tried this and that doesn't work. What confused me is that why the compress() works on some values, but not all of them. My data was imported from excel file. I am thinking if there is anything wired in the format or the space was caused by special char ...
If by "imported" you mean that you use Proc Import or one of the wizards that can be a significant factor. Some of the characters involved could be the vertical spaces that Excel uses when data is entered with the Alt-Enter, or actual end of line characters if such were in data that is written to Excel from a data base.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I used the import wizard from SAS EG to import the excel file. I did check the source excel file, and find many spaces caused by typo, tab, alt+enter ... The excel file was created by other users, so I cannot exam one by one of them. The compress() with options can fix the problem, hopefully fixed all the problems.
Since you mentioned that different import may have different results, wondering if there is such a way to import the excel that can eliminate the problems (unexpected space at the beginning, between and end of strings, special chars ...) at the beginning?
Thanks,