- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello guys,
I'm trying to remove blanks from a string using compress or trim functions but the result I'm getting looks like this one below, with an unrecognize characters and I don't have idea what is happening here.
This is my code:
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_FILTER AS
SELECT t1.Interest_Plan,
(COMPRESS(t1.Interest_Plan)) AS COMPRESS,
(TRIM(LEFT(t1.Interest_Plan))) AS TRIM,
(COUNT(t1.Account_Nbr)) AS COUNT_of_Account_Nbr
FROM WORK.FILTER_FOR_FACT t1
GROUP BY t1.Interest_Plan,
(CALCULATED COMPRESS),
(CALCULATED TRIM);
QUIT;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Create a control variable for the source value.
Define it with double the length of interest_plan, and use the $HEX format to store the hex representation of the source value there. This way you can see if there's some crazy stuff in there after a series of blanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
TRIM() just removes the trailing spaces, which will just be added back once you store the value into a SAS character variable. SAS character variables are fixed length and padded with spaces. So it will not have any effect.
LEFT() will remove leading spaces, so that might have an impact.
The COMPRESS() function will remove all of the spaces, even those in the middle of the string between the words.
It is hard to tell what is in your data from looking a photograph of how some viewer displays the data. Print the data using the $HEX format to see what characters are in the data. A space will show as the hexadecimal code 20 (which is 32 in base 10).
proc print data=QUERY_FOR_FILTER(obs=3) ;
format _character_ $hex. ;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @Tom
See attached the results.But I still don't get it on how to normalize this column to perform a group by.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
XLSX is not a good format to post results. Use text in the future so the hex values can be seen. Here is pseudo-code (aka I can't test it).
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_FILTER AS
SELECT STRIP(t1.Interest_Plan) AS IntPlan,
COUNT(t1.Account_Nbr) AS Count
FROM WORK.FILTER_FOR_FACT t1
GROUP BY IntPlan;
QUIT;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Well, it looks like you are reading the data in wrong from the get-go so not really an issue with blanks. It appears that your encoding is wrong somewhere or else you are reading in a source that is not text. As mentioned, what is the data source? It may be UTF-8/16 (or whatever).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I do not understand what you are actually trying to do.
But the data you posted is showing you have a lot of gibberish in your data. For example it appears to have a lot of binary zero bytes.
Where did you get this data?
Do you have it in a SAS dataset?
Some other source?
Where did that source get the data?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use the STRIP function. If the raw data is such that you can pull it into a hex editor, try that. UltraEdit, Notepad++, and (my favorite for hex issues) 010 Editor can all show you the values. Look at encoding setup as well to make sure you are reading the values correctly.