BookmarkSubscribeRSS Feed
rodrichiez
Quartz | Level 8

Hello guys,

 

I'm trying to remove blanks from a string using compress or trim functions but the result I'm getting looks like this one below, with an unrecognize characters and  I don't have idea what is happening here.

 

rodrichiez_0-1619528924585.png

This is my code:

PROC SQL;
CREATE TABLE WORK.QUERY_FOR_FILTER AS
SELECT t1.Interest_Plan,
(COMPRESS(t1.Interest_Plan)) AS COMPRESS,
(TRIM(LEFT(t1.Interest_Plan))) AS TRIM,
(COUNT(t1.Account_Nbr)) AS COUNT_of_Account_Nbr
FROM WORK.FILTER_FOR_FACT t1
GROUP BY t1.Interest_Plan,
(CALCULATED COMPRESS),
(CALCULATED TRIM);
QUIT;

8 REPLIES 8
Kurt_Bremser
Super User

Create a control variable for the source value.

Define it with double the length of interest_plan, and use the $HEX format to store the hex representation of the source value there. This way you can see if there's some crazy stuff in there after a series of blanks.

Tom
Super User Tom
Super User

TRIM() just removes the trailing spaces, which will just be added back once you store the value into a SAS character variable. SAS character variables are fixed length and padded with spaces.  So it will not have any effect.

LEFT() will remove leading spaces, so that might have an impact.

 

The COMPRESS() function will remove all of the spaces, even those in the middle of the string between the words.

 

It is hard to tell what is in your data from looking a photograph of how some viewer displays the data.  Print the data using the $HEX format to see what characters are in the data.  A space will show as the hexadecimal code 20 (which is 32 in base 10).

proc print data=QUERY_FOR_FILTER(obs=3) ;
 format _character_ $hex. ;
run;
rodrichiez
Quartz | Level 8

Hello @Tom 

 

See attached the results.But I still don't get it on how to normalize this column to perform a group by.

 

AlanC
Barite | Level 11

XLSX is not a good format to post results. Use text in the future so the hex values can be seen. Here is pseudo-code (aka I can't test it).

 

PROC SQL;
       CREATE TABLE WORK.QUERY_FOR_FILTER AS
       SELECT STRIP(t1.Interest_Plan) AS IntPlan,

                      COUNT(t1.Account_Nbr) AS Count
       FROM WORK.FILTER_FOR_FACT t1
       GROUP BY IntPlan;
QUIT;

 

 

https://github.com/savian-net
rodrichiez
Quartz | Level 8

@AlanC 

It happen the same using Strip function.

 

rodrichiez_0-1619544655300.png

 

AlanC
Barite | Level 11

Well, it looks like you are reading the data in wrong from the get-go so not really an issue with blanks. It appears that your encoding is wrong somewhere or else you are reading in a source that is not text. As mentioned, what is the data source? It may be UTF-8/16 (or whatever). 

https://github.com/savian-net
Tom
Super User Tom
Super User

I do not understand what you are actually trying to do.

But the data you posted is showing you have a lot of gibberish in your data.  For example it appears to have a lot of binary zero bytes.

 

image.png

Where did you get this data? 

Do you have it in a SAS dataset? 

Some other source?

Where did that source get the data?

AlanC
Barite | Level 11

Use the STRIP function. If the raw data is such that you can pull it into a hex editor, try that. UltraEdit, Notepad++, and (my favorite for hex issues) 010 Editor can all show you the values. Look at encoding setup as well to make sure you are reading the values correctly. 

https://github.com/savian-net

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2790 views
  • 0 likes
  • 4 in conversation