BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Anuz
Quartz | Level 8

Hi, 

I have a SAS dataset that has a character column . format $255. 

It has values just like other character columns. 

When I create an excel extract using ods excel, for this particular column it shows the values as for example . This it does for all rows but only for this column. 

I tried compress, trim and strip on this column thinking it could be reacting to spaces maybe. Didnt help. 

 

 Has anyone encountered this issue? What and how can I  check if there are any special characters in that column in the SAS dataset that is causing this issue.

 

 

COMMERCIAL ????????????????????????????????????????????????????????????????????????????????????????????????????

 

The column itself has only the value COMMERCIAL. 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Try explicitly removing the 00 and see if that solves the issue.

 

If you export this to excel I assume you'll see the ? on the name column but not on the name_clean column.

 

data demo;
set sashelp.class (obs=3);
*add invisible tab to field;
name_original = put(name, $hex.);
if _n_ = 2 then name=catt(name, '00'x);
name_hex = put(name, $hex.);
name_clean = compress(name, '00'x , 'so');
name_clean_hex = put(name_clean, $hex.);

keep name:;
run;

proc print data=demo;
run;

View solution in original post

17 REPLIES 17
Reeza
Super User
Try a test, for the column recreate the data set separately, replace the value in that cell with a newly typed value of COMMERCIAL and see what happens.

For example, if the issue was with SASHELP.HEART and the field was status, what happens if you do the following:

data heart;
set sashelp.heart;
variableIssue = "COMMERCIAL";
run;
ods excel file .......;
....

proc report data=heart;
run;

ods excel close;

If the issue doesn't remain, then try the solution below:

variable = compress(variable, , 'so');


Anuz
Quartz | Level 8

@Reeza - Tried that. If I create a new column and set the value to COMMERICAL, it outputs fine into the excel. 

There is something that is hidden in this column that I unable to find. 

 

Reeza
Super User
Correct, which is why I suggested the COMPRESS with the S option which should remove any invisible characters. You could also manually truncate the value to just COMMERCIAL.

variable = substr(variable, 1, 11); and the remainder would be truncated which should fix the issue.

Reeza
Super User
If you really want to see what's there you can format the variable as a hex character and output it then look up the code online to see what the symbol is to remove it or understand what's happening.

data demo;
set sashelp.class (obs=3);
*add invisible tab to field;
name_original = put(name, $hex.);
if _n_ = 2 then name=catt(name, '09'x);
name_hex = put(name, $hex.);
run;

Title 'Note second record differs - has 09 (TAB) instead of 20 (Space)';
proc print data=demo;
run;
Anuz
Quartz | Level 8

@Reeza Tried that. And I basically i see the below as common in all the columns at the end(the new hex column)

 

2020202020202020202020202020202020202020202020202020202020202020202020202020202000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020

 

so that is space (20) like you said. Why is that the strip didnt remove it. what should i use to remove it 

Reeza
Super User
SAS adds trailing spaces always so this is expected. I'd add in a hardcoded value for COMMERCIAL and compare the output from that to one from your field. Look for where they start to differ and identify the different character there. Spaces shouldn't be the issue in Excel.

FYI - are you on SAS 9.4M3+? Before that ODS EXCEL was pre-release and buggy.
Anuz
Quartz | Level 8

@Reeza i am on SAS 9.4M3.

 

I checked like you said 

 

so while the hex value for Commerical shows - 434F4D4D45524349414C

The original column hex values shows as below. So after 4C at the end it has the series of 202020..... how do I remove the trailing 202020  434F4D4D45524349414C2020202020202020202020202020202020202020202020202020202020202020202020202020202000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020

Reeza
Super User

Try explicitly removing the 00 and see if that solves the issue.

 

If you export this to excel I assume you'll see the ? on the name column but not on the name_clean column.

 

data demo;
set sashelp.class (obs=3);
*add invisible tab to field;
name_original = put(name, $hex.);
if _n_ = 2 then name=catt(name, '00'x);
name_hex = put(name, $hex.);
name_clean = compress(name, '00'x , 'so');
name_clean_hex = put(name_clean, $hex.);

keep name:;
run;

proc print data=demo;
run;
Anuz
Quartz | Level 8

@Reeza - You are a star. That worked like a charm. Thank you. Issue resolved.

Anuz
Quartz | Level 8

@Reeza I had tried the compress with s option. 

That column has other values too . COMMERCIAL was just an example. There are other values like NOT FOR SEGMENT etc.

For any values in this particular column it adds these question marks. It has at least 100 question marks after the value. 

Other columns with the same format just display the values with no such question marks added at the end when extracted to excel. 

Reeza
Super User
Show the values as Hex then so you can isolate the character and then compress it specifically.
ballardw
Super User

The ? means that you have some character in the value that does not exist in the current font you are using for Excel. It may mean foreign language or just some form of non-printable character. The question mark in effect says" there is a character here but I do not know what it is or how to display it with current settings".

 

Many question marks means many characters.

Anuz
Quartz | Level 8

@ballardw indeed. 

 

I tried the hex format and it had a series of 20 - spaces at the end 

 

as below 

 

2020202020202020202020202020202020202020202020202020202020202020202020202020202000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020

Reeza
Super User
Actually you have a ton of 00 in there, not as 20 so those are not spaces but a NULL character. It's padded with spaces as well though.
Can you show a full step of how you used COMPRESS and how it didn't work?

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 1540 views
  • 4 likes
  • 4 in conversation