BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
stancemcgraw
Obsidian | Level 7

Hi,

 

  I imported an excel spreadsheet and one of the variables seems to have some extra characters of some sort. It is a character variable called "Deviation" and the responses are "YES", "NO", and "N/A", yet it thinks I have two "NO" groups and two "YES" groups when I do a frequency table. 

data have;

deviation        frequency

N/A                   639

YES                 20

YES                 1

NO                  524

NO                  32

 

data want;

Deviation        Frequency

N/A                   639

YES                 21

NO                  556

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You will need to check the meaning of those COMPRESS() function options in the documentation.

https://www.google.com/search?q=%40sas.com+compress()+function

 

The 2F on your first line is just the slash character, /

But the A0 in your last row is the dreaded non-breaking space (or at least that is what Microsoft has decided to use that particular non-printing character to mean and lot of other software has adopted it.).

 

See what happens if you just remove the non-breaking space.  While you are at in remove any leading spaces.

 

Here is code to remove the non-breaking spaces, left align (aka remove leading spaces) and also generate hex codes for the first 10 characters.  Then run PROC FREQ and show both the text and the hexcode version on the same line so it is easier to see what letter the hex codes represent.

data temp;
  set have;
  all_managers = left(compress(all_managers,'A0'x));
  hex=put(all_managers,$hex20.);
run;

proc freq data=temp;
  tables all_managers*hex / list;
run;

 

View solution in original post

10 REPLIES 10
stancemcgraw
Obsidian | Level 7
I did that and nothing happened...
Tom
Super User Tom
Super User

It helps to look at the output using the regular old plan text LISTING destination instead of some graphically complex ODS destination.

You can try using the $QUOTE. format to help see the leading spaces. 

You can use the $HEX. format to see what characters are actually in the variable.

 

Most likely you have leading spaces. ('20'X)  You might have trailing linefeeds ('0A'x).  It is possible you have some non-breaking spaces ('A0'x) on either side.

stancemcgraw
Obsidian | Level 7
I'm not sure what the Quote and HEX formats are....
stancemcgraw
Obsidian | Level 7

I used the following code and got this

data temp;

set have;

all_managers = compress(all_managers, , 'ad');

run;

proc freq data=temp;

tables all_managers;

format all_managers $hex30.;

run;

 

Deviation      FREQUENCY

2F202020       639

A0202020         33

 

Not sure what this means....

Reeza
Super User

Try removing the formats first and see if it solved the issue.
I think you need to add the K option if you're using the ad option which would remove all numeric and alphabetic characters. I suggested the s modifier to remove white space.

The HEX format, shows you the values as hex code. So 2F, A0, 20 are all different white space characters.
2F - /
20 - space
A0 is a white space that's likely causing part of your issue.

data temp;
set have;
all_managers = compress(all_managers, , 's');
run;

proc freq data=temp;
tables all_managers;
run;



stancemcgraw
Obsidian | Level 7
I tried what you posted (Reeza) and nothing changed...
Reeza
Super User
Then you need to be able to post an example of your data so we can replicate the issue.
Tom
Super User Tom
Super User

You will need to check the meaning of those COMPRESS() function options in the documentation.

https://www.google.com/search?q=%40sas.com+compress()+function

 

The 2F on your first line is just the slash character, /

But the A0 in your last row is the dreaded non-breaking space (or at least that is what Microsoft has decided to use that particular non-printing character to mean and lot of other software has adopted it.).

 

See what happens if you just remove the non-breaking space.  While you are at in remove any leading spaces.

 

Here is code to remove the non-breaking spaces, left align (aka remove leading spaces) and also generate hex codes for the first 10 characters.  Then run PROC FREQ and show both the text and the hexcode version on the same line so it is easier to see what letter the hex codes represent.

data temp;
  set have;
  all_managers = left(compress(all_managers,'A0'x));
  hex=put(all_managers,$hex20.);
run;

proc freq data=temp;
  tables all_managers*hex / list;
run;

 

stancemcgraw
Obsidian | Level 7
Thank you! That worked!!!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 9265 views
  • 6 likes
  • 3 in conversation