BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BLE
Obsidian | Level 7 BLE
Obsidian | Level 7

i need to convert a character var that contains negative numbers into a numeric var.

The following is crashing because SAS is reading the - in front of the negative numbers found in Old_Char_Var as a character that can not be turned into a number.

New_Num_Var = input(Old_Char_Var,12.0);

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

You're welcome.

 

Yes, all printable blanks from CharVar1 will be found in CharVar2, unless CharVar2 was previously defined with an insufficient length to accommodate all characters contained in the function result. By default, CharVar2 is given the length of CharVar1 (which will always be sufficient, because the length of the function result is less than or equal to that length).

 

In the expression compress(CharVar2) all blanks will be removed. The assignment to variable CharVar3, however, may "create" trailing blanks, depending on the length of CharVar3. For example, if CharVar2 is a character variable of length 5 and it contains the string 'a b' (i.e. one blank between 'a' and 'b' and the two unavoidable trailing blanks because of its length 5) and CharVar3 has not been assigned a length before, CharVar3 will be created as a character variable of the same length as CharVar2, i.e. length 5, containing the string 'ab' and three trailing blanks. Nevertheless, the equation CharVar3='ab' holds, because trailing blanks are ignored in string comparisons.

View solution in original post

14 REPLIES 14
Reeza
Super User
data test;
x="-12.4";
y=input(x, best12.);
put x= y=;
run;

Use a BEST12. format or something else that's appropriate, in this case 12.0 doesn't appear to be. 

BLE
Obsidian | Level 7 BLE
Obsidian | Level 7

Nope, I changed from 12.0 to best. and the - still crashes the run.

Reeza
Super User

1. Did the code I post run correctly?

2. Post your code and log, and a sample of the data you're trying to convert.

BLE
Obsidian | Level 7 BLE
Obsidian | Level 7
Revised code that doesn't work:

New_Num_Var = input(Old_Char_Var,best.);



Log:

NOTE: Invalid argument to function INPUT at line # column #.

Old_Char_Var=-


BLE
Obsidian | Level 7 BLE
Obsidian | Level 7

sample data in Old_Char_Var:

 

-5

-2

0

1

10

255

 

Reeza
Super User

So the old char value is only a negative sign (-). What would that be translated to?

 

You could use 

 

input(number, ?? best12.)

 

If you say something doesn't work unless you explain it and post details that means nothing to anyone else. 

BLE
Obsidian | Level 7 BLE
Obsidian | Level 7
The old char value has a number followed by the negative sign, so it is not just a negative sign.

Ex: -5

Ex: -2



The log shows that when SAS hits the value in old_char_var it can't go beyond the negative sign so that is all you see in the log even though there is a number after the negative sign.



Thanks, the ?? made the difference. Would you mind explaining exactly what the ?? says to SAS in this function statement?

New_Num_Var = input(Old_Char_Var, ?? best.);


ballardw
Super User

Show some values, the log and any error messages.

 

Also see if your data might have a SPACE between the - and the digits.

There is a chance the character is not minus sign but an emdash which is a different kettle of fish but looks similar to the -. You may need to use something like

 

if anypunct(first(old_char_var)) >0 and Rank( first(old_char_var) ) ne Rank('-') then put 'First character is not a -';

 

Then you might need to use translate.

One fun thing, apparently there are some characters that this forum "helps" by translating in some conditions. So what we see may not be what you post.

FreelanceReinh
Jade | Level 19

The "??" is an informat modifier which suppresses error messages (and the like) when invalid data are being read. However, this will leave you with missing values in variable New_Num_Var where it possibly should contain (e.g. negative) values!

 

So, to avoid losing information, you should investigate exactly what kind of "invalid data" Old_Char_Var contains in the observations where INPUT (without the "??") fails.

 

You could post the output of the following PROC PRINT step to help us to give you more specific instructions to resolve the issue.

proc print data=your_dataset(obs=5); /* please replace "your_dataset" by the name of your dataset */
where new_num_var=. & old_char_var ne ' ';
format old_char_var $hex16.;
var old_char_var;
run;
BLE
Obsidian | Level 7 BLE
Obsidian | Level 7
Bummer - I thought the ?? was identifying the negative sign. I see now it is just ignoring problematic records.

Ugh.

Sorry about stating it didn't work without sharing additional info.

I am working with confidential data so I need to create a mock-up in order to share the problem, though I see that negative values do get through the statement and don't cause the input statement to fail so I think you are correct that some of the characters that look like a negative sign must be something else.

In addition to the field with the - I also have a field with hidden characters that is causing me some aggravation.

Thanks for all of your help. I have much work to do before my question will be tight enough to further clarify.




BLE
Obsidian | Level 7 BLE
Obsidian | Level 7

I found using the following compress statement gets rid of the hidden characters in a way that allows me to convert my data into a numeric field.

CharVar2 = compress(CharVar1,,’kw’);

NumVar = input(CharVar2,best.);

 

I know ‘kw’ stands for k=keep, w=writable and it results in a new variable where all the writable characters are kept and all the non writable characters are deleted, so the hidden characters are eliminated and my conversion problem is solved.

 

But I would like to better understand this compress statement so that I am in full command of its use.

Syntax is: Compress(source,characters,modifiers);

For source I am using my original character variable.

For modifier, I am using  ‘kw’ which means keep only printable characters.

How does the fact that I am not explicitly stating the second parameter (characters) fit with the results?

Am I saying for my chosen character var (source = CharVar1) replace all blanks (characters = blank) with writable characters (modifiers = ‘kw’)? What about all the non-blanks that have unprintable characters, how are they being replaced if the second parameter says only replace blanks?

 

Also, using compress without characters or modifiers results in the elimination of all blanks (leading, trailing and any in the middle of the text string). 

Is that because the default for the third argument (modifiers) is to remove and the default for the second argument (characters) is blanks?

 

MOST IMPORTANTLY:

For values of CharVar1 that have blanks but no hidden characters, does CharVar2 = CharVar1 or does CharVar2 = Compress(CharVar1)?

FreelanceReinh
Jade | Level 19

@BLE wrote:

 

Am I saying for my chosen character var (source = CharVar1) replace all blanks (characters = blank) with writable characters (modifiers = ‘kw’)?


No, COMPRESS does not replace characters, it only removes characters (if any).



What about all the non-blanks that have unprintable characters, how are they being replaced if the second parameter says only replace blanks?


Your expression compress(CharVar1,,’kw’) uses as second argument what the documentation calls a "null argument." The documentation continues: "A null argument is treated as a string that has a length of zero."


The 'w' modifier in the third argument "adds printable characters to the [here: empty] list of characters" in the second argument. So, because of the 'k' modifier, all printable characters contained in CharVar1 are kept in the function result.


The result would be the same if you supplied a blank (' ') as the second parameter, because the blank is a printable character.

If CharVar1 contains non-printable characters and the second parameter is ' ' (a blank) and the third argument is not used, blanks in CharVar1 (if any) will be removed in the result, but the non-printable characters will persist.

 

In contrast, if the second argument is a null argument (like in compress(c,)), nothing will be removed, i.e., the result will be equal to the first argument.



Also, using compress without characters or modifiers results in the elimination of all blanks (leading, trailing and any in the middle of the text string). 

Is that because the default for the third argument (modifiers) is to remove and the default for the second argument (characters) is blanks? 


More or less. This is just how COMPRESS works with only one argument (see the table in section "The Basics" in the documentation linked above). This has also to do with backward compatibility, because at least up to SAS 8 the third argument (modifiers) did not exist and not using the second argument meant "remove blanks."



MOST IMPORTANTLY:

For values of CharVar1 that have blanks but no hidden characters, does CharVar2 = CharVar1 or does CharVar2 = Compress(CharVar1)?


The first equality is true (assuming "hidden" means "non-printable"). The second is false in general, because the blanks removed by the COMPRESS function to the right of the equals sign would violate the equality, unless they were all trailing blanks. Trailing blanks are ignored in string comparisons.

BLE
Obsidian | Level 7 BLE
Obsidian | Level 7
Thank you, Thank you, Thank you!!!!!!!!



My last clarifying question (I hope): Will CharVar2 have all of the same writable blanks that were found in CharVar1 meaning I will need to do CharVar3=compress(CharVar2) in order to get rid of all writable blanks if I need to work with this data as a character var but don't want to see any leading, trailing or in-between the text string blanks?


FreelanceReinh
Jade | Level 19

You're welcome.

 

Yes, all printable blanks from CharVar1 will be found in CharVar2, unless CharVar2 was previously defined with an insufficient length to accommodate all characters contained in the function result. By default, CharVar2 is given the length of CharVar1 (which will always be sufficient, because the length of the function result is less than or equal to that length).

 

In the expression compress(CharVar2) all blanks will be removed. The assignment to variable CharVar3, however, may "create" trailing blanks, depending on the length of CharVar3. For example, if CharVar2 is a character variable of length 5 and it contains the string 'a b' (i.e. one blank between 'a' and 'b' and the two unavoidable trailing blanks because of its length 5) and CharVar3 has not been assigned a length before, CharVar3 will be created as a character variable of the same length as CharVar2, i.e. length 5, containing the string 'ab' and three trailing blanks. Nevertheless, the equation CharVar3='ab' holds, because trailing blanks are ignored in string comparisons.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 10652 views
  • 2 likes
  • 4 in conversation