BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Emma_at_SAS
Lapis Lazuli | Level 10

I have a character variable with 3 levels

1/4 lb

1/2 lb

3/4 lb

and I want to change the levels to show them with decimilas

0.25 lb

0.50 lb

0.75 lb

I tried 

if var= "1/4 lb" then new_var="0.25 lb";
if var= "1/2 lb" then new_var="0.50 lb";
if var= "3/4 lb" then new_var="0.75 lb";

but my new_var is all missing observations. I wonder if that is because I have a space between the numbers and lb. I wonder where I am making a mistake and thank you for your suggestions.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Character variables when using = are compared first character to first character, 2nd to 2nd and so on. As soon as one does not match SAS says "not equal". So if your variable starts with a space in the data then it is not equal. If any of the other characters are not exactly the same then they are not equal. If you variable does not have a space and you compare it to "1/4 lb" with a space, then it is not equal.

 

Sometimes what you see misses leading spaces because they are suppressed.

You can try code like this to see some values;

data _null_;
    set yourdataset obs=10;
    newvar=quote(var);
   put newvar=;
run;

Data _null_ does not create a data set. This just writes some information to the log for the first 10 records (the Obs=10) so you can see some examples of your data. The value of newvar will have quotes around your current Var values.

If you see leading blanks before the text such as possibly "   1/4 lb" you can modify your comparison to use

 

if strip(var)= "1/4 lb" then new_var="0.25 lb";

 

if your values do not have the space, or more spaces between 1/4 and lb then you need to make your comparison string match what is actually in the data for Var.

 

Another potential problem since you did not share an entire data step is if you have used New_var somewhere before this in the code that would make SAS think the variable could be numeric as you can't assign ".25 lb" to a numeric variable. This would likely generate a warning in the log one way or another.

 

View solution in original post

11 REPLIES 11
ballardw
Super User

Character variables when using = are compared first character to first character, 2nd to 2nd and so on. As soon as one does not match SAS says "not equal". So if your variable starts with a space in the data then it is not equal. If any of the other characters are not exactly the same then they are not equal. If you variable does not have a space and you compare it to "1/4 lb" with a space, then it is not equal.

 

Sometimes what you see misses leading spaces because they are suppressed.

You can try code like this to see some values;

data _null_;
    set yourdataset obs=10;
    newvar=quote(var);
   put newvar=;
run;

Data _null_ does not create a data set. This just writes some information to the log for the first 10 records (the Obs=10) so you can see some examples of your data. The value of newvar will have quotes around your current Var values.

If you see leading blanks before the text such as possibly "   1/4 lb" you can modify your comparison to use

 

if strip(var)= "1/4 lb" then new_var="0.25 lb";

 

if your values do not have the space, or more spaces between 1/4 and lb then you need to make your comparison string match what is actually in the data for Var.

 

Another potential problem since you did not share an entire data step is if you have used New_var somewhere before this in the code that would make SAS think the variable could be numeric as you can't assign ".25 lb" to a numeric variable. This would likely generate a warning in the log one way or another.

 

Reeza
Super User
Quick trick, run a proc freq on the VAR column. Copy the values from the output into your VAR="value copied from proc freq". That helps you avoid these issues.

Note that it's also case sensitive - so if it's Lb or LB or lbs that won't match either.
ballardw
Super User

@Reeza wrote:
Quick trick, run a proc freq on the VAR column. Copy the values from the output into your VAR="value copied from proc freq". That helps you avoid these issues.

Note that it's also case sensitive - so if it's Lb or LB or lbs that won't match either.

Unless the issue is leading blanks. The output for Proc Freq will be left justified and hide the leading spaces.

 

I have a data source that plays all sorts of stupidity on their data and had a range type value of "16000-30000" show up with two rows for each range in my proc freq and found values with leading spaces depending on which sub-agency collected the data.

 

 

Emma_at_SAS
Lapis Lazuli | Level 10

Thank you @ballardw  and @Reeza!

Using strip(var) instead of var solved my problem!

@Reeza, I tried the proc freq and the levels of the variable for frequencies turn out as the three levels I was expecting and I assume SAS recovers extra spacing. 

I just need to introduce var_dv with the most number of characters first because that is how many levels reserved for var_dv so I do not get truncated on variable level values.

Thank you both!

ballardw
Super User

@Emma_at_SAS wrote:

Thank you @ballardw  and @Reeza!

Using strip(var) instead of var solved my problem!

@Reeza, I tried the proc freq and the levels of the variable for frequencies turn out as the three levels I was expecting and I assume SAS recovers extra spacing. 

I just need to introduce var_dv with the most number of characters first because that is how many levels reserved for var_dv so I do not get truncated on variable level values.

Thank you both!


Instead of worrying about order of assignment use a LENGTH statement to specify how many characters you expect a variable to hold before using it the first time. Its very simple:

 

Length var_dv $ 10;

for example the $ says it will be a character variable and the 10 is how many characters you expect to use. If you need to later modify the program with new values later you check to see if that is long enough to hold the new values and increase as needed.

Caution: same named variables with different lengths will cause warnings and possible truncation of values when combining data sets.

Emma_at_SAS
Lapis Lazuli | Level 10
Great. Thank you, @ballardw!
Tom
Super User Tom
Super User

@ballardw wrote:

@Reeza wrote:
Quick trick, run a proc freq on the VAR column. Copy the values from the output into your VAR="value copied from proc freq". That helps you avoid these issues.

Note that it's also case sensitive - so if it's Lb or LB or lbs that won't match either.

Unless the issue is leading blanks. The output for Proc Freq will be left justified and hide the leading spaces.

 

I have a data source that plays all sorts of stupidity on their data and had a range type value of "16000-30000" show up with two rows for each range in my proc freq and found values with leading spaces depending on which sub-agency collected the data.

 

 


The plain old text output of PROC FREQ does not suppress the leading spaces. 

                                   Cumulative    Cumulative
x         Frequency     Percent     Frequency      Percent
-----------------------------------------------------------
  1              1       25.00             1        25.00
  aaaa           1       25.00             2        50.00
1                1       25.00             3        75.00
aaaa             1       25.00             4       100.00

It is just the "pretty" output in ODS that is confused.

image.png

Emma_at_SAS
Lapis Lazuli | Level 10

Thank you @ballardw @Reeza @Tom for your discussion on spaces and leading spaces! very helpful

Sajid01
Meteorite | Level 14

Hello @Emma_at_SAS 
One approach is to use proc format and it should serve your purpose

proc format;
value $val
 "1/4lb" =0.250
 "1/2lb"=0.500
 "3/4lb"=0.750;
 run;
data test;
format var $val.;
input var1 $ var2 $;
var=cats(var1,var2);
drop var1 var2;
datalines;
1/4 lb
1/2 lb
3/4 lb
;

The values will be appear as shown below. The original values remain intact.

Sajid01_0-1625773790102.png

 

Emma_at_SAS
Lapis Lazuli | Level 10
Thank you @Sajid01. It is like another approach to solve my problem. Thanks
Ksharp
Super User
data have;
input have $20.;
want=catx(' ',put(input(resolve(cats('%sysevalf(',scan(have,1,' '),')')),best.),8.2 -l),scan(have,-1,' '));
cards;
1/4 lb
1/2 lb
3/4 lb
;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1552 views
  • 8 likes
  • 6 in conversation