Posted 02-11-2010 04:00 AM
(2732 views)
Hi:
It appears to be widely accepted that the PUT function only produces character values, not numeric ones. However, I created formats with PROC FORMAT to assign numbers in the range 1-100 to 4 groups, which I called 1, 2, 3, 4 (kind of like quartiles). When I was done, I wanted to create a new variable that would add up these group numbers to obtain a ranking. No matter what I tried, the numbers coming out of my calculations made little sense until I used the PUT function, writing it as
var1 = put (old_variable, name_of_value_in_proc_format) and miraculously, this worked and I got the numeric output that I wanted.
Can anyone help me square this with the widespread notion that PUT only produces characters? Thanks
10 REPLIES
Could you please give further information on how you created this, and also the format you created? PUT gives only character output unless it is used within an expression, where SAS automatically converts character values into numeric values.
Is there any note in the log saying character values have been converted to numeric values?
Well, say I write
proc format;
value FmtTVRnk
1 - 25 = 4
26 - 50 = 3
51 - 75 = 2
76 - 100 = 1;
value FmtWbRnk
1 - 25 = 4
26 - 50 = 3
51 - 75 = 2
76 - 100 = 1;
value FmtPrRnk
1 - 25 = 4
26 - 50 = 3
51 - 75 = 2
76 - 100 = 1;
Then, somewhere later in the code, I have a statement like:
format AnnPay dollar12.
WebRank FmtWbRnk.
PressRank FmtPrRnk.
TVRank FmtTVRnk. ;
And finally, when I use
var1=WebRank;
var2=PressRank;
var3=TVRank;
Total = var1 + var2+ var3;
I got nonsensical values for Total, like 267. Only when I used
var1=put(WebRank,FmtWbRnk.);
var2=put(PressRank,FmtPrRnk.);
var3=put(TVRank,FmtTVRnk.);
Total = var1 + var2+ var3;
did I get a total value in the range 3 - 12, which was what I was expecting.
To the person who thought my use of the expression "widespread notion" was a subtle dig at the accuracy of the fact: it was not. I just want to know why using PUT gave me numeric output. I am too new to SAS to put on a snarkfest here. I just want to learn, and I appreciate your attempt to teach me.
It is impossible to answer your question without seeing the data that is causing the problem. Since your formats represent values up to 100, I find it hard to believe that a sum of these would be less than 12. I would assume that 267 would be more reasonable.
Or were you expecting the sum of the numerics to somehow represent the sum of the formatted values?
The correct way to handle this would be more like:
var1=input(put(WebRank,FmtWbRnk.), best.);
var2=input(put(PressRank,FmtPrRnk.), best.);
var3=input(put(TVRank,FmtTVRnk.), best.);
The "widespread notion" is a fact. Perhaps your original values could not be automatically converted to numerics, but the formatted ones could.
The "data" returned from the PUT function is always character.
It may be assigned to a numeric variable with automatic conversion to numeric, which I think is what is happening to the OP. I could not follow the OP's process very well.
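A small sketch may make the automatic conversion easier to see. The format below mirrors the OP's FmtWbRnk, but with quoted labels; the dataset name demo and the variables type_of_var1 and doubled are made up for illustration.

```sas
proc format;
  value FmtWbRnk
    1 - 25   = '4'
    26 - 50  = '3'
    51 - 75  = '2'
    76 - 100 = '1';
run;

data demo;
  WebRank = 30;
  /* var1 first appears here, so it takes the type of PUT's result: character */
  var1 = put(WebRank, FmtWbRnk.);
  /* VTYPE returns 'C' for a character variable and 'N' for a numeric one */
  type_of_var1 = vtype(var1);
  /* arithmetic on var1 forces a character-to-numeric conversion,
     which SAS reports as a NOTE in the log */
  doubled = var1 * 2;
run;

proc print data=demo;
run;
```

If this runs as expected, type_of_var1 is C and the log carries a note that character values have been converted to numeric values, which is the behavior described above.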
When you assign a format to a variable, it only alters the way the variable is displayed in output, not the internal value of the variable. Therefore, in your code
format AnnPay dollar12.
WebRank FmtWbRnk.
PressRank FmtPrRnk.
TVRank FmtTVRnk. ;
var1=WebRank; etc.....
var1 (and var2 and var3) are assigned the actual values (1-100), not the formatted values, and
Total = var1 + var2 + var3 would range in value from 3 to 300.
When you use the PUT function in an assignment statement as
var1 = put(WebRank,FmtWbRnk.);
you are assigning the formatted value of WebRank to var1. Based on your format, var1 would take on the values 1-4. Therefore the sum of var1, var2 and var3 would be in the range you expected.
As for the PUT function, yes, it returns a character value. However, when you sum a character variable that holds numeric-looking values, SAS will convert it to numeric. There should be a note in your log stating such.
Hope this helps
Message was edited by: LAP
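To illustrate the difference between the stored value and the formatted value, here is a minimal sketch. It reuses the OP's FmtTVRnk ranges with quoted labels; the dataset name show and the variables raw, label_text and shown are assumptions added for the example.

```sas
proc format;
  value FmtTVRnk
    1 - 25   = '4'
    26 - 50  = '3'
    51 - 75  = '2'
    76 - 100 = '1';
run;

data show;
  TVRank = 89;
  format TVRank FmtTVRnk.;
  /* plain assignment copies the stored value (89), not the displayed label */
  raw = TVRank;
  /* PUT applies the format and returns the label as character text */
  label_text = put(TVRank, FmtTVRnk.);
  /* VVALUE returns the value as TVRank's attached format displays it */
  shown = vvalue(TVRank);
run;

proc print data=show;
  format TVRank;   /* remove the format here so the stored value is printed */
run;
```

Here raw holds 89 while label_text holds the text 1, which is why summing the plain copies gave totals in the hundreds and summing the PUT results gave totals between 3 and 12.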
LAP, thanks for your very clear and concise answer. I appreciate it a great deal. BTW, is there a more efficient/elegant way I could have achieved the same objective?
From my personal experience, it is better to use a LENGTH statement and declare your SAS variables, both CHARACTER and NUMERIC type. Even though you may notice SAS doing conversions for you, either "CHARACTER TO NUMERIC" or "NUMERIC TO CHARACTER", I believe it's better to know what results you are getting rather than seeing one of these NOTE diagnostics and not actually knowing what you may end up with in your SAS variable's value.
And, so, if you need to have a SAS NUMERIC type variable created from an incoming PUT function result, use the INPUT function in your assignment statement with the appropriate INFORMAT specified as the 2nd argument.
Scott Barry
SBBWorks, Inc.
Recommended Google advanced search argument, this topic/post:
data step variables numeric character length put input function site:sas.com
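As a rough sketch of that advice, the step below declares the new variables with LENGTH and wraps PUT in INPUT so that no automatic conversion is needed. The format name FmtRank, the dataset name ranked and the sample values are hypothetical; the BEST. informat is the one used elsewhere in this thread.

```sas
proc format;
  value FmtRank
    1 - 25   = '4'
    26 - 50  = '3'
    51 - 75  = '2'
    76 - 100 = '1';
run;

data ranked;
  /* declare the new variables as numeric (8 bytes) before first use */
  length var1 var2 var3 Total 8;
  WebRank = 12; PressRank = 40; TVRank = 90;
  /* PUT yields the character label; INPUT reads it back as a number */
  var1 = input(put(WebRank,   FmtRank.), best.);
  var2 = input(put(PressRank, FmtRank.), best.);
  var3 = input(put(TVRank,    FmtRank.), best.);
  Total = var1 + var2 + var3;
run;

proc contents data=ranked;
run;
```

With the conversion written out explicitly, the log should not show the character-to-numeric NOTE, and PROC CONTENTS should list var1-var3 as numeric.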
I personally don't see a problem with your approach. There probably are other ways of accomplishing what you want to do, but without knowing the big picture it's difficult to say.
I always ask myself if what I'm doing is straightforward enough that someone else would understand, and from what I see, you have done this.
Glad I was of help
Hi:
The only quibble I have with this approach (of using PUT) is that PUT results in VAR1-VAR3 being created as CHARACTER variables, which you can verify by running a PROC CONTENTS step after the DATA step where the new variables are created using PUT.
This means that
1) EVERY time VAR1-VAR3 are used in arithmetic, SAS will have to convert them from CHARACTER to NUMERIC and
2) if any value of the original variables does not fall into the ranges, AND you stick with the PUT technique, then the out-of-range value becomes an asterisk '*' in the CHARACTER variable, as shown in the program below, where one of the original values is 120. When the * gets converted to numeric, it will be missing, but, still, that might be unacceptable to an auditor of the program output (for one of the variables to be * and/or for the variable to be CHARACTER when it is being SUMMED).
The program below uses a format and then creates VAR1-VAR4, where one of the variable values is 120 and, thus, gets turned into an * using PUT.
cynthia
[pre]
proc format;
value FmtOne
1 - 25 = 4
26 - 50 = 3
51 - 75 = 2
76 - 100 = 1;
run;
data testform;
infile datalines;
input v1 v2 v3 v4;
format v1-v4 fmtone.;
return;
datalines;
22 27 54 79
23 28 55 80
12 30 60 120
;
run;
ods listing;
proc print data=testform;
title 'what is displayed without formats';
format v1-v4 ;
run;
proc print data=testform;
title 'what is displayed WITH formats';
title2 'internal values are unchanged';
format v1-v4 fmtone.;
run;
data testput; set testform;
var1=put(v1,fmtone.);
var2=put(v2,fmtone.);
var3=put(v3,fmtone.);
var4=put(v4,fmtone.);
Total = var1 + var2 + var3 + var4;
SumTot = sum(var1, var2, var3, var4);
run;
proc contents data=testput;
title 'Note that var1-var4 are CHARACTER variables';
run;
proc print data=testput;
title 'Using PUT function -- show all ROWS';
run;
proc print data=testput;
title 'What if original v1-v4 number is not in range';
title2 'Then new variable becomes an *';
where var1 = '*' or var2 = '*' or var3 = '*' or var4 = '*';
format v1-v4;
run;
data test_input; set testform;
var1=input(put(v1,fmtone.),best.);
var2=input(put(v2,fmtone.),best.);
var3=input(put(v3,fmtone.),best.);
var4=input(put(v4,fmtone.),best.);
Total = var1 + var2 + var3 + var4;
** show difference between arithmetic with missing and SUM function with missing;
SumTot = sum(var1, var2, var3, var4);
run;
proc contents data=test_input;
title 'Note that var1-var4 are NUMERIC variables';
run;
proc print data=test_input;
title 'Using PUT function to make char var and then INPUT to make numeric';
run;
[/pre]
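One possible way to avoid the asterisk for out-of-range values, offered only as a sketch, is to add an OTHER range to the format so that unmatched values get a label that INPUT reads as missing. The format name FmtSafe, the dataset name check and the variable rank4 are hypothetical and not part of the program above.

```sas
proc format;
  value FmtSafe
    1 - 25   = '4'
    26 - 50  = '3'
    51 - 75  = '2'
    76 - 100 = '1'
    other    = '.';   /* anything outside 1-100 is labeled with a period */
run;

data check;
  input v4;
  /* a single period is read by a numeric informat as a missing value */
  rank4 = input(put(v4, FmtSafe.), best.);
  datalines;
79
120
;
run;

proc print data=check;
run;
```

Under these assumptions, 79 maps to 1 and 120 maps to a missing numeric value instead of an asterisk. Whether a silent missing value is acceptable is a separate decision for whoever audits the output.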