I am trying to create a new variable that will represent the sum of 2 other columns for each row.
The issue i am having is the following. My original data looks like this:
A B sum
3 4 7
5 5 10
3 4 7
I must change the values of A and B based on certain criteria and afterwards calculate the sum. So for example after the FORMAT i get this.
PROC FORMAT; value test 3 = 0.5 4 = 0.7 5 = 1 ; run; data have_f; set have; Format A B test.; run; data want; set have_f; sum = sum(of A B); run;
A B sum
0.5 0.7 7
1 1 10
0.5 0.7 7
How can i calculate the sum of the formatted values instead of the original ones?
Formats are used to control how values are display as text.
You can use them with the PUT() function to help you with transforming values, but if you want the result to be a number you will need to add an INPUT() function call to convert the generated text into a number.
data have;
input A B ;
cards;
3 4
5 5
3 4
;
PROC FORMAT;
value test
3 = 0.5
4 = 0.7
5 = 1
;
run;
data want;
set have;
array old A B ;
array new new_A new_B ;
do index=1 to dim(old);
new[index] = input(put(old[index],test.),32.);
end;
sum=sum(of old[*]);
new_sum=sum(of new[*]);
drop index;
run;
proc print;
run;
Obs A B new_A new_B sum new_sum 1 3 4 0.5 0.7 7 1.2 2 5 5 1.0 1.0 10 2.0 3 3 4 0.5 0.7 7 1.2
How many values does the "format" need to accommodate? Is it just 3,4,5, or are there more?
The complete format also has 1 and 2 but both of those are turned into 0. I have to apply this format to around 20 variables and eventually calculate multiple sums between the formatted 2 to 6 variables.
Formats are used to control how values are display as text.
You can use them with the PUT() function to help you with transforming values, but if you want the result to be a number you will need to add an INPUT() function call to convert the generated text into a number.
data have;
input A B ;
cards;
3 4
5 5
3 4
;
PROC FORMAT;
value test
3 = 0.5
4 = 0.7
5 = 1
;
run;
data want;
set have;
array old A B ;
array new new_A new_B ;
do index=1 to dim(old);
new[index] = input(put(old[index],test.),32.);
end;
sum=sum(of old[*]);
new_sum=sum(of new[*]);
drop index;
run;
proc print;
run;
Obs A B new_A new_B sum new_sum 1 3 4 0.5 0.7 7 1.2 2 5 5 1.0 1.0 10 2.0 3 3 4 0.5 0.7 7 1.2
Just write the result back into the same variable.
a=input(put(a,test.),32.);
Make sure to not overwrite your source dataset if you do this.
Note that if you are reading the original data from a text file you could define an INFORMAT instead of FORMAT. You use an INFORMAT to convert text into values. You use the INVALUE statement to define an INFORMAT.
PROC FORMAT;
invalue test
'3' = 0.5
'4' = 0.7
'5' = 1
;
run;
data have;
input (A B)(:test.);
cards;
3 4
5 5
3 4
;
Obs A B 1 0.5 0.7 2 1.0 1.0 3 0.5 0.7
I see, i tried adapting your answer to my dataset..
data want;
set Alcool_A_Q3;
array prob a1 a4 a6 a7 a8;
do index=1 to dim(prob);
prob[index] = input(put(prob[index],los.),32.);
end;
drop index;
run;
I tried it out on 5 variables and it works fine. I wanted to know if there was a way to pass the variables to the array by their index in the dataset so i avoid typing them all( if there is an option like that to begin with). The variable names become very complex at some point so it'll be hard to make sure i didn't make any typos. Is there a simple way to pass columns from 3rd to 23rd to the array? Also thank you for the answers, it helped me tremenduously.
You can use a number of different variable lists.
_all_ means ALL of the variables, probably not useful for this as it might include character variables.
_numeric_ means all of the numeric variables.
_character_ means all of the character variables.
If you name the variables with sequential numeric suffix you can use a simple variable list: a1-a8
If you know the first and last variable name in your list you can use positional variable list: a1--a8
You can even pick only the numeric variables in a positional range: a1-numeric-a8
You can also pick all variables whose name starts with a common prefix by using a colon. a:
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.