BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Traian
Fluorite | Level 6

I am trying to create a new variable that will represent the sum of 2 other columns for each row.

The issue i am having is the following. My original data looks like this:

A    B    sum

3    4    7

5    5    10

3    4    7

 

I must change the values of A and B based on certain criteria and afterwards calculate the sum. So for example after the FORMAT i get this.

 

PROC FORMAT;
value test
3 = 0.5
4 = 0.7
5 = 1
;
run;

data have_f;
set have;
Format A B test.;
run;

data want;
set have_f;
sum = sum(of A B);
run;

A     B     sum

0.5  0.7   7

1     1      10

0.5  0.7    7

 

How can i calculate the sum of the formatted values instead of the original ones?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Formats are used to control how values are display as text

You can use them with the PUT() function to help you with transforming values, but if you want the result to be a number you will need to add an INPUT() function call to convert the generated text into a number.

data have;
  input A B ;
cards;
3    4
5    5
3    4
;

PROC FORMAT;
value test
 3 = 0.5
 4 = 0.7
 5 = 1
;
run;

data want;
  set have;
  array old A B ;
  array new new_A new_B ;
  do index=1 to dim(old);
    new[index] = input(put(old[index],test.),32.);
  end;
  sum=sum(of old[*]);
  new_sum=sum(of new[*]);
  drop index;
run;

proc print;
run;
Obs    A    B    new_A    new_B    sum    new_sum

 1     3    4     0.5      0.7       7      1.2
 2     5    5     1.0      1.0      10      2.0
 3     3    4     0.5      0.7       7      1.2

View solution in original post

8 REPLIES 8
Traian
Fluorite | Level 6

The complete format also has 1 and 2 but both of those are turned into 0. I have to apply this format to around 20 variables and eventually calculate multiple sums between the formatted 2 to 6 variables.

Tom
Super User Tom
Super User

Formats are used to control how values are display as text

You can use them with the PUT() function to help you with transforming values, but if you want the result to be a number you will need to add an INPUT() function call to convert the generated text into a number.

data have;
  input A B ;
cards;
3    4
5    5
3    4
;

PROC FORMAT;
value test
 3 = 0.5
 4 = 0.7
 5 = 1
;
run;

data want;
  set have;
  array old A B ;
  array new new_A new_B ;
  do index=1 to dim(old);
    new[index] = input(put(old[index],test.),32.);
  end;
  sum=sum(of old[*]);
  new_sum=sum(of new[*]);
  drop index;
run;

proc print;
run;
Obs    A    B    new_A    new_B    sum    new_sum

 1     3    4     0.5      0.7       7      1.2
 2     5    5     1.0      1.0      10      2.0
 3     3    4     0.5      0.7       7      1.2
Traian
Fluorite | Level 6
I see, i completely missunderstood what formats did. I used one to convert some of my data into probability and apply pca to it and now realized that it used the original values to establish the principal components. If this is the case, is there an easy way to just completely modify the data based on that format? As inthis set:
A B
5 5
4 5
3 5

becomes this set:
A B
1 1
0.7 1
0.5 1
Don't need to keep the original values in this case.
Tom
Super User Tom
Super User

Just write the result back into the same variable.

a=input(put(a,test.),32.);

Make sure to not overwrite your source dataset if you do this.

Tom
Super User Tom
Super User

Note that if you are reading the original data from a text file you could define an INFORMAT instead of FORMAT.  You use an INFORMAT to convert text into values.  You use the INVALUE statement to define an INFORMAT.

PROC FORMAT;
invalue test
 '3' = 0.5
 '4' = 0.7
 '5' = 1
;
run;


data have;
  input (A B)(:test.);
cards;
3    4
5    5
3    4
;
Obs     A      B

 1     0.5    0.7
 2     1.0    1.0
 3     0.5    0.7

Traian
Fluorite | Level 6

I see, i tried adapting your answer to my dataset..

data want;
  set Alcool_A_Q3;
  array prob a1 a4 a6 a7 a8;
  do index=1 to dim(prob);
    prob[index] = input(put(prob[index],los.),32.);
  end;
  drop index;
run;

I tried it out on 5 variables and it works fine. I wanted to know if there was a way to pass the variables to the array by their index in the dataset so i avoid typing them all( if there is an option like that to begin with). The variable names become very complex at some point so it'll be hard to make sure i didn't make any typos. Is there a simple way to pass columns from 3rd to 23rd to the array? Also thank you for the answers, it helped me tremenduously.

 

Tom
Super User Tom
Super User

You can use a number of different variable lists.

_all_ means ALL of the variables, probably not useful for this as it might include character variables.

_numeric_ means all of the numeric variables.

_character_ means all of the character variables.

If you name the variables with sequential numeric suffix you can use a simple variable list:  a1-a8

If you know the first and last variable name in your list you can use positional variable list:  a1--a8

You can even pick only the numeric variables in a positional range:  a1-numeric-a8

You can also pick all variables whose  name starts with a common prefix by using a colon.  a:

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 667 views
  • 2 likes
  • 3 in conversation