BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
maroulator1
Fluorite | Level 6

To whom it may concern,

 

I have the dataset below (initial_data):

 

Obs CustomerID   Var1   Var2   Var3    Var4    Var5

  1      424535        5013   6784  5673    .        3462

  2      576775          .        4328  3231  4563   2445

  3      645678          .          .       4356    .        6534

  4      768395         6476  8732  6428   3542  2314

 

I want to increment each of the values in from Var1 through Var5 by 3000. I have the following code that I am hoping will help me achieve my goal, however, it doesn't seem to be working. Can someone please advise on what I still need to do on the code to achieve my objective?

 

data final_data;

   set initial_data;

   do i = Var1, Var2, Var3, Var4, Var5;

        i = i + 3000;

   end;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Doesn't work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.

 

Since you don't say what "doesn't work" lets look at this code:

data final_data;
   set initial_data;
   do i = Var1, Var2, Var3, Var4, Var5;
        i = i + 3000;
   end;
run;

The I variable is assigned values sequentially set by 5 variables. So far so good. Then you add 3000 to each value of I.

The result is that I has the value of Var5 with 3000 added.

The variable on the LEFT side of an = gets the assigned value. Since nowhere did you have Var1, Var2, ... Var5 on the left of an = sign those variable values do not change.

 

The ARRAY statement as used in @SASKiwi's suggestion is a way to create a reference to those variables. The index value, the number that appears in the () after the array name is short had for "use the ith variable in the list defined as array vars".

Note that SAS will check to see that the index value is in the defined range for the array. So if your code has 5 variables and you ask for the 0 or 6th or 16th value in the list you will get an error.

View solution in original post

8 REPLIES 8
SASKiwi
PROC Star

I wouldn't bother with loops for only two variables. Try this:

data final_data;
  set initial_data;
  var1 = var1 + 3000;
  var2 = var2 + 3000;
run;

What happens with missing values?

maroulator1
Fluorite | Level 6

Thanks; but that was meant as illustrative example. I really need to use the do - loop on a variable list.

SASKiwi
PROC Star

In that case:

data final_data;
  set initial_data;
  array vars (*) var1-var2;
  do i= 1 to dim(vars);
    vars(i) = vars(i) + 3000;
  end;
run;
ballardw
Super User

Doesn't work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.

 

Since you don't say what "doesn't work" lets look at this code:

data final_data;
   set initial_data;
   do i = Var1, Var2, Var3, Var4, Var5;
        i = i + 3000;
   end;
run;

The I variable is assigned values sequentially set by 5 variables. So far so good. Then you add 3000 to each value of I.

The result is that I has the value of Var5 with 3000 added.

The variable on the LEFT side of an = gets the assigned value. Since nowhere did you have Var1, Var2, ... Var5 on the left of an = sign those variable values do not change.

 

The ARRAY statement as used in @SASKiwi's suggestion is a way to create a reference to those variables. The index value, the number that appears in the () after the array name is short had for "use the ith variable in the list defined as array vars".

Note that SAS will check to see that the index value is in the defined range for the array. So if your code has 5 variables and you ask for the 0 or 6th or 16th value in the list you will get an error.

maroulator1
Fluorite | Level 6

Thanks all for your very helpful responses; they helped me frame the problem correctly in my mind.

 

Is it then fair to say that from the variables Var1 - Var5 in my dataset, if I only wanted to make changes to 3 of them, say Var1, Var3, and Var5 there is no way to do so other than putting Var1, Var3, and Var 5 into an array? In other words, is anyone aware of a way to modify my syntax below in order to take my Have dataset and get my Want dataset below? I have references to my syntax below in Base SAS textbooks, however, I have not come across any explicit enough examples to implement those references; everyone seems to default to using an array. As always, all thoughts are welcome.

 

My syntax:

data want;

  set have;

  do i = Var1, Var3, Var5;

       i = i + 3000;

  end;

run;

 

Have Dataset:

Obs CustomerID   Var1   Var2   Var3    Var4    Var5

  1      424535        5013   6784  5673    .        3462

  2      576775          .        4328  3231  4563   2445

  3      645678          .          .       4356    .        6534

  4      768395         6476  8732  6428   3542  2314

 

 

Want Dataset (Var1, Var3, Var5 have all be incremented by 3000):

Obs CustomerID   Var1   Var2   Var3    Var4    Var5

  1      424535        8013   6784  8673    .          6462

  2      576775        3000  4328   6231   4563    5445

  3      645678        3000    .        7356     .         9534

  4      768395        9476  8732   9428    3542   5314

ballardw
Super User

@maroulator1 wrote:

 

Is it then fair to say that from the variables Var1 - Var5 in my dataset, if I only wanted to make changes to 3 of them, say Var1, Var3, and Var5 there is no way to do so other than putting Var1, Var3, and Var 5 into an array? 


Not at all. What the ARRAY syntax allows is quick flexibility.

If you do something like

Array v (*) var1 var3 var5;
do i= 1 to dim(v);
   v[i]=sum(v[i], 3000);
end;

is equivalent to writing

var1= sum(var1,3000);
var3= sum(var3,3000);
var5= sum(var5,3000);

The flexibility of an array based approach is that consider your next data set you need to do similar has 15 variables. It is much easier to just replace the list in the Array definition with the "new" list of 15 variable names. None of the other code would need to change. However if using the the Var1= statements you would need to rewrite/add 15 statements. And then the time after that with 8 variable for example you would need to remove statements and make the variables match the new list.

 

You could even use something like

array V(*) var1-var5;
do i=1 to dim(v);
   if i in (1, 3, 5) then v[i] = sum(v[i], 3000);
end;

The IN operator returns true when i is any of the values in the list (note: not variables).

This would only do the addition with the index variable is the value of the element in the array that you want to modify. The STYLE choice of which code to use could depend. The last would be a better solution if you know that the variables will stay constant but that the variables that need that adjustment change frequently.

 

If you search this forum long enough I believe you will find a few questions/solutions involving arrays with 1000 or more variables.

 

Another use of the Arrays might be that you need to adjust the values differently for different variables. That might lead to a temporary array with the values used for the adjustment and code that looks like:

Array v (*) var1 var3 var5;
do i= 1 to dim(v);
   v[i]=sum(v[i], temp[i]);
end;

Where Temp has been properly defined to hold the constant values needed to adjust each variable.

 

Reeza
Super User

Here's a tutorial on using Arrays in SAS
https://stats.idre.ucla.edu/sas/seminars/sas-arrays/

 

Also, look at SUM function versus statement and how they handle missing and decide which you should be using.

https://communities.sas.com/t5/SAS-Programming/difference-between-sum-and-lt-operator-gt/td-p/392751

 

 

 

data want;
set have;
array _vars(5) var1-var5;

do i=1 to dim(_vars);
    _vars(i) = sum(_vars(i), 3000);
end;

run;
A_Kh
Lapis Lazuli | Level 10

2 things you need to consider is: 

- the same operation on multiple variables is done in ARRAYS, not in do loops itself

- an operation on missing values must be avoided either by using SAS functions ( eg. SUM), or by conditional processing (eg. if VAR ne missing then ..).

Below is the code how it works;

data have;
input obs cust var1 var2 var3 var4 var5;
datalines;
  1      424535        5013   6784  5673    .   3462
  2      576775          .    4328  3231  4563   2445
  3      645678          .    .       4356    .  6534
  4      768395         6476  8732  6428   3542  2314
  ;
  run; 

data want;
   set have;
   array x Var1 Var2 Var3 Var4 Var5;
   do over x;
   	if x ne . then x = x + 3000;
   end;
run;

As @Reeza suggested above, you might better check referenced documentations for learning more.. 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2250 views
  • 6 likes
  • 5 in conversation