BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
muskagap
Fluorite | Level 6

I need to loop through a dataset (built-in, training SAS table with data) and do some calculations based on it. First off, I don't know how I could loop through values in range of columns of the table. Second, need to loop through all values in every observation and calculate difference between previous and following value. Here is the example of a table:

Auto   Price1  Price2  Price3 
Mazda  35.000  40.500  38.000
Audi   70.000  60.000  80.000
BMW    110.000 85.000  93.000

Let's say my table is named Prices in SAS. I tried below but I know it's totally wrong. I have no idea how to start:

data data.Prices;

set myDataset.Prices;
do price_prev = Price1;
    do price_next = Price2;

    end;
    end;

    result = price_next-price_prev;
 run;

Desired output should be as follows:

Auto   Diff1   Diff2    Diff3 
Mazda  0       5.500    -2.500
Audi   0       -10.000  20.000
BMW    0       -25.000  8.000

I need do use some kind of a loop but don't know how. Important thing is that I can't use macros or some other complex synthax. Just loops.

1 ACCEPTED SOLUTION

Accepted Solutions
hashman
Ammonite | Level 13

@muskagap:

When you code Y_1950-Y_2000 in the array variable specification, you tell SAS to incorporate variables with the suffixes 1950 through 2000 inclusively incrementing them by 1. You cannot tell SAS to use a different increment. In your case, since all you variables are in the required sequence and all have the Y_ prefix, you can simply specify instead:

 

  array table1 Y_: ;

Note that if you have other PDV variables prefixed with Y_, it's not going to work, and you'll have to list all the array variables explicitly, i.e.:

 

 

  array table1 Y_1950 Y_1960 Y_1970 Y_1980 Y_1990 Y_2000 ;

 

I realize that if you have, say, 100 variables like these, it can be quite inconvenient. To avoid such tedious hard coding, you can construct your variable list before your DATA step commences its compilation and pass it via a macro variable or via executing a macro. For example:

 

data _null_ ;                                                                                                                           
  call symputx ("varlist", "") ;                                                                                                        
  do i = 1950 to 2000 by 10 ;                                                                                                           
    call symputx ("varlist", catx (" ", symget ("varlist"), catx ("_", "Y", i))) ;                                                      
  end ;                                                                                                                                 
run ;                                                                                                                                   
                                                                                                                                        
data your_step ;                                                                                                                        
  array table1 &varlist ;                                                                                                               
run ;      

Or, if you prefer:

 

%macro varlist (prefix, from, to, by) ;                                                                                                 
  %do i = &from %to &to %by &by ;                                                                                                       
    &prefix&i                                                                                                                           
  %end ;                                                                                                                                
%mend ;                                                                                                                                 
                                                                                                                                        
data _null_ ;                                                                                                                           
  array table1 %varlist(Y_,1950,2000,10) ;                                                                                              
run ;                   

Or, if you're an SQL fan:

proc sql noprint ;                                                                                                                      
  select "Y_" || put (1950 + 10 * (monotonic()-1), z4.) into :varlist separated by " " from sashelp.zipcode where monotonic() le 6 ;    
quit ;                                                                                                                                  
                                                                                                                                        
data your_step ;                                                                                                                        
  array table1 &varlist ;                                                                                                               
run ;      

NOTE, however: The array syntax above defines an implicitly subscripted array indexed by the automatic auto-dropped variable _I_, so in your DO looping code you must use _I_ instead of INDEX. (The same is true about your DIFF array.) If you don't want to use it, there're two ways around.

 

First: Tell SAS in the implicit array specification that you want the array indexed by INDEX:

 

  array table1 (index) Y_: ;

Second: Define an explicitly subscripted array:

 

 

array table1 [*] Y_: ;

Then you can use any numeric variable you want for its index.    

 

 

Kind regards

Paul D.

View solution in original post

6 REPLIES 6
Reeza
Super User

Here's a tutorial on using Arrays in SAS
https://stats.idre.ucla.edu/sas/seminars/sas-arrays/

 

SAS loops through each row automatically in a data step. If you don't understand how a data step works, that may the first thing to start with. https://documentation.sas.com/?docsetId=basess&docsetTarget=n053a58fwk57v7n14h8x7y7u34y4.htm&docsetV...

 


@muskagap wrote:

I need to loop through a dataset (built-in, training SAS table with data) and do some calculations based on it. First off, I don't know how I could loop through values in range of columns of the table. Second, need to loop through all values in every observation and calculate difference between previous and following value. Here is the example of a table:

Auto   Price1  Price2  Price3 
Mazda  35.000  40.500  38.000
Audi   70.000  60.000  80.000
BMW    110.000 85.000  93.000

Let's say my table is named Prices in SAS. I tried below but I know it's totally wrong. I have no idea how to start:

data data.Prices;

set myDataset.Prices;
do price_prev = Price1;
    do price_next = Price2;

    end;
    end;

    result = price_next-price_prev;
 run;

Desired output should be as follows:

Auto   Diff1   Diff2    Diff3 
Mazda  0       5.500    -2.500
Audi   0       -10.000  20.000
BMW    0       -25.000  8.000

I need do use some kind of a loop but don't know how. Important thing is that I can't use macros or some other complex synthax. Just loops.


 

 

muskagap
Fluorite | Level 6

Ok , tahnks. Need to switch from other programming languages to SAS, it works differently.

novinosrin
Tourmaline | Level 20

Hi @muskagap  One way to learn though may sound slow is to actually try and do without loops, and then understand what's going on to make it iterative leading to loops and key indexing. So I would suggest without loops first. Welcome to SAS communities

 

data have;
input Auto $  (Price1  Price2  Price3) (:comma10.) ;
format price: comma10.;
cards;
Mazda  35,000  40,500  38,000
Audi   70,000  60,000  80,000
BMW    110,000 85,000  93,000
;

data want;
set have;
Diff1=0;
Diff2=Price2-Price1;
Diff3=Price3-Price2;
format diff: comma10.;
run;
proc print noobs;run;
Reeza
Super User
Yeah, if you generally think of SAS processing a single row at a time that'll help a lot.
muskagap
Fluorite | Level 6

I have one more question: need to loop through columns (variables)  and calculate the difference between next and prior value. The table is as follows:

 

         Y_1950     Y_1960     Y_1970     Y_1980     Y_1990     Y_2000
          3281       3413       3114       2500       2700       3500
          4042       3084       3108       3150       3800       3100
          6015       6123       6113       6100       6100       6200

 

I created 2 arrays and wrote such a code:

 

data loopTables;

set sashelp.us_data;

 

ARRAY table1 Y_1950-Y_2000;
ARRAY diff diff1-diff5;

 

diff[1]=0;

do index = 2 to DIM(table1);
diff[index] = table1[index]-table1[index-1];

end;

 

run;

 

Unfortunately it doesn't work. I have 2 questions:

 

1. How can I associate columns names in array if their suffixes increase by 10? (1950, 1960 etc.)? I noticed that 1st array itself returns

Y_1950,Y_1951,Y_1952,Y_1953,Y_1954, etc. I need it to return variables as in my origin table.

 

2. Second array: if my first array returns Y_1950-Y_2000 how I should associate 'diff' with first array to return values from corresponding columns from first array? Sth like diff_1950-diff_2000 or diff_1-diff_5?

 

Would be grateful for any explanation.

hashman
Ammonite | Level 13

@muskagap:

When you code Y_1950-Y_2000 in the array variable specification, you tell SAS to incorporate variables with the suffixes 1950 through 2000 inclusively incrementing them by 1. You cannot tell SAS to use a different increment. In your case, since all you variables are in the required sequence and all have the Y_ prefix, you can simply specify instead:

 

  array table1 Y_: ;

Note that if you have other PDV variables prefixed with Y_, it's not going to work, and you'll have to list all the array variables explicitly, i.e.:

 

 

  array table1 Y_1950 Y_1960 Y_1970 Y_1980 Y_1990 Y_2000 ;

 

I realize that if you have, say, 100 variables like these, it can be quite inconvenient. To avoid such tedious hard coding, you can construct your variable list before your DATA step commences its compilation and pass it via a macro variable or via executing a macro. For example:

 

data _null_ ;                                                                                                                           
  call symputx ("varlist", "") ;                                                                                                        
  do i = 1950 to 2000 by 10 ;                                                                                                           
    call symputx ("varlist", catx (" ", symget ("varlist"), catx ("_", "Y", i))) ;                                                      
  end ;                                                                                                                                 
run ;                                                                                                                                   
                                                                                                                                        
data your_step ;                                                                                                                        
  array table1 &varlist ;                                                                                                               
run ;      

Or, if you prefer:

 

%macro varlist (prefix, from, to, by) ;                                                                                                 
  %do i = &from %to &to %by &by ;                                                                                                       
    &prefix&i                                                                                                                           
  %end ;                                                                                                                                
%mend ;                                                                                                                                 
                                                                                                                                        
data _null_ ;                                                                                                                           
  array table1 %varlist(Y_,1950,2000,10) ;                                                                                              
run ;                   

Or, if you're an SQL fan:

proc sql noprint ;                                                                                                                      
  select "Y_" || put (1950 + 10 * (monotonic()-1), z4.) into :varlist separated by " " from sashelp.zipcode where monotonic() le 6 ;    
quit ;                                                                                                                                  
                                                                                                                                        
data your_step ;                                                                                                                        
  array table1 &varlist ;                                                                                                               
run ;      

NOTE, however: The array syntax above defines an implicitly subscripted array indexed by the automatic auto-dropped variable _I_, so in your DO looping code you must use _I_ instead of INDEX. (The same is true about your DIFF array.) If you don't want to use it, there're two ways around.

 

First: Tell SAS in the implicit array specification that you want the array indexed by INDEX:

 

  array table1 (index) Y_: ;

Second: Define an explicitly subscripted array:

 

 

array table1 [*] Y_: ;

Then you can use any numeric variable you want for its index.    

 

 

Kind regards

Paul D.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 35309 views
  • 6 likes
  • 4 in conversation