BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mango_tango_598
Fluorite | Level 6

Hello,

 

My dataset has 1000 columns (named 'Simulation1', 'Simulation2',------, 'Simulation1000'), and each column has 28 rows. I obtain the SAS code to calculate the product row for column 1 (Simulation1).  I would like to do the same calculation for each 1000 column. Please help me to set up those calculation. Thank you very much!

 

/* DATA step method to compute the product down the rows */
data want1;
retain Prod 1 n 0; /* n = number of nonmissing */
set have end=EOF;
if ^missing(Simulation1) then do;
n + 1;
Prod = Prod * Simulation1;
end;
if EOF then do;
if n=0 then Prod=.;
put n= / prod=;
end;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You could use arrays.

data want;
  set have end=eof;
  array simulation simulation1-simulation1000;
  array product [1000] _temporary_ (1000*1);
  do index=1 to dim(simulation);
     product[index]=product[index]*simulation[index];
  end;
  if eof then do;
     do index=1 to dim(simulation);
       simulation[index]=product[index];
     end;
     n = _n_;
     output;
  end;
  drop index;
run;

You could also convert the values to the log() and then SUM them (using PROC SUMMARY for example).  You can then convert them back into raising 10 to the power of the sum of the logs.

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

You could use arrays.

data want;
  set have end=eof;
  array simulation simulation1-simulation1000;
  array product [1000] _temporary_ (1000*1);
  do index=1 to dim(simulation);
     product[index]=product[index]*simulation[index];
  end;
  if eof then do;
     do index=1 to dim(simulation);
       simulation[index]=product[index];
     end;
     n = _n_;
     output;
  end;
  drop index;
run;

You could also convert the values to the log() and then SUM them (using PROC SUMMARY for example).  You can then convert them back into raising 10 to the power of the sum of the logs.

mango_tango_598
Fluorite | Level 6

Thank you so much! Your script worked!!

mkeintz
PROC Star

You can make use of a _TEMPORARY_ array, which is automatically retained, and can hold your product values.  For the last obs copy that array back to the original variables:

 

data want (drop=i);
  set have  end=end_of_have;

  array prod {1000} _temporary_;
  array sim  {1000} simulation1-simulation1000;

  do i=1 to 1000;
    if not missing(sim{i}) then prod{i}=sim{i}*coalesce(prod{i},1);
  end;

  if end_of_have;
  do i=1 to 1000;
    sim{i}=prod{i};
  end;
run;

This assumes you have 1,000 variables named simulation1 - simulation1000.

 

The reason for the "coalesce(prod{i},1)", is to make sure that the first multiplication is not applied to the missing value that initializes the PROD array.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mango_tango_598
Fluorite | Level 6

Thank you so much! Your script also worked too!

mkeintz
PROC Star

Just to round out the available techniques, there is a reasonable array-less solution:

 

proc transpose data=have out=need;
run;
data need_product (keep=_name_ product);
  set need;
  if n(of col:)>0 then product=geomean(of col:)**n(of col:);
run;
proc transpose data=need_product out=want (drop=_name_);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
/*
It is SAS/IML thing.
*/
data have;
array x{*} Simulation1-Simulation1000;
call streaminit(123);
do n=1 to 28;
 do i=1 to dim(x);
   x{i}=rand('integer',1,4);
   if rand('bern',0.2)=1 then call missing(x{i});
 end;
output;
end;
keep Simulation: ;
run;

proc iml;
use have;
read all var _num_ into x[c=vnames];
close;
want=j(nrow(x),ncol(x),.);
do i=1 to ncol(x);
 want[,i]=cuprod(x[,i]);
end;
create want from want[c=vnames];
append from want;
close;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1067 views
  • 2 likes
  • 4 in conversation