BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mango_tango_598
Fluorite | Level 6

Hello,

 

My dataset has 1000 columns (named 'Simulation1', 'Simulation2',------, 'Simulation1000'), and each column has 28 rows. I obtain the SAS code to calculate the product row for column 1 (Simulation1).  I would like to do the same calculation for each 1000 column. Please help me to set up those calculation. Thank you very much!

 

/* DATA step method to compute the product down the rows */
data want1;
retain Prod 1 n 0; /* n = number of nonmissing */
set have end=EOF;
if ^missing(Simulation1) then do;
n + 1;
Prod = Prod * Simulation1;
end;
if EOF then do;
if n=0 then Prod=.;
put n= / prod=;
end;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You could use arrays.

data want;
  set have end=eof;
  array simulation simulation1-simulation1000;
  array product [1000] _temporary_ (1000*1);
  do index=1 to dim(simulation);
     product[index]=product[index]*simulation[index];
  end;
  if eof then do;
     do index=1 to dim(simulation);
       simulation[index]=product[index];
     end;
     n = _n_;
     output;
  end;
  drop index;
run;

You could also convert the values to the log() and then SUM them (using PROC SUMMARY for example).  You can then convert them back into raising 10 to the power of the sum of the logs.

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

You could use arrays.

data want;
  set have end=eof;
  array simulation simulation1-simulation1000;
  array product [1000] _temporary_ (1000*1);
  do index=1 to dim(simulation);
     product[index]=product[index]*simulation[index];
  end;
  if eof then do;
     do index=1 to dim(simulation);
       simulation[index]=product[index];
     end;
     n = _n_;
     output;
  end;
  drop index;
run;

You could also convert the values to the log() and then SUM them (using PROC SUMMARY for example).  You can then convert them back into raising 10 to the power of the sum of the logs.

mango_tango_598
Fluorite | Level 6

Thank you so much! Your script worked!!

mkeintz
Jade | Level 19

You can make use of a _TEMPORARY_ array, which is automatically retained, and can hold your product values.  For the last obs copy that array back to the original variables:

 

data want (drop=i);
  set have  end=end_of_have;

  array prod {1000} _temporary_;
  array sim  {1000} simulation1-simulation1000;

  do i=1 to 1000;
    if not missing(sim{i}) then prod{i}=sim{i}*coalesce(prod{i},1);
  end;

  if end_of_have;
  do i=1 to 1000;
    sim{i}=prod{i};
  end;
run;

This assumes you have 1,000 variables named simulation1 - simulation1000.

 

The reason for the "coalesce(prod{i},1)", is to make sure that the first multiplication is not applied to the missing value that initializes the PROD array.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mango_tango_598
Fluorite | Level 6

Thank you so much! Your script also worked too!

mkeintz
Jade | Level 19

Just to round out the available techniques, there is a reasonable array-less solution:

 

proc transpose data=have out=need;
run;
data need_product (keep=_name_ product);
  set need;
  if n(of col:)>0 then product=geomean(of col:)**n(of col:);
run;
proc transpose data=need_product out=want (drop=_name_);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
/*
It is SAS/IML thing.
*/
data have;
array x{*} Simulation1-Simulation1000;
call streaminit(123);
do n=1 to 28;
 do i=1 to dim(x);
   x{i}=rand('integer',1,4);
   if rand('bern',0.2)=1 then call missing(x{i});
 end;
output;
end;
keep Simulation: ;
run;

proc iml;
use have;
read all var _num_ into x[c=vnames];
close;
want=j(nrow(x),ncol(x),.);
do i=1 to ncol(x);
 want[,i]=cuprod(x[,i]);
end;
create want from want[c=vnames];
append from want;
close;
quit;

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 233 views
  • 2 likes
  • 4 in conversation