Hi,
I hope I can explain what I want. This is easy to do in Excel, but as a beginner SAS user I find it hard. I want to write the equation below using the data shown in the picture.
p_m= (ex1_1/sum_ex1)*(ex2_1/sum_ex2)+(ex1_2/sum_ex1)*(ex2_2/sum_ex2)+(ex1_3/sum_ex1)*(ex2_3/sum_ex2)+(ex1_4/sum_ex1)*(ex2_4/sum_ex2)
This equation includes only 4 variables, but I may have more than 4. I guess an array would be useful here. I tried to write some code, but something is wrong with it. The error I get is "Array subscript out of range at....". My code is below.
Could you help me write this equation in SAS?
Thanks
data all_pars1;
set all_pars;
array t {3} theta1-theta3 ;
array ex {3,4} &vlist;
array s {4};
array inc {4};
array sumex{3} 8.;
array p_m{1} 8.;
do i=1 to 3;
do j=1 to 4;
ex(i,j)=exp(t(i)*s(j)+inc(j));
sumex{i}=sum(sumex{i},ex{i,j});
p_m(i)=sum((ex{i,j}/sumex{i})*(ex{i,j}/sumex{j}));
end;
end;
run;
p_m= (ex1_1/sum_ex1)*(ex2_1/sum_ex2)+(ex1_2/sum_ex1)*(ex2_2/sum_ex2)+(ex1_3/sum_ex1)*(ex2_3/sum_ex2)+(ex1_4/sum_ex1)*(ex2_4/sum_ex2)
This equation includes only 4 variables but I may have more than 4 variables.
The formula you posted appears to have more than 4 variables. Looks like 10 variables to me.
1    data names ;
2      length name $32 ;
3      infile cards dlm='/()*+- ';
4      input name @@;
5    cards;

NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.NAMES has 16 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.18 seconds
      cpu time            0.09 seconds

7    ;;;;
8
9    proc sort nodupkey; by name; run;

NOTE: There were 16 observations read from the data set WORK.NAMES.
NOTE: 6 observations with duplicate key values were deleted.
NOTE: The data set WORK.NAMES has 10 observations and 1 variables.
Explain what each of the 10 variables in your equation is and how it relates to the data.
Post sample data (as a SAS data step, not a photograph) and what you expect the result to be for that sample data.
Hi Tom,
Thanks for your response.
Here is my whole code and the input files are attached. The equation I want is
p_m=(ex1_1/sum_ex1)*(ex2_1/sum_ex2)+(ex1_2/sum_ex1)*(ex2_2/sum_ex2)+(ex1_3/sum_ex1)*(ex2_3/sum_ex2)+(ex1_4/sum_ex1)*(ex2_4/sum_ex2)
As I said I will have more variables than I have in this equation.
Many many thanks!
data par;
infile 'C:\cluster_new\mlg1.txt';
input a1 a2 a3 b1 b2 b3 ;
run;
data score;
infile 'C:\cluster_new\mlgs.txt';
input theta1 theta2 theta3;
run;
data all_pars;
if _n_=1 then set score;
set par;
run;
data all_pars;
set all_pars;
s1=-(a1+a2+a3)/4;
s2=s1+a1;
s3=s1+a2;
s4=s1+a3;
inc1=-(b1+b2+b3)/4;
inc2=inc1+b1;
inc3=inc1+b2;
inc4=inc1+b3;
run;
data _null_;
length c $16000; /* 500*4*(up to 8) characters: " ex123_4" */
do i=1 to 3;
do j=1 to 4;
c=catx(' ',c,cats('ex',i,'_',j));
end;
end;
call symputx('vlist',c);
run;
data all_pars1;
set all_pars;
array t {3} theta1-theta3 ;
array ex {3,4} &vlist;
array s {4};
array inc {4};
array sumex{3} 8.;
do i=1 to 3;
do j=1 to 4;
ex(i,j)=exp(t(i)*s(j)+inc(j));
sumex{i}=sum(sumex{i},ex{i,j});
end;
end;
run;
data all_pars1;
set all_pars;
array t {3} theta1-theta3 ;
array ex {3,4} &vlist;
array s {4};
array inc {4};
array sumex{3} 8.;
array p_m{1} 8.;
do i=1 to 3;
do j=1 to 4;
ex(i,j)=exp(t(i)*s(j)+inc(j));
sumex{i}=sum(sumex{i},ex{i,j});
p_m(i)=sum((ex{i,j}/sumex{i})*(ex{i,j}/sumex{j}));
end;
end;
run;
The extra code without any explanation is really too much to look at. I still do not see the relationship between the variables in the equation you posted and the variables in the data you posted.
Let me try to explain what I want, despite my poor English. If you open the input files I shared in my previous message, it may help: you will see the variables I created. The code that creates those variables is also in the previous message. I want to write the equation below using the variables I created.
p_m=(ex1_1/sum_ex1)*(ex2_1/sum_ex2)+(ex1_2/sum_ex1)*(ex2_2/sum_ex2)+(ex1_3/sum_ex1)*(ex2_3/sum_ex2)+(ex1_4/sum_ex1)*(ex2_4/sum_ex2).
I hope this was helpful. Thanks
@Tom, OP only wants to translate that long equation into a nested loop sum.
Yes, thanks @PGStats for helping me to clarify what I need.
So clarify what your equation actually means.
p_m= (ex1_1/sum_ex1)*(ex2_1/sum_ex2)+(ex1_2/sum_ex1)*(ex2_2/sum_ex2)+(ex1_3/sum_ex1)*(ex2_3/sum_ex2)+(ex1_4/sum_ex1)*(ex2_4/sum_ex2)
For example: Does SUM_EX1 mean that you want the sum of something? If so what?
Provide a SMALL sample of data (NOT your original messy data that needs to go through multiple steps to get into a usable form) and also the result for that data. Post the data in the form of a DATA step with in-line data, not as attached data files. Something like this:
data have ;
input id x1 x2 x3 ;
cards;
1 1 2 3
2 4 5 6
;
If you can't think of the right words to explain what SUM_EX1 means then write out the values you want to sum to create it. Make sure to use numbers in your example data that will make it clear which values you are using. For example in the trivial dataset I just posted perhaps by SUM_EX1 you mean 1 plus 4?
Then describe how you need to be able to expand the example for your real problem. It sounds like you might be thinking about this in terms of an array of some sort. How many dimensions? Which of them will increase in your real problem and which are fixed?
data have;
First thing I see: you refer to p_m(i) where i goes from 1 to 3, but array p_m is declared with only 1 element.
Hmm, yes you are right but still it doesn't work. It seems I'm missing more than one thing.
Well, you also have sumex{j}, with j going from 1 to 4, but array sumex is declared with only 3 elements.
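Putting the two subscript problems together, here is a minimal sketch of how the step might be restructured. It assumes the posted equation is meant literally, i.e. p_m is the sum over j of (ex{1,j}/sumex{1})*(ex{2,j}/sumex{2}), using only rows 1 and 2, and that sum_ex1 and sum_ex2 are the row totals held in sumex. It also spells out the ex variable names instead of using &vlist, and makes p_m a plain variable rather than an array; both are my choices, not something confirmed in the thread. The row totals are finished in a first pass because the original code divides by sumex{i} while it is still being accumulated.

```sas
data all_pars1;
  set all_pars;
  array t{3}     theta1-theta3;
  array s{4}     s1-s4;
  array inc{4}   inc1-inc4;
  /* two-dimensional arrays fill row by row, so ex{i,j} maps to exi_j */
  array ex{3,4}  ex1_1-ex1_4 ex2_1-ex2_4 ex3_1-ex3_4;
  array sumex{3} sumex1-sumex3;

  /* Pass 1: fill ex and complete every row total before it is used */
  do i = 1 to dim(ex);
    sumex{i} = 0;
    do j = 1 to dim2(ex);
      ex{i,j} = exp(t{i}*s{j} + inc{j});
      sumex{i} = sumex{i} + ex{i,j};
    end;
  end;

  /* Pass 2: the posted equation, summed over j (assumed: rows 1 and 2 only) */
  p_m = 0;
  do j = 1 to dim2(ex);
    p_m = p_m + (ex{1,j}/sumex{1}) * (ex{2,j}/sumex{2});
  end;

  drop i j;
run;
```

If there will be more than 4 columns, only the array declarations need to change, since the loops take their bounds from dim(ex) and dim2(ex). Note that sumex is only ever indexed by the row subscript, so it never goes past 3.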