rakeshallu Tracker
https://communities.sas.com/kntur85557/tracker
Tue, 23 Apr 2024 17:43:35 GMT
Re: Managing ODS outputs in IML
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/Managing-ODS-outputs-in-IML/m-p/892185#M6059
<P>This is exactly what I needed. Thank you, Rick!</P>
Thu, 31 Aug 2023 21:05:46 GMT
Managing ODS outputs in IML
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/Managing-ODS-outputs-in-IML/m-p/891987#M6057
<P>Dear all,</P><P>I know several variants of the following question have been asked before. However, I am still unable to apply that advice to my problem, hence this new post. I am running NLPNRA in a loop and would like to store the estimates and gradients from each iteration in a SAS data set. The code that is being looped looks roughly like this:</P><PRE><CODE class="">ods output ParameterEstimates = mle;
proc iml;
[OBJECTIVE FUNCTION]
quit; </CODE></PRE><P>When I run this code 1,000 times in a loop, 1,000 HTML files are created in my SAS Temporary Files directory. This clogs my workspace, and subsequent iterations tend to take longer. My question is: how do I keep the "mle" data set produced by the code and delete the HTML files generated in the temporary files directory?</P><P>I tried the following, and the HTML files continue to be generated in every attempt:</P><OL><LI>Add 'ods exclude all' and 'ods exclude none' before and after proc iml</LI><LI>Add 'ods noresults' and 'ods results'</LI><LI>Add dm 'odsresults; clear'; after executing proc iml</LI><LI>Add 'ods _all_ close' before executing proc iml</LI></OL><P>Any suggestions would be great.</P><P>Thank you,</P><P>Rakesh</P>
Thu, 31 Aug 2023 16:25:40 GMT
Re: Maximim Likelihood Estimation with by groups without a Do Loop
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/Maximim-Likelihood-Estimation-with-by-groups-without-a-Do-Loop/m-p/890531#M6047
<P>I did not do step 2 for the entire data. Below are the timings for a sample of 10 unique groups:</P><P>1. The original problem took ~90 seconds</P><P>2. Using uniqueby reduces the time to ~65 seconds (I am sorting outside proc iml)</P><P>3. Vectorizing the inner loop reduces the time to ~0.12 seconds</P><P>That is very COOL! Thank you, Rick.</P>
Wed, 23 Aug 2023 10:38:16 GMT
Re: Maximim Likelihood Estimation with by groups without a Do Loop
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/Maximim-Likelihood-Estimation-with-by-groups-without-a-Do-Loop/m-p/890502#M6045
<P>Thank you very much, Rick. I am now able to execute the code for all the groups in about 45 minutes.</P>
Wed, 23 Aug 2023 02:31:05 GMT
Re: Maximim Likelihood Estimation with by groups without a Do Loop
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/Maximim-Likelihood-Estimation-with-by-groups-without-a-Do-Loop/m-p/890444#M6042
<P>Thank you, Rick.</P><P>This helped me reduce the run time for 10 unique groups from 92 seconds to 69 seconds, a huge improvement!</P><P>However, the run time per session is unfortunately still too high to scale up to the entire data. Do you see any more opportunities to improve efficiency?</P>
Tue, 22 Aug 2023 17:28:00 GMT
Re: MLE using a data step within PROC IML
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/MLE-using-a-data-step-within-PROC-IML/m-p/890357#M6040
<P>Thank you, Rick. In light of this suggestion, we could put the submit/endsubmit thread to rest.</P><P>In the previous post, I use the native SAS IML module to compute the objective function. Any suggestions on improving it would be very helpful.</P>
Tue, 22 Aug 2023 10:45:18 GMT
Re: MLE using a data step within PROC IML
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/MLE-using-a-data-step-within-PROC-IML/m-p/890352#M6038
<P>Thank you, Rick and Tom.</P><P>I see that symget works in the IML environment. However, in the line highlighted by Tom, "C = &total_ll.;", the macro variable is not updated with each iteration of nlpnra. It simply retains the value it had before the start of proc iml.</P><P>Any suggestions on how I can update this variable with nlpnra?</P>
Tue, 22 Aug 2023 10:35:13 GMT
Re: MLE using a data step within PROC IML
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/MLE-using-a-data-step-within-PROC-IML/m-p/890313#M6035
Thank you for the response.<BR />We can run a data step within proc iml.<BR />See Rick's video below:<BR /><A href="https://blogs.sas.com/content/iml/2011/10/24/video-calling-sas-procedures-from-the-sasiml-language.html" target="_blank">https://blogs.sas.com/content/iml/2011/10/24/video-calling-sas-procedures-from-the-sasiml-language.html</A>
Mon, 21 Aug 2023 23:03:41 GMT
MLE using a data step within PROC IML
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/MLE-using-a-data-step-within-PROC-IML/m-p/890311#M6033
<P>Dear all,</P><P>Apologies for the multiple posts. I wanted to share the progress I was able to make on the previous question, in case it helps anyone help me.</P><P>I used 'submit' and 'endsubmit' to convert all the vector operations in my previous post into data steps. This enabled me to get rid of the DO loop. However, with the code below, nlpnra simply returns the initial value. Any suggestions on what I am doing wrong would be very helpful.</P><P>Thank you very much in advance.</P><PRE><CODE class="">proc iml;
start LogLik(param);
p1 = param[1];
p2 = param[2];
p3 = param[3];
p4 = param[4];
p5 = param[5];
submit p1 p2 p3 p4 p5;
data _null_;
set want;
/*scroll further probability*/
if position < 3 then sc_p = 1;
else sc_p = logistic(&p1 + &p2*position + &p3*pred_usage + &p4*relevant_offer_cnt_wt + &p5*credits_r);
/*end probability*/
if position < max_pos then ed = 0;
else ed = 1 - sc_p;
by seq;
retain prod_choice;
retain prod_scroll;
retain ll;
retain total_ll 0;
if first.seq then do;
prod_choice = 1;
prod_scroll = 1; end;
prod_choice = prod_choice*not_choosing_prob;
prod_scroll = prod_scroll*sc_p;
prod_choice_end = prod_choice*ed;
lag_prod_scroll = lag(prod_scroll);
if first.seq then lag_prod_scroll = 1;
pos_end_ll = lag_prod_scroll*prod_choice_end;
if first.seq then ll = pos_end_ll;
else ll = ll + pos_end_ll;
if last.seq then do;
log_ll = log(ll);
total_ll = total_ll + log_ll;
end;
call symput ("total_ll",total_ll);
run;
endsubmit;
C = &total_ll.;
return sum(C);
finish;
param = {-0.1 -0.005 0.1 -0.1 0.2 0.0};
optn = {1, /* 3. find max of function, and */ 4}; /* print moderate amount of output */
con = {. . . . . .,
. . . . . .};
call nlpnra(rc, xres, "LogLik", param, optn, con);
quit; </CODE></PRE><P>Best,</P><P>Rakesh</P>
Mon, 21 Aug 2023 22:28:49 GMT
Maximim Likelihood Estimation with by groups without a Do Loop
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/Maximim-Likelihood-Estimation-with-by-groups-without-a-Do-Loop/m-p/890288#M6032
<P>Hello all,</P><P>I am trying to estimate parameters through a custom-written likelihood function in PROC IML. The input data to the function ("use" in the code below) is akin to a panel with multiple observations for each unit (the vector 'u' in the code below stores all the unique units). I compute the likelihood of observing the data for each unit (I call it LL in the code below). Then I compute log(LL) and sum it over all units. That sum is my objective function.</P><P>The code works fine for small data sets. For 1,000 rows and 42 unique units, it takes about a minute to execute. However, I have about 2 million rows and 150K unique units in the data, and each time I try executing, SAS hangs. I even let it run for 36 hours with no output. My guess is that the DO loop over all unique units is causing the long run time. My questions are as follows:</P><P>1. Is my guess correct?</P><P>2. If yes, do you have some advice on doing these computations without a DO loop?</P><P>3. If no, do you have any suggestions to increase the speed?</P><P>As always, very grateful for all your support.</P><P>Thank you,</P><P>Rakesh</P><PRE><CODE class="">%let extended_pos = 5;
proc iml;
/************************/
/*Read relevant variables from the data into a matrix X*/
varNames = {"seq" "position" "pred_usage" "residual_usage" "credits_r" "p_hat" "c_seller_fe" "c_pos_coef" "sl_fixed_effect" "pos_fixed_effect"};
use use;
read all var varNames into x;
close use ;
/************************/
start LogLik(param) global(x);
u = unique(x[,1]); /*Extract unique session Ids in vector u*/
s = j(1, ncol(u),.); /*Create an empty matrix to store probability of each session*/
do i = 1 to ncol(u);
idx = loc(x[,1]=u[i]); /*Get all the variables in each session*/
max_pos = max(x[,2][idx]`);
sess_pred_usage = x[,3][idx]`;
sess_res_usage = x[,4][idx]`;
sess_credits = max(x[,5][idx]`);
sess_phat = max(x[,6][idx]`);
sl_fe_choice = max(x[,7][idx]`);
sl_pos_choice = max(x[,8][idx]`);
sl_fe_scroll = max(x[,9][idx]`);
sl_pos_scroll = max(x[,10][idx]`);
max_pred_usage = sess_pred_usage[1,max_pos];
max_res_usage = sess_res_usage[1,max_pos];
pos_cut = max_pos + &extended_pos.;
pos = do(1, pos_cut, 1);
/************************/
/*Create the predicted and residual usage vectors for all positions up to the cutoff*/
sess_pred_usage = sess_pred_usage||(max_pred_usage*j(1, &extended_pos.,1));
sess_res_usage = sess_res_usage||(max_res_usage*j(1, &extended_pos.,1));
/************************/
/************************/
/*Create a vector for probability of scrolling further at all positions
The seller always scrolls further from the first two positions*/
sc = logistic(sl_fe_scroll + sl_pos_scroll*pos + param[1] + param[2]*pos + param[3]*sess_pred_usage + param[4]*sess_res_usage + param[5]*sess_credits);/*param[2]*predicted_usage_prop + param[3]*residual_usage_prop +*/
sc[1,1] = 1; sc[1,2] = 1;
/************************/
/************************/
/*Create a vector for the probability of ending at each position from the maximum position onward.
Before the maximum position, the probability of ending is 0*/
ed = 1 - sc; ed[1, 1:max_pos-1] = 0;
/************************/
/************************/
/*Create a vector for computing the not-choosing probability at each position.
Up to max_pos, default this value to 1 as it does not affect the LL function. */
m1 = 1 - logistic(sl_fe_choice + sess_phat + sl_pos_choice*(max_pos + do(1, &extended_pos.,1)));
not_choosing_prob = j(1, max_pos,1)||m1;
/************************/
/************************/
/*Compute the probability of a sample path for all possible end points in a session. Store each probability in a vector called LL*/
LL = j(1, pos_cut,.);
LL[1,1] = 0; LL[1,2] = 0;
do m = 3 to pos_cut;
LL[1,m] = ed[1,m]*prod(sc[1,1:m-1])*prod(not_choosing_prob[1,1:m]);
end;
/************************/
s[i] = log(100*sum(LL)); /*Probability of an observed session is the sum over all sample paths*/
end;
return sum(s);
finish;
/************************/
/*Maximize sum(s)*/
param = {0 -0.0025 0.001 0.001 0};
optn = {1, /* 3. find max of function, and */ 4}; /* print moderate amount of output */
con = {. . . . .,
. . . . .};
call nlpnra(rc, xres, "LogLik", param, optn, con);
/************************/
quit; </CODE></PRE>
Mon, 21 Aug 2023 19:47:24 GMT
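For reference, the objective that the LogLik module in the post above appears to compute can be written out as follows. This is only a reading of the posted code, not something stated in the thread. The symbols are notation introduced here: $s_{ik}$ stands for `sc`, $e_{im}$ for `ed`, $n_{ik}$ for `not_choosing_prob`, $M_i$ for `pos_cut`, and $\theta$ for `param`, with $i$ indexing sessions and $m$, $k$ indexing positions.

```latex
% Likelihood of session i: sum over all candidate end positions m
% (the code sets LL[1] = LL[2] = 0, so the sum starts at m = 3)
\[
  L_i(\theta) \;=\; \sum_{m=3}^{M_i} e_{im}
      \left( \prod_{k=1}^{m-1} s_{ik} \right)
      \left( \prod_{k=1}^{m} n_{ik} \right)
\]

% Objective passed to nlpnra: sum of scaled log-likelihoods over sessions
\[
  \ell(\theta) \;=\; \sum_{i} \log\bigl( 100 \, L_i(\theta) \bigr)
\]

% Both products are running products in m, so they satisfy
\[
  S_{i,m} = S_{i,m-1}\, s_{i,m-1}, \qquad
  N_{i,m} = N_{i,m-1}\, n_{im}, \qquad
  S_{i,1} = 1,\; N_{i,0} = 1,
\]
% which gives L_i = \sum_{m \ge 3} e_{im} S_{i,m} N_{i,m}.
```

Written this way, the two products only need to be accumulated once per session rather than recomputed for every `m` in the inner loop. Whether this is the same vectorization that produced the speed-up reported in the replies is an assumption.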