Hi Guys
I have an issue when running this code on my dataset. SAS shows "(Not responding)" a few minutes after I submit it, whereas the same code gave good results on a small dataset. The dataset has 218,006 observations and 28 variables.
What do you think is the best way to solve this problem?
PROC SORT DATA = NEW;
BY cow_id pr;
RUN;
proc nlin data = new outest=ESTIMATES;
parms A = 15 B = 0.19 C = -0.0012 ;
bounds A B C > 0;
by cow_id pr;
model TEST_DAY_MILK_KG = A * Time **b * exp(-C*Time);
output out = Fit predicted = Pred ;
run;
proc sql;
create table persistency as
select
cow_id,pr,
A, B, C,
-(B+1)*LOG(C) as P,
A * (B/C)**(B) * exp(-B) as peakYield,
(B/C) as tmdays
from ESTIMATES
where _TYPE_ = "FINAL";
select * from persistency;
run;
quit;
data merged;
merge new persistency;
by cow_id pr;
run;
PROC SORT DATA = MERGED;
BY cow_id time ;
RUN;
This part takes a long time and then SAS stops responding. I can't reach the log or any other part of the software, which forces me to end the program.
proc nlin data = new outest=ESTIMATES;
parms A = 15 B = 0.19 C = -0.0012 ;
bounds A B C > 0;
by cow_id pr;
model TEST_DAY_MILK_KG = A * Time **b * exp(-C*Time);
output out = Fit predicted = Pred ;
run;
This part:
select * from persistency;
run;
sends all data from the dataset to the output window, which can take very long in a client/server environment and cause the client (EG) to hang/crash in some cases because of running out of memory.
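A minimal sketch of that change, assuming the ESTIMATES data set produced by the PROC NLIN step in the question: create the table without printing it, then preview only a few rows. The 20-row limit is an arbitrary choice for illustration.

```sas
proc sql;
   create table persistency as
   select cow_id, pr, A, B, C,
          -(B+1)*log(C)          as P,
          A * (B/C)**B * exp(-B) as peakYield,
          B/C                    as tmdays
   from ESTIMATES
   where _TYPE_ = "FINAL";
quit;   /* PROC SQL ends with QUIT; no RUN statement is needed */

/* preview the first 20 rows instead of printing the whole table */
proc print data=persistency(obs=20);
run;
```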
Hello
You have set the starting value C = -0.0012 in the PARMS statement,
but the BOUNDS statement requires A, B, C > 0.
These two statements are contradictory.
Remove the negative sign and use C = 0.0012.
There is already a negative sign in the exp part of the model.
Please try this and let us know
Please don’t forget to put
endsas; before proc sql.
Please also follow Kurt's suggestion.
You can simplify your model by linearizing it.
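The corrected starting value could look like the sketch below. The ODS EXCLUDE statements are an addition that is not part of the advice above: they assume that the tables PROC NLIN displays for every BY group contribute to the hang, which has not been confirmed in this thread. The OUTEST= and OUT= data sets are still written while displayed output is suppressed.

```sas
/* assumption: suppressing displayed tables reduces the load on the client */
ods exclude all;

proc nlin data=new outest=ESTIMATES;
   by cow_id pr;
   parms A = 15  B = 0.19  C = 0.0012;   /* C now starts inside its bound */
   bounds A B C > 0;
   model TEST_DAY_MILK_KG = A * Time**B * exp(-C*Time);
   output out=Fit predicted=Pred;
run;

ods exclude none;   /* restore displayed output for later steps */
```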
Unfortunately, nothing changed. I'm still having the same issue.
I will keep trying to see what happens.
Thank you so much for the time you spent helping me with this problem. I will get back to you if I find a solution.
Regards
Further to what I have said, you should reconsider your model.
TEST_DAY_MILK_KG = A * Time **b * exp(-C*Time);
At Time = 0 (start), assuming b > 0:
Time**b = 0
exp(-C*Time) = exp(0) = 1
Thus TEST_DAY_MILK_KG = A*(0)*(1) = 0
At Time = infinity (after a long time), assuming C > 0:
Time**b becomes very large (tends to infinity)
exp(-C*Time) = 1/exp(C*Time) tends to 0
The exponential decays faster than any power of Time grows, so the product tends to 0 and TEST_DAY_MILK_KG tends to 0.
Though I understand modeling, I do not know about your process.
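Between those two limits the curve rises to a single peak. As a worked check, assuming A, B, C > 0 and writing M(t) for TEST_DAY_MILK_KG at Time = t, the peak location and height agree with the tmdays and peakYield expressions in the original PROC SQL step:

```latex
\begin{aligned}
M(t) &= A\,t^{B}e^{-Ct},\\
\frac{dM}{dt} &= A\,t^{B-1}e^{-Ct}\,(B - Ct) = 0
  \;\Longrightarrow\; t_{\max} = \frac{B}{C},\\
M(t_{\max}) &= A\left(\frac{B}{C}\right)^{B} e^{-B}.
\end{aligned}
```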
> What do you think the best way to solve this problem?
I think the best way to solve this equation is to transform it to a LINEAR system. If
M = A * t**B * exp(-C*t)
then
log(M) = log(A) + B*log(t) - C*t;
Therefore, in the DATA step define
logM = log(TEST_DAY_MILK_KG);
logT = log(Time);
Then use PROC REG to solve the linear system. If desired, you can transform the parameter estimates and predictions back to their original scale, as follows:
/* simulate data */
data new;
do cow_id=1,2; pr=1;
do Time = 1 to 500 by 7;
logM = log(15) + 0.2*log(Time) - 0.0012*Time + rand("Normal",0,0.05);
TEST_DAY_MILK_KG = exp(logM);
output;
end;
end;
keep TEST_DAY_MILK_KG Time cow_id pr;
run;
/* end simulation */
/************************************/
/* take LOG transform of the data */
data LOG;
set new;
logM = log(TEST_DAY_MILK_KG);
logT = log(Time);
run;
/* linear regression */
proc reg data=LOG noprint outest=LogESTIMATES;
by cow_id pr;
model logM = logT Time;
output out = LogFit predicted = LogPred;
run;
/* transform estimates to original scale */
data Estimates;
set LogESTIMATES;
A = exp(Intercept);
B = logT;
C = -Time;
run;
/* transform predictions to original scale */
data Fit;
set LogFit;
Pred = exp(LogPred);
run;
/* graph data; overlay fit */
proc sgplot data=Fit;
where cow_id=1 and pr=1;
scatter x=Time y=TEST_DAY_MILK_KG;
series x=Time y=Pred;
run;
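If the persistency, peak yield, and time-of-peak quantities from the original question are still needed, they could be computed from the back-transformed estimates. This is a sketch that reuses the formulas from the question's PROC SQL step and assumes the Estimates data set created above. The guard on B and C is an added assumption, since the linear fit does not constrain their signs.

```sas
data persistency;
   set Estimates(keep=cow_id pr A B C);
   /* the formulas need B > 0 and C > 0; otherwise the results stay missing */
   if B > 0 and C > 0 then do;
      P         = -(B+1)*log(C);
      peakYield = A * (B/C)**B * exp(-B);
      tmdays    = B/C;
   end;
run;
```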
Hi
thanks for your help.
But I have more questions about this part of the code. I want to keep the data for every cow_id. This step creates just two cow_ids, and I have many more in my dataset.
do cow_id=1,2; pr=1;
Also, I want to keep all the variables that are present in data new.
keep TEST_DAY_MILK_KG Time cow_id pr;
So, how should this code look?
thanks again
data new;
do cow_id=1,2; pr=1;
do Time = 1 to 500 by 7;
logM = log(15) + 0.2*log(Time) - 0.0012*Time + rand("Normal",0,0.05);
TEST_DAY_MILK_KG = exp(logM);
output;
end;
end;
keep TEST_DAY_MILK_KG Time cow_id pr;
run;
Ignore the 'DATA new' step. That was just to simulate some data so that I could demonstrate the technique. Use the code that begins with the long 'divider' comment /*************************************/
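In other words, the existing data set goes straight into the LOG step. A sketch, assuming your data set is named new as in the original question: no KEEP statement is used, so all 28 variables and every cow_id are carried through. The WHERE filter is an added assumption, included because log() returns a missing value for zero or negative input.

```sas
/* BY-group processing in PROC REG requires sorted input */
proc sort data=new;
   by cow_id pr;
run;

data LOG;
   set new;                                   /* keeps all variables */
   where TEST_DAY_MILK_KG > 0 and Time > 0;   /* log() needs positive values */
   logM = log(TEST_DAY_MILK_KG);
   logT = log(Time);
run;
```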