Barkamih
Pyrite | Level 9

Hi Guys 

 

I have an issue when I run this code on my dataset: SAS shows "(Not responding)" a few minutes after I submit it, although the same code gave great results on a small dataset. The dataset has 218006 observations and 28 variables.

 

What do you think is the best way to solve this problem?

PROC SORT DATA = NEW;
BY  cow_id pr;
RUN;

proc nlin data = new outest=ESTIMATES;
parms A = 15 B = 0.19 C = -0.0012 ;
bounds  A B C > 0; 
by cow_id pr;
model  TEST_DAY_MILK_KG = A * Time **b * exp(-C*Time);
output out = Fit predicted = Pred ;
run;


proc sql;
create table persistency as
select 
cow_id,pr,
A, B, C, 
-(B+1)*LOG(C) as P,
A * (B/C)**(B) * exp(-B) as peakYield,
(B/C) as tmdays
from ESTIMATES
where _TYPE_ = "FINAL";
select * from persistency;
run;
quit;

data merged;
merge new persistency;
by cow_id pr;
run;

PROC SORT DATA = MERGED;
BY cow_id time ;
RUN;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

Ignore the 'DATA new' step. That was just to simulate some data so that I could demonstrate the technique. Use the code that begins with the long 'divider' comment /*************************************/


9 REPLIES
thesasuser
Pyrite | Level 9
Please run each PROC and DATA step one at a time.
That way you can identify which step causes the problem.
Then post the log.
Barkamih
Pyrite | Level 9

This part takes a long time, and then SAS stops responding. I can't reach the log or any other tool in the software, which forces me to end the program.

proc nlin data = new outest=ESTIMATES;
parms A = 15 B = 0.19 C = -0.0012 ;
bounds  A B C > 0; 
by cow_id pr;
model  TEST_DAY_MILK_KG = A * Time **b * exp(-C*Time);
output out = Fit predicted = Pred ;
run;

 

Kurt_Bremser
Super User

This part:

select * from persistency;
run;

sends all data from the dataset to the output window, which can take very long in a client/server environment and cause the client (EG) to hang/crash in some cases because of running out of memory.
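A minimal sketch of that suggestion, reusing the names from the original post: the table is still created, but only a small sample is printed. The obs=20 limit is an arbitrary choice for illustration, not something from the thread.

```sas
/* Sketch only: build the table without listing every row */
proc sql;
create table persistency as
select cow_id, pr, A, B, C,
       -(B+1)*LOG(C) as P,
       A * (B/C)**(B) * exp(-B) as peakYield,
       (B/C) as tmdays
from ESTIMATES
where _TYPE_ = "FINAL";
quit;   /* PROC SQL ends with QUIT; a RUN statement is not needed */

/* Inspect a small sample instead of the whole table (20 is arbitrary) */
proc print data=persistency(obs=20);
run;
```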

thesasuser
Pyrite | Level 9

Hello
You have set the starting value C = -0.0012,
but the BOUNDS statement requires A B C > 0.
These two statements are contradictory.
Remove the negative sign and use C = 0.0012.
There is already a negative sign in the exp part.
Please try this and let us know.
Please don’t forget to put
endsas; before proc sql.
Please also take care of the suggestion by Kurt.

You can simplify your model by linearizing it.
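As a sketch, the PROC NLIN step with the starting value for C moved inside its bound might look like the code below. The two ODS EXCLUDE statements are an extra assumption, not something suggested in this reply: with many BY groups the procedure prints output for every group, and suppressing the display applies the same reasoning as Kurt's point about large output. The OUTEST= and OUT= data sets are still written.

```sas
/* Assumption: on-screen output for every BY group is not needed */
ods exclude all;

proc nlin data = new outest=ESTIMATES;
parms A = 15 B = 0.19 C = 0.0012 ;   /* C now starts inside its bound */
bounds  A B C > 0;
by cow_id pr;
model  TEST_DAY_MILK_KG = A * Time **b * exp(-C*Time);
output out = Fit predicted = Pred ;
run;

ods exclude none;                     /* restore normal output */
```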

 

Barkamih
Pyrite | Level 9

Unfortunately, nothing changed. I'm still having the same issue.

I will keep trying to see what happens.

Thank you so much for the time you spent helping me with this problem. I will let you know if I find a solution.

Regards 

thesasuser
Pyrite | Level 9

Further to what I have said, you should reconsider your model.
TEST_DAY_MILK_KG = A * Time **b * exp(-C*Time);

At Time = 0 (start), with b > 0:

           Time**b = 0

            exp(-C*Time) = exp(0) = 1

Thus TEST_DAY_MILK_KG = A * (0) * (1) = 0

At Time = Infinity (after a large time), with C > 0:

          Time**b will be very large, tending to infinity

          exp(-C*Time) = 1/exp(C*Time), which tends to 0

The product is of the form infinity * 0, but the exponential decay outweighs the power growth, so TEST_DAY_MILK_KG tends to 0.

Though I understand modeling, I do not know about your process.
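For context, a curve that is zero at both ends rises to a single peak in between, which is what the original PROC SQL step relies on when it computes tmdays and peakYield. A short derivation, assuming A, B, C > 0 and writing y for TEST_DAY_MILK_KG and t for Time:

```latex
\begin{align*}
y(t) &= A\,t^{B}e^{-Ct}, \qquad A, B, C > 0 \\
\frac{dy}{dt} &= A\,t^{B-1}e^{-Ct}\,(B - Ct) \\
\frac{dy}{dt} = 0 &\;\Longrightarrow\; t^{*} = \frac{B}{C} \\
y(t^{*}) &= A\left(\frac{B}{C}\right)^{B} e^{-B}
\end{align*}
```

These match the expressions (B/C) and A * (B/C)**(B) * exp(-B) in the original post.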

 

Rick_SAS
SAS Super FREQ

> What do you think is the best way to solve this problem?

I think the best way to solve this equation is to transform it to a LINEAR  system. If

M = A * t**B * exp(-C*t)

then 

log(M) = log(A) + B*log(t) - C*t;

 

Therefore, in the DATA step define

logM = log(TEST_DAY_MILK_KG);

logT = log(Time);

 

Then use PROC REG to solve the linear system. If desired, you can transform the parameter estimates and predictions back to their original scale, as follows:

 

/* simulate data */
data new;
do cow_id=1,2; pr=1;
   do Time = 1 to 500 by 7;
      logM = log(15) + 0.2*log(Time) - 0.0012*Time + rand("Normal",0,0.05);
      TEST_DAY_MILK_KG = exp(logM);
      output;
   end;
end;
keep TEST_DAY_MILK_KG Time cow_id pr;
run;
/* end simulation */
/************************************/
/* take LOG transform of the data */
data LOG;
set new;
logM = log(TEST_DAY_MILK_KG);
logT = log(Time);
run;

/* linear regression */
proc reg data=LOG noprint outest=LogESTIMATES;
by cow_id pr;
model logM = logT Time;
output out = LogFit predicted = LogPred;
run;

/* transform estimates to original scale */
data Estimates;
set LogESTIMATES;
A = exp(Intercept);
B = logT;
C = -Time;
run;

/* transform predictions to original scale */
data Fit;
set LogFit;
Pred = exp(LogPred);
run;

/* graph data; overlay fit */
proc sgplot data=Fit;
where cow_id=1 and pr=1;
scatter x=Time y=TEST_DAY_MILK_KG;
series x=Time y=Pred;
run;

Barkamih
Pyrite | Level 9

Hi 

thanks for your help.

But I have more questions about this part of the code. I want to keep the data for all cow_ids; this example has just two cow_ids, and I have many more in my dataset.

do cow_id=1,2; pr=1;

Also, I want to keep all the variables that are in data new.

keep TEST_DAY_MILK_KG Time cow_id pr;

So, how should this code look?

thanks again

data new;
do cow_id=1,2; pr=1;
   do Time = 1 to 500 by 7;
      logM = log(15) + 0.2*log(Time) - 0.0012*Time + rand("Normal",0,0.05);
      TEST_DAY_MILK_KG = exp(logM);
      output;
   end;
end;
keep TEST_DAY_MILK_KG Time cow_id pr;
run;

 

Rick_SAS
SAS Super FREQ

Ignore the 'DATA new' step. That was just to simulate some data so that I could demonstrate the technique. Use the code that begins with the long 'divider' comment /*************************************/
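To connect this back to the original post: the DATA LOG step reads the real data set new with no KEEP statement, so every cow_id and every variable is carried along. A possible follow-on, sketched under the assumption that the Estimates data set from the linear fit is used in place of the PROC NLIN estimates, recomputes the persistency measures and merges them back. The guard on B and C is an added assumption so that log(C) and B/C are only evaluated for positive values; the filter on _TYPE_ = "FINAL" is not carried over because, as far as I know, PROC REG labels its OUTEST= rows differently.

```sas
/* Sketch only: persistency measures from the back-transformed estimates */
data persistency;
set Estimates(keep=cow_id pr A B C);
if B > 0 and C > 0 then do;          /* assumption: skip fits with no interior peak */
   P         = -(B+1)*log(C);
   tmdays    = B/C;
   peakYield = A * (B/C)**B * exp(-B);
end;
run;

/* new must be sorted by cow_id pr, as in the original post */
proc sort data=new;
by cow_id pr;
run;

data merged;
merge new persistency;
by cow_id pr;
run;
```

Groups that fail the guard keep missing values for P, tmdays, and peakYield, which makes them easy to find afterwards.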
