Dear community,
I am a foundational level user of SAS and got my syntax from a colleague. I have tried running my model several times but while some treatment converged, others did not converge. .I did not modify any of the constraints as given to me by my friend and am not sure if they need to be modified for my data set. I will appreciate any help and interpretation that can help me make progress. Here is the syntax and data i tried analysing:
DATA ACC;
Input TIME IVGP SPECIES$;
Cards;
2 10.49293325 C1
2 8.32876575 C1
2 5.77111325 C11
2 4.78740075 C11
2 6.16459825 CN2
2 5.37762825 CN2
2 4.78740075 CN21
2 3.60694575 CN21
2 1.83626325 CN3
2 1.04687 CN3
2 3.88116175 CN31
2 3.150835925 CN31
2 8.32876575 U2
2 9.641 U2
2 2.42649075 U21
2 2.62323325 U21
2 9.31247825 U3
2 7.14831075 U3
2 4.19717325 U31
2 6.55808325 U31
2 9.50922075 U1
2 11.87013075 U1
2 7.854721 U11
2 9.0148674 U11
2 12.26361575 CN1
2 10.29619075 CN1
2 5.565580825 CN11
2 5.32876575 CN11
4 15.72311325 C1
4 13.19803575 C1
4 9.83712575 C11
4 8.26318575 C11
4 8.85341325 CN2
4 6.88598825 CN2
4 5.70553325 CN21
4 6.68924575 CN21
4 3.48732 CN3
4 2.34672 CN3
4 6.84022825 CN31
4 6.69674325 CN31
4 10.62409575 U2
4 11.303625 U2
4 5.70553325 U21
4 5.90227575 U21
4 13.57523325 U3
4 12.39477825 U3
4 7.08273075 U31
4 11.41106575 U31
4 16.34591575 U1
4 18.08402325 U1
4 13.6521897 U11
4 14.9912487 U11
4 19.08402325 CN1
4 17.11659825 CN1
4 10.788075 CN11
4 8.16546075 CN11
8 24.95372325 C1
8 22.06773575 C1
8 15.32962825 C11
8 14.77197575 C11
8 15.54265825 CN2
8 12.00129325 CN2
8 9.06644325 CN21
8 11.00129325 CN21
8 8.1297124 CN3
8 7.6971254 CN3
8 9.91856325 CN31
8 8.75439575 CN31
8 18.90356825 U2
8 20.30325 U2
8 11.80455075 U21
8 12.59152075 U21
8 19.87099325 U3
8 19.47750825 U3
8 14.16546075 U31
8 18.49379575 U31
8 23.05144825 U1
8 24.36349575 U1
8 19.87432 U11
8 20.1467912 U11
8 24.19932825 CN1
8 24.00258575 CN1
8 13.45992825 CN11
8 14.28076575 CN11
12 36.5945365 C1
12 35.741124 C1
12 25.051879 C11
12 24.658394 C11
12 26.625819 CN2
12 20.723544 CN2
12 17.640814 CN21
12 17.510514 CN21
12 19.7398315 CN3
12 15.0180115 CN3
12 13.247329 CN31
12 10.6896765 CN31
12 32.773699 U2
12 33.14078 U2
12 23.871424 U21
12 24.264909 U21
12 29.9704415 U3
12 31.347639 U3
12 27.412789 U31
12 32.3313515 U31
12 35.7248365 U1
12 36.4140815 U1
12 29.051879 U11
12 32.2662015 U11
12 36.856429 CN1
12 35.675974 CN1
12 24.788694 CN11
12 23.347639 CN11
24 75.5645465 C1
24 73.500689 C1
24 60.859014 C11
24 60.072044 C11
24 58.6948465 CN2
24 50.234919 CN2
24 48.4466565 CN21
24 49.5143915 CN21
24 50.8251465 CN3
24 46.1033265 CN3
24 45.2837815 CN31
24 43.971734 CN31
24 61.6121165 U2
24 61.198135 U2
24 64.5971215 U21
24 64.006894 U21
24 64.941744 U3
24 63.465529 U3
24 70.696139 U31
24 72.8603065 U31
24 72.335229 U1
24 72.598414 U1
24 64.285074 U11
24 66.2863665 U11
24 74.630989 CN1
24 72.8603065 CN1
24 58.728714 CN11
24 58.267494 CN11
48 144.884779 C1
48 143.2118215 C1
48 117.766889 C11
48 115.832039 C11
48 90.092639 CN2
48 90.5825565 CN2
48 88.8281615 CN21
48 90.666579 CN21
48 84.205359 CN3
48 90.9610465 CN3
48 119.127799 CN31
48 118.9962065 CN31
48 92.844449 U2
48 90.8641775 U2
48 149.162959 U21
48 151.328419 U21
48 131.2819714 U3
48 130.7126745 U3
48 130.177954 U31
48 129.9812115 U31
48 121.521284 U1
48 125.6528765 U1
48 139.2144065 U11
48 141.799464 U11
48 145.017664 CN1
48 143.0789365 CN1
48 122.112804 CN11
48 120.929764 CN11
;
PROC SORT DATA = WORK.ACC;
BY SPECIES;
RUN;
PROC PLOT DATA = WORK.ACC;
BY SPECIES;
PLOT IVGP*TIME;
RUN;
PROC NLIN BEST = 9; BY SPECIES;
PARMS A = 9 TO 15 BY 1
B = 39 TO 44 BY 1
C = 0.03 TO 0.05 BY 0.005;
MODEL IVGP = A + B*(1-EXP(-C*TIME));
OUTPUT OUT = WORK.ACC PARMS = ABC;
RUN;
DATA WORK.ACC;
SET WORK.ACC;
PD = A + B;
ED = A + B*C/(C+0.05);
RUN;
PROC MEANS MEAN NOPRINT DATA = WORK.ACC;
BY SPECIES;
VAR A B C PD ED;
OUTPUT OUT = WORK.ACC MEAN = A B C PD ED;
RUN;
PROC PRINT DATA = WORK.ACC;
RUN;
"while some treatment converged, others did not converge" - What are the treatments in your data? You have two obs for each SPECIES and TIME, are they replicates?
Thank you for your response. My treatments were different feed stuffs designated as species and codenamed as : C1, C11, CN1, CN11, CN2, CN21, Cn3, CN31, U1,U11,U2,U21,U3,U31 (14 treatments). Data was collected as a cummulative methane gas production across times 2 h, 4h,8h,12h,24 and 48Hours after incubation of species. I want to model the effective gas produced as well as the a, b, c values for each species. I do hope my answer relates o what you asked?
Yes, it's clear what you want. I'm going to move this to the Statistical Procedures forum where more statistical experts gather.
Why model cumulative production? I tried to model the production rate itself assuming it varies linearly with time. The result is a much simpler model that fits the data with no convergence problem.
DATA ACC;
Input TIME IVGP SPECIES$;
rep = mod(_n_, 2);
Cards;
2 10.49293325 C1
2 8.32876575 C1
2 5.77111325 C11
2 4.78740075 C11
2 6.16459825 CN2
2 5.37762825 CN2
2 4.78740075 CN21
2 3.60694575 CN21
2 1.83626325 CN3
2 1.04687 CN3
2 3.88116175 CN31
2 3.150835925 CN31
2 8.32876575 U2
2 9.641 U2
2 2.42649075 U21
2 2.62323325 U21
2 9.31247825 U3
2 7.14831075 U3
2 4.19717325 U31
2 6.55808325 U31
2 9.50922075 U1
2 11.87013075 U1
2 7.854721 U11
2 9.0148674 U11
2 12.26361575 CN1
2 10.29619075 CN1
2 5.565580825 CN11
2 5.32876575 CN11
4 15.72311325 C1
4 13.19803575 C1
4 9.83712575 C11
4 8.26318575 C11
4 8.85341325 CN2
4 6.88598825 CN2
4 5.70553325 CN21
4 6.68924575 CN21
4 3.48732 CN3
4 2.34672 CN3
4 6.84022825 CN31
4 6.69674325 CN31
4 10.62409575 U2
4 11.303625 U2
4 5.70553325 U21
4 5.90227575 U21
4 13.57523325 U3
4 12.39477825 U3
4 7.08273075 U31
4 11.41106575 U31
4 16.34591575 U1
4 18.08402325 U1
4 13.6521897 U11
4 14.9912487 U11
4 19.08402325 CN1
4 17.11659825 CN1
4 10.788075 CN11
4 8.16546075 CN11
8 24.95372325 C1
8 22.06773575 C1
8 15.32962825 C11
8 14.77197575 C11
8 15.54265825 CN2
8 12.00129325 CN2
8 9.06644325 CN21
8 11.00129325 CN21
8 8.1297124 CN3
8 7.6971254 CN3
8 9.91856325 CN31
8 8.75439575 CN31
8 18.90356825 U2
8 20.30325 U2
8 11.80455075 U21
8 12.59152075 U21
8 19.87099325 U3
8 19.47750825 U3
8 14.16546075 U31
8 18.49379575 U31
8 23.05144825 U1
8 24.36349575 U1
8 19.87432 U11
8 20.1467912 U11
8 24.19932825 CN1
8 24.00258575 CN1
8 13.45992825 CN11
8 14.28076575 CN11
12 36.5945365 C1
12 35.741124 C1
12 25.051879 C11
12 24.658394 C11
12 26.625819 CN2
12 20.723544 CN2
12 17.640814 CN21
12 17.510514 CN21
12 19.7398315 CN3
12 15.0180115 CN3
12 13.247329 CN31
12 10.6896765 CN31
12 32.773699 U2
12 33.14078 U2
12 23.871424 U21
12 24.264909 U21
12 29.9704415 U3
12 31.347639 U3
12 27.412789 U31
12 32.3313515 U31
12 35.7248365 U1
12 36.4140815 U1
12 29.051879 U11
12 32.2662015 U11
12 36.856429 CN1
12 35.675974 CN1
12 24.788694 CN11
12 23.347639 CN11
24 75.5645465 C1
24 73.500689 C1
24 60.859014 C11
24 60.072044 C11
24 58.6948465 CN2
24 50.234919 CN2
24 48.4466565 CN21
24 49.5143915 CN21
24 50.8251465 CN3
24 46.1033265 CN3
24 45.2837815 CN31
24 43.971734 CN31
24 61.6121165 U2
24 61.198135 U2
24 64.5971215 U21
24 64.006894 U21
24 64.941744 U3
24 63.465529 U3
24 70.696139 U31
24 72.8603065 U31
24 72.335229 U1
24 72.598414 U1
24 64.285074 U11
24 66.2863665 U11
24 74.630989 CN1
24 72.8603065 CN1
24 58.728714 CN11
24 58.267494 CN11
48 144.884779 C1
48 143.2118215 C1
48 117.766889 C11
48 115.832039 C11
48 90.092639 CN2
48 90.5825565 CN2
48 88.8281615 CN21
48 90.666579 CN21
48 84.205359 CN3
48 90.9610465 CN3
48 119.127799 CN31
48 118.9962065 CN31
48 92.844449 U2
48 90.8641775 U2
48 149.162959 U21
48 151.328419 U21
48 131.2819714 U3
48 130.7126745 U3
48 130.177954 U31
48 129.9812115 U31
48 121.521284 U1
48 125.6528765 U1
48 139.2144065 U11
48 141.799464 U11
48 145.017664 CN1
48 143.0789365 CN1
48 122.112804 CN11
48 120.929764 CN11
;
proc sort data=acc; by species rep time; run;
/* Calculate the methane production rates */
data acc2;
set acc; by species rep;
prevTime = lag(time);
prevIVGP = lag(IVGP);
if first.rep then call missing(prevTime, prevIVGP);
interval = time - coalesce(prevTime, 0);
midTime = (time + coalesce(prevTime, 0)) / 2;
prod = IVGP - coalesce(prevIVGP, 0);
rate = prod / interval;
run;
/* Add some points to get predicted values */
data acc3;
set acc2; by species;
output;
call missing(rate);
if last.species then do;
do midTime = 0 to 48 by 0.2;
output;
end;
end;
run;
/* Model rates as linear trends for each species */
proc glm data=acc3 plots=none;
class species;
model rate = species|midtime / solution;
output out=accOut predicted=predRate;
run;
quit;
/* Integrate predicted rates to get predicted cumulative production */
data acc4;
set accOut; by species;
if missing(rate) then do;
if midTime=0 then predIVGP = 0;
else predIVGP + predRate * 0.2;
end;
else call missing(predIVGP);
run;
/* Plot the fit */
ods graphics / width=900 height=1400;
proc sgpanel data=acc4;
panelby species / columns=3 onepanel;
scatter y=IVGP x=time / group=rep;
series y=predIVGP x=midTime;
run;
Sir,
Thanks for your response and assistance. Howeevr, my effort to use the new syntax you gave me showed nice curves of the gas production but the output values looks disjointed or maybe i don seem to be able to interprete it. I was hoping to get results showing ,e "a" "b" and "C" values which will help me calculate Efective gas produced using the model.
A few comments:
1) This code looks like it might have been written a long time ago. It uses ALL CAPS and PROC PLOT, which was the old "line printer" plot routine
2) For more a modern graph, put the following call after the PROC SORT call (you can get rid of PROC PLOT):
proc sgplot data=WORK.ACC;
scatter y = IVGP x=TIME / group=species;
loess y =IVGP x=time / group=species; /* estimate curves */
run;
3) To get the correct output in a data set from PROC NLIN, put spaces between the parameters on the OUTPUT statement PARMS= option. Also, don't overwrite your input data. Use the following statement to create a new data set called NlinOut.
OUTPUT OUT = NlinOut PARMS = A B C;
4) The rest of the code can be modified to use the new data and not overwrite the input data:
DATA NlinOut;
SET NlinOut;
PD = A + B;
ED = A + B*C/(C+0.05);
RUN;
PROC MEANS MEAN NOPRINT DATA = NlinOut;
BY SPECIES;
VAR A B C PD ED;
OUTPUT OUT =MeanOut MEAN = A B C PD ED;
RUN;
PROC PRINT DATA =MeanOut;
RUN;
I have not addressed the lack of convergence, but the previous changes are important just so that someone can make sense of what analysis you are trying to conduct.
Are you trying to find a best A,B,C which are at some special range to best fit the function : IVGP = A + B*(1-EXP(-C*TIME)) ?
You can use other Nonlinear Programming Function like NLPTR() to get it . OR Genetic Algorithm to get it.
Check Rick's blog: (how to find out the best Mu,Sigma estimator for Normal distribution)
http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml.html
Thanks for your response. I want to get the best A, B, C values to model IVGP. using a non linear regression model. Can you help me please.?
Data ACC1;
Input Time SPECIES$ IVGP;
Cards;
2 C1 10.49293
2 C1 8.328766
2 C11 5.771113
2 C11 4.787401
2 CN2 6.164598
2 CN2 5.377628
2 CN21 4.787401
2 CN21 3.606946
2 CN3 1.836263
2 CN3 1.04687
2 CN31 3.881162
2 CN31 3.150836
2 U2 8.328766
2 U2 9.641
2 U21 2.426491
2 U21 2.623233
2 U3 9.312478
2 U3 7.148311
2 U31 4.197173
2 U31 6.558083
2 U1 9.509221
2 U1 11.87013
2 U11 7.854721
2 U11 9.014867
2 CN1 12.26362
2 CN1 10.29619
2 CN11 5.565581
2 CN11 5.328766
4 C1 15.72311
4 C1 13.19804
4 C11 9.837126
4 C11 8.263186
4 CN2 8.853413
4 CN2 6.885988
4 CN21 5.705533
4 CN21 6.689246
4 CN3 3.48732
4 CN3 2.34672
4 CN31 6.840228
4 CN31 6.696743
4 U2 10.6241
4 U2 11.30363
4 U21 5.705533
4 U21 5.902276
4 U3 13.57523
4 U3 12.39478
4 U31 7.082731
4 U31 11.41107
4 U1 16.34592
4 U1 18.08402
4 U11 13.65219
4 U11 14.99125
4 CN1 19.08402
4 CN1 17.1166
4 CN11 10.78808
4 CN11 8.165461
8 C1 24.95372
8 C1 22.06774
8 C11 15.32963
8 C11 14.77198
8 CN2 15.54266
8 CN2 12.00129
8 CN21 9.066443
8 CN21 11.00129
8 CN3 8.129712
8 CN3 7.697125
8 CN31 9.918563
8 CN31 8.754396
8 U2 18.90357
8 U2 20.30325
8 U21 11.80455
8 U21 12.59152
8 U3 19.87099
8 U3 19.47751
8 U31 14.16546
8 U31 18.4938
8 U1 23.05145
8 U1 24.3635
8 U11 19.87432
8 U11 20.14679
8 CN1 24.19933
8 CN1 24.00259
8 CN11 13.45993
8 CN11 14.28077
12 C1 36.59454
12 C1 35.74112
12 C11 25.05188
12 C11 24.65839
12 CN2 26.62582
12 CN2 20.72354
12 CN21 17.64081
12 CN21 17.51051
12 CN3 19.73983
12 CN3 15.01801
12 CN31 13.24733
12 CN31 10.68968
12 U2 32.7737
12 U2 33.14078
12 U21 23.87142
12 U21 24.26491
12 U3 29.97044
12 U3 31.34764
12 U31 27.41279
12 U31 32.33135
12 U1 35.72484
12 U1 36.41408
12 U11 29.05188
12 U11 32.2662
12 CN1 36.85643
12 CN1 35.67597
12 CN11 24.78869
12 CN11 23.34764
24 C1 75.56455
24 C1 73.50069
24 C11 60.85901
24 C11 60.07204
24 CN2 58.69485
24 CN2 50.23492
24 CN21 48.44666
24 CN21 49.51439
24 CN3 50.82515
24 CN3 46.10333
24 CN31 45.28378
24 CN31 43.97173
24 U2 61.61212
24 U2 61.19814
24 U21 64.59712
24 U21 64.00689
24 U3 64.94174
24 U3 63.46553
24 U31 70.69614
24 U31 72.86031
24 U1 72.33523
24 U1 72.59841
24 U11 64.28507
24 U11 66.28637
24 CN1 74.63099
24 CN1 72.86031
24 CN11 58.72871
24 CN11 58.26749
48 C1 144.8848
48 C1 143.2118
48 C11 117.7669
48 C11 115.832
48 CN2 90.09264
48 CN2 90.58256
48 CN21 88.82816
48 CN21 90.66658
48 CN3 84.20536
48 CN3 90.96105
48 CN31 119.1278
48 CN31 118.9962
48 U2 92.84445
48 U2 90.86418
48 U21 149.163
48 U21 151.3284
48 U3 131.282
48 U3 130.7127
48 U31 130.178
48 U31 129.9812
48 U1 121.5213
48 U1 125.6529
48 U11 139.2144
48 U11 141.7995
48 CN1 145.0177
48 CN1 143.0789
48 CN11 122.1128
48 CN11 120.9298
;
PROC SORT DATA = WORK.ACC1;
BY SPECIES;
RUN;
proc sgplot data=WORK.ACC;
scatter y = IVGP x=TIME / group=species;
loess y =IVGP x=time / group=species; /* estimate curves */
run;
PROC NLIN BEST = 9; BY SPECIES;
PARMS A = 9 TO 15 BY 1
B = 39 TO 44 BY 1
C = 0.03 TO 0.05 BY 0.005;
MODEL IVGP = A + B*(1-EXP(-C*TIME));
OUTPUT OUT = NlinOut PARMS = A B C;
RUN;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.