Hi,
I turn to SAS University Edition to create a Geographically Weighted Negative Binomial regression (GWNBR) model after realising there is currently no available R package on it. I went through the paper by Silva and Rodrigues (2016) and obtained the complete SAS code for GWNBR from Silva's post. I imported the .csv data into SAS studio successfully, but was unable to complete the calculation of optimal bandwidth using the Golden Section Search codes provided by Silva, as nothing is displayed in the RESULTS section. I checked the LOG section and no lines are highlighted in blue (which shows errors in codes), but I notice two sentences which I do not understand their meaning:
1. OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; [first line of LOG]
2. NOTE: Unable to open parameter catalog: SASUSER.PARMS.PARMS.SLIST in update mode. Temporary parameter values will be saved to WORK.PARMS.PARMS.SLIST. [last line of LOG]
A few questions regarding the situation above:
1. Did I miss out something in the Golden Section Search codes? If not, how can I obtain the result, i.e., the value of optimal bandwidth?
2. The .csv data contains 28 thousand data points with nearly 20 independent variables. Is 8GB RAM able to handle the calculation of bandwidth and NBGWR model?
I appreciate any help to guide me through.
***Full code is attached below
/*********************************************************/
/******************** IMPORT DATA ************************/
/*********************************************************/
%let path = /folders/myfolders/csv;
proc import datafile = "&path/SAS 3km.csv"
out = vdata
dbms = csv replace;
/*********************************************************/
/*************** CREATING PARAMETERS ESTIMATES************/
/*********************************************************/
%macro beta(par);
proc iml;
use _beta_;
read all into b;
close _beta_;
n=nrow(b);
npar=&par+1;
%do i=0 %to ∥
b&i=j(1,8,0);
nome={"id" "geocod" "x" "y" "b" "sebi" "tstat" "probtstat"};
create b&i from b&i[colname=nome];
do i=1 to (n/npar); * from i=1 to N;
b&i[1,]=b[(i-1)*npar+&i+1,];
append from b&i;
end;
%end;
quit;
%mend beta;
/*********************************************************/
/***** PARAMETERS FOR STATIONARITY TEST ******************/
/*********************************************************/
%macro vk(par);
proc iml;
use _beta_;
read all into b;
close _beta_;
use _alpha_;
read all var {alphai} into alpha;
close _alpha_;
n=nrow(b);
npar=&par+1;
n=n/npar;
vk=0;
%do i=0 %to ∥
b&i=j(n,1,0);
do i=1 to n;
b&i[i,1]=b[(i-1)*npar+&i+1,5];
end;
vk&i=sum((b&i - b&i[:] )##2)/n ;
vk=vk||vk&i;
%end;
vka= sum((alpha - alpha[:] )##2)/n ;
vk=vk||vka;
idx = setdif(1:(npar+2),1);
vk = vk[,idx];
create vk from vk;
append from vk;
quit;
%mend vk;
/*********************************************************/
/***** PERMUTATION FOR STATIONARITY TEST *****************/
/*********************************************************/
%macro perm(data=vdata,geocod=geocod,x=xcoord,y=ycoord);
proc iml;
use &data;
read all var{&geocod &x &y} into tab;
close &data;
n=nrow(tab);
u = 1:n;
call randgen(u, "Uniform");
_u_=rank(u);
create perm var{_u_};
append;
quit;
data perm; merge perm &data(drop= &geocod &x y) ; run;
proc sort data=perm; by _u_; run;
data perm; merge perm &data(keep=&geocod &x &y); run;
%mend perm;
/*********************************************************/
/***************** GOLDEN SECTION SEARCH *****************/
/*********************************************************/
%macro golden(
data=vdata,
y=FC,
x=RAIN MAJOR_D MINOR_D RIVER_P MAJOR_P MINOR_P SMALL_P PG_P LC_AG LC_SH ELE SLP POPDEN CATTLE GOAT GDP MAI,
lat=xcoord,
long=ycoord,
method=adaptive1,
type=cv,
gwr=local,
out=band);
proc iml;
use &data;
read all var {&y} into y;
read all var {&x} into x;
read all var{&long &lat} into COORD;
n=nrow(y);
%if &offset= %then %do; offset=j(n,1,0); %end;
%else %do; read all var {&offset} into offset; %end;
close &data;
x=j(n,1,1)||x;
method= "&method"; *fixed, adaptive1 ou adaptiven;
type="&type"; *aic , cv ou dev ;
gwr="&gwr"; *global, local, poisson;
print method type gwr;
start dist(coord,n);
d=j(1,3,0);
nome={"idi" "idj" "d"};
create _dist_ from d[colname=nome];
do i=1 to n;
do j=i+1 to n;
if abs(coord[,1])<180 then do;
dif=abs(COORD[i,1]-COORD[j,1]);
raio=arcos(-1)/180;
ang=sin(COORD[i,2]*raio)*sin(COORD[j,2]*raio)+cos(COORD[i,2]*raio)*cos(COORD[j,2]*raio)*cos(dif*raio);
arco=arcos(ang);
d[1]=i;
d[2]=j;
d[3]=arco*6371 /*Earth's Radius = 6371 (approximately)*/;
append from d;
end;
else do;
d[1]=i;
d[2]=j;
d[3]=sqrt((COORD[i,1]-COORD[j,1])**2+(COORD[i,2]-COORD[j,2])**2);
append from d;
end;
end;
end;
close _dist_;
finish dist;
run dist(coord,n);
use _dist_;
read all into d;
maxd=int(max(d[,3])+1);
free d;
close _dist_;
if method= "adaptive1" then do;
h0= 5 ; h3= n;
end;
else if method= "adaptiven" | method= "fixed" then do;
h0= 0 ; h3= maxd;
end;
r=0.61803399; c=1-r;
if method= "adaptive1" then tol=0.9; else tol=0.1;
h1=h0+(1-r)*(h3-h0);
h2=h0+r*(h3-h0);
print h0 h1 h2 h3;
start cv(h) global(method, n, coord, x, y, type, maxd, gwr, offset);
alphaii= j(n,2,0);
yhat=j(n,1,0);
S=j(n,n,0);
if gwr="global" then do;
ym=sum(y)/nrow(y);
u=(y+ym)/2;
n=log(u);
par=1; ddpar=1; j=0; aux2=0;
do while (abs(ddpar)>0.00001);
aux1=0; dpar=1; parold=par;
do while (abs(dpar)>0.001);
aux1=aux1+1;
if par<0 then do;
par=0.00001;
end;
par=choose(par<1E-10,1E-10,par);
g=sum(digamma(par+y)-digamma(par)+log(par)+1-log(par+u)-(par+y)/(par+u));
hess=sum(trigamma(par+y)-trigamma(par)+1/par-2/(par+u)+(y+par)/((par+u)#(par+u)));
hess=choose(abs(hess)<1E-23,sign(hess)*1E-23,hess);
hess=choose(hess=0,1E-23,hess);
par0=par;
par=par0-inv(hess)*g;
if aux1>50 & par>1E5 then do;
dpar= 0.0001;
aux2=aux2+1;
if aux2=1 then par=2 ;
else if aux2=2 then par=1E5;
else if aux2=3 then par=0.0001;
end;
else dpar=par-par0;
end;
a=1/par; dev=0; ddev=1; i=0;
do while (abs(ddev)>0.00001);
i=i+1;
w=(u/(1+a*u))+(y-u)#(a*u/(1+2*a*u+a*a*u#u));
z=n+(y-u)/(w#(1+a*u)) - offset;
b=inv((x#w)`*x)*(x#w)`*z;
n=x*b + offset;
u=exp(n);
olddev=dev;
tt=y/u;
tt=choose(tt=0,1E-10,tt);
dev=2*sum(y#log(tt)-(y+1/a)#log((1+a*y)/(1+a*u)));
ddev=dev-olddev;
end;
if aux2>4 then ddpar=1E-9;
else ddpar=par-parold;
end;
alpha=a;
end;
n=nrow(y);
aux2=0;
do i=1 to n;
d=j(1,3,0);
dist=d;
do j=1 to n;
if abs(coord[,1])<180 then do;
dif=abs(COORD[i,1]-COORD[j,1]);
raio=arcos(-1)/180;
ang=sin(COORD[i,2]*raio)*sin(COORD[j,2]*raio)+cos(COORD[i,2]*raio)*cos(COORD[j,2]*raio)*cos(dif*raio);
if i=j then arco=0;
else arco=arcos(ang);
d1=arco*6371;
end;
else d1=sqrt((COORD[i,1]-COORD[j,1])**2+(COORD[i,2]-COORD[j,2])**2);
d[1]=i; d[2]=j; d[3]=d1;
if j=1 then dist=d;
else dist=dist//d;
end;
u=nrow(dist);
w=j(u,1,0);
if method= "fixed" then do;
if type="cv" then do;
do jj=1 to u;
if dist[jj,3]<=maxd*0.8 & dist[jj,3]^=0 then w[jj]=exp(-0.5*(dist[jj,3]/h)**2);
else w[jj]= 0;
end;
end;
else do;
do jj=1 to u;
if dist[jj,3]<=maxd*0.8 then w[jj]=exp(-0.5*(dist[jj,3]/h)**2);
else w[jj]= 0;
end;
end;
end;
else if method= "adaptiven" then do;
if type="cv" then do;
do jj=1 to u;
if dist[jj,3]<=h & dist[jj,3]^=0 then w[jj]=(1-(dist[jj,3]/h)**2)**2;
else w[jj]= 0;
end;
end;
else do;
do jj=1 to u;
if dist[jj,3]<=h then w[jj]=(1-(dist[jj,3]/h)**2)**2;
else w[jj]= 0;
end;
end;
end;
else if method= "adaptive1" then do;
call sort(dist,{3});
dist=dist||(1:n)`;
w=j(n,2,0);
hn=dist[h,3];
if type="cv" then do;
do jj=1 to n;
if dist[jj,4]<= h & dist[jj,3]^=0 then w[jj,1]=(1-(dist[jj,3]/hn)**2)**2;
else w[jj,1]=0;
w[jj,2]=dist[jj,2];
end;
end;
else do;
do jj=1 to n;
if dist[jj,4]<=h then w[jj,1]=(1-(dist[jj,3]/hn)**2)**2;
else w[jj,1]=0;
w[jj,2]=dist[jj,2];
end;
end;
call sort(w,{2});
end;
wi=w[,1];
ym=sum(y)/nrow(y);
uj=(y+ym)/2;
nj=log(uj);
if i=1 | aux2=5 then par=1; else par=alphaii[i-1,2];
ddpar=1; jj=0; count=0; aux2=0;
do while (abs(ddpar)>0.000001);
aux1=0;
dpar=1;
parold=par;
if gwr="global" | gwr="poisson" then do;
dpar=0.00001;
if gwr= "global" then par=1/a;
end;
/* computing alpha=1/par, where par=theta */
do while (abs(dpar)>0.001);
aux1=aux1+1;
if gwr="local" then do;
par=choose(par<1E-10,1E-10,par);
g=sum((digamma(par+y)-digamma(par)+log(par)+1-log(par+uj)-(par+y)/(par+uj))#w[,1]);
hess=sum((trigamma(par+y)-trigamma(par)+1/par-2/(par+uj)+(y+par)/((par+uj)#(par+uj)))#w[,1]);
end;
hess=choose(abs(hess)<1E-23,sign(hess)*1E-23,hess);
hess=choose(hess=0,1E-23,hess);
par0=par;
par=par0-inv(hess)*g;
if par<=0 then do;
count=count+1;
if count<10 then par=0.000001;
else par=abs(par);
end;
if aux1>50 & par>1E5 then do;
dpar= 0.0001;
aux2=aux2+1;
if aux2=1 then par=2 ;
else if aux2=2 then par=1E5;
else if aux2=3 then par=0.0001;
end;
else do;
dpar=par-par0;
if par<1E-3 then dpar=dpar*100;
end;
end;
if gwr= "poisson" then alpha=0;
else alpha=1/par;
dev=0; ddev=1; cont=0;
/* computing beta */
do while (abs(ddev)>0.000001);
cont=cont+1;
uj=choose(uj>1E100,1E100,uj);
aux= (alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
Ai=(uj/(1+alpha*uj))+(y-uj)#aux;
Ai=choose(Ai<=0,1E-5,Ai);
zj=nj+(y-uj)/(Ai#(1+alpha*uj)) - offset;
if det(x`*(wi#Ai#x))=0 then bi=j(ncol(x),1,0);
else bi=inv(x`*(wi#Ai#x))*x`*(wi#Ai#zj);
nj=x*bi + offset;
nj=choose(nj>1E2,1E2,nj);
uj=exp(nj);
olddev=dev;
uj=choose(uj<1E-150,1E-150,uj);
tt=y/uj;
tt=choose(tt=0,1E-10,tt);
if gwr= "poisson" then dev=2*sum(y#log(tt)-(y-uj));
else dev=2*sum(y#log(tt)-(y+1/alpha)#log((1+alpha*y)/(1+alpha*uj)));
if cont>100 then ddev= 0.0000001;
else ddev=dev-olddev;
end;
jj=jj+1;
if gwr="global" | gwr="poisson" | aux2>4 | jj>50 | ddpar=0.0000001 then ddpar=1E-9;
else do;
ddpar=par-parold;
if par<1E-3 then ddpar=ddpar*100;
end;
end;
Ai2=(uj/(1+alpha*uj))+(y-uj)#(alpha*uj/(1+2*alpha*uj+alpha*alpha*uj#uj));
if Ai2[><,]<1E-5 then Ai2=choose(Ai2<1E-5,1E-5,Ai2);
Ai=Ai2;
if det(x`*(wi#Ai#x))=0 then S[i,]=j(1,n,0);
else S[i,]= x[i,]*inv(x`*(wi#Ai#x))*(x#wi#Ai)`;
yhat[i]=uj[i];
alphaii[i,1]=i;
alphaii[i,2]= alpha;
end;
alpha= alphaii[,2];
yhat=choose(yhat<1E-150,1E-150,yhat);
tt=y/yhat;
tt=choose(tt=0,1E-10,tt);
if gwr= "poisson" then dev=2*sum(y#log(tt)-(y-yhat));
else dev=2*sum(y#log(tt)-(y+1/alpha)#log((1+alpha#y)/(1+alpha#yhat)));
if gwr ^= "poisson" then do;
a2=y+1/alpha; b2=1/alpha; c2=y+1;
end;
else do;
a2=y; b2=1/(alpha+1e-8); c2=y+1;
a2=choose(a2=0,1E-10,a2);
end;
algamma=j(n,1,0); blgamma=j(n,1,0); clgamma=j(n,1,0);
do i=1 to nrow(y);
algamma[i]=lgamma(a2[i]); blgamma[i]=lgamma(b2[i]); clgamma[i]=lgamma(c2[i]);
end;
if gwr^="poisson" then do;
ll=sum(y#log(alpha#yhat)-(y+1/alpha)#log(1+alpha#yhat)+ algamma - blgamma - clgamma );
npar=trace(S)+1;
end;
else do;
ll=sum(-yhat+y#log(yhat)-clgamma);
npar=trace(S);
end;
/*AIC= 2*npar + dev;*/
AIC= 2*npar -2*ll;
AICC= AIC +(2*npar*(npar+1))/(n-npar-1);
CV=(y-yhat)`*(y-yhat);
res=cv||aicc||npar||dev;
return (res);
finish;
if type="cv" then do;
pos=1;
create &out var{h1 res1 h2 res2};
end;
else do;
if type="aic" then pos=2;
else pos=4;
create &out var{h1 res1 npar1 h2 res2 npar2};
end;
res1=cv(h1); npar1=res1[3]; res1=res1[pos];
res2=cv(h2); npar2=res2[3]; res2=res2[pos];
append;
do while(abs(h3-h0) > tol*2);
if res2<res1 then do;
h0=h1;
h1=h2;
h2=c*h1+r*h3;
res1=res2;
npar1=npar2;
res2=cv(h2);
npar2=res2[3];
res2=res2[pos];
end;
else do;
h3=h2;
h2=h1;
h1=c*h2+r*h0;
res2=res1;
npar2=npar1;
res1=cv(h1);
npar1=res1[3];
res1=res1[pos];
end;
append;
end;
if method= "adaptive1" then do;
xmin = (h3+h0)/2;
h2=ceil(xmin);
h1=floor(xmin);
golden1 = cv(h1);
g1= golden1[pos];
golden2= cv(h2);
g2= golden2[pos];
npar1=golden1[3];
res1=golden1[pos];
npar2=golden2[3];
res2=golden2[pos];
append;
if g1<g2 then do;
xmin=h1;
npar=golden1[3];
golden=g1;
end;
else do;
xmin=h2;
npar=golden2[3];
golden=g2;
end;
end;
else do;
xmin = (h3+h0)/2;
golden = cv(xmin);
npar=golden[3];
golden=golden[pos];
end;
h1 = xmin;
res1 = golden;
npar1=npar;
h2 = .;
res2 = .;
npar2=.;
append;
if type="cv" then print golden xmin;
else print golden xmin npar;
quit;
%mend golden;
In general when using someone else's macro you want to run the code with the OPTIONS MPRINT; turned on. That way any messages appear in better relation to the code and examine the log for items like 0 records returned.
Another thing is to check your data set after import. Proc Import will guess as to variable type and may get it wrong. So if the macro expects a given parameter to be numeric but the variable is character (or vice versa) then you can get unexpected or no results. Also the names of your variables might need to be double checked.
You show the code of the macro but no statements that actually call the macro. Only the definitions. So maybe that's an issue.
I would expect as a minimum:
%golden
after the macro definitions since you appear to have made all of the parameters default keyword parameter values.
It may help to show the LOG after running the macro.
Hi ballardw,
Thanks for the solution. I managed to run the code, but after one hour the process was terminated with an error message:
An exception was thrown while sending a packet to the peer.
Does it mean that my hardware could not support the calculation?
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.