The warning arises because, taken together, your independent variables have more levels than you have observations. You will have to reduce your model in some fashion. In particular, fitting an unstructured working correlation matrix adds n*(n-1)/2 parameters to estimate, where n is the number of repeated measurements per subject (45 parameters for n = 10, for example), on top of all of the parameters for the independent variables.
On a different note, if you wish to accommodate the clustering implied by site and patient, you should consider using PROC GLIMMIX, with those as random effects. There is a lot of material both here and elsewhere on the web regarding good code for that approach. Still, the key is identifying a tractable model.
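As a rough starting point, a random-intercept model along those lines might look like the sketch below. Everything specific in it is an assumption made for illustration: the data set name (work.trial), the variable names (response, treatment, site, patient), the nesting of patients within sites, and the binomial distribution with a logit link. Substitute whatever matches your data and response type.

```sas
/* Minimal sketch only: data set, variables, and distribution are hypothetical. */
proc glimmix data=work.trial;
  /* Classification variables, including the clustering factors */
  class site patient treatment;

  /* Keep the fixed-effects part small so the model stays tractable. */
  /* dist= and link= are assumptions; choose them to suit the response. */
  model response = treatment / dist=binomial link=logit solution;

  /* Random intercepts for site and for patient nested within site */
  random intercept / subject=site;
  random intercept / subject=patient(site);
run;
```

Two variance components replace the n*(n-1)/2 parameters of an unstructured matrix, which is what keeps this kind of model estimable when observations are scarce.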
Steve Denham