I'm not sure where to look for help with this problem. I don't even know the right search terms for it.
First let me describe the analysis.
I have land use data from satellite imagery (individual pixels or cells) for the years 1985 and 1990. I am recoding development as 1 and non-development (forest, etc.) as 0. I am attempting to predict the probability that a cell transitions from 0 to 1 over these 5 years as a function of several variables (slope, etc.). I'm quite sure there will be spatial autocorrelation, since development breeds new development. I know of ways to deal with this (autoregression, mixed models). I also expect a time dependency, if this is the correct term, for the following reason: development that occurs in 1986 will affect future time periods, development in 1987 will do the same, and so on. So, the observations of the dependent variable are not independent of each other in either space or time. I do not know from the data what year a particular cell was developed, only that it occurred between 1985 and 1990. So, development will occur near other development that occurred prior to it, but I don't know the time ordering other than to say that development either occurred before 1985 or between 1985 and 1990.
I can think of a way to do this via simulation by starting at 1985 and assuming a basic relationship between development, past development, and other covariates (like slope). I would then apply this relationship to the landscape as of 1985 to assign probabilities of development. Next, I could develop some cells using a random number generator (creating a binary landscape that is a possible realization of the model) and repeat the process for 1986 and each year until 1990. I could do this many times and see the frequency with which each cell was developed and use this to estimate the likelihood. I could then adjust the functions in an attempt to maximize likelihood. This is very brute-force and I don't want to do it, if avoidable.
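To make the simulation idea above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the logistic form, the coefficient tuple `beta`, the 8-cell neighbourhood, the rule that developed cells never revert, and all function names. The likelihood is only a crude per-cell approximation that treats cells as independent given their simulated frequencies, so take it as a starting point rather than a finished estimator.

```python
import math
import random

def neighbor_fraction(grid, r, c):
    """Fraction of the (up to 8) surrounding cells that are developed."""
    rows, cols = len(grid), len(grid[0])
    total = count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                total += grid[rr][cc]
                count += 1
    return total / count if count else 0.0

def simulate_once(start, slope, beta, years, rng):
    """One possible realization: step the landscape forward year by year."""
    b0, b_slope, b_nbr = beta  # hypothetical coefficients
    grid = [row[:] for row in start]
    for _ in range(years):
        nxt = [row[:] for row in grid]
        for r, row in enumerate(grid):
            for c, dev in enumerate(row):
                if dev:
                    continue  # assumption: development is irreversible
                z = b0 + b_slope * slope[r][c] + b_nbr * neighbor_fraction(grid, r, c)
                if rng.random() < 1.0 / (1.0 + math.exp(-z)):
                    nxt[r][c] = 1
        grid = nxt
    return grid

def development_frequency(start, slope, beta, years=5, n_sims=200, seed=0):
    """Share of simulated landscapes in which each cell ends up developed."""
    rng = random.Random(seed)
    rows, cols = len(start), len(start[0])
    counts = [[0] * cols for _ in range(rows)]
    for _ in range(n_sims):
        end = simulate_once(start, slope, beta, years, rng)
        for r in range(rows):
            for c in range(cols):
                counts[r][c] += end[r][c]
    return [[counts[r][c] / n_sims for c in range(cols)] for r in range(rows)]

def log_likelihood(freq, observed, eps=1e-6):
    """Approximate log-likelihood of the observed end-year map."""
    ll = 0.0
    for f_row, o_row in zip(freq, observed):
        for f, o in zip(f_row, o_row):
            p = min(max(f, eps), 1.0 - eps)  # keep log() finite
            ll += math.log(p) if o else math.log(1.0 - p)
    return ll
```

One would then call `development_frequency` with the 1985 map for several candidate `beta` values and keep the one with the highest `log_likelihood` against the 1990 map. That outer search is the brute-force part, and it is not shown here.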
If you had the data for each year you could just recode and augment the development variables as lagged variables for each subsequent year and move forward.
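As a rough illustration of that lagged-variable setup, here is a small sketch that assumes you had one 0/1 grid per year (which you don't for 1986-1989). The `yearly_grids` layout, the field names, and the neighbour count as the lagged predictor are all hypothetical choices, not a prescribed method.

```python
def developed_neighbors(grid, r, c):
    """Count developed cells among the (up to 8) neighbours of (r, c)."""
    rows, cols = len(grid), len(grid[0])
    return sum(
        grid[r + dr][c + dc]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols
    )

def lagged_rows(yearly_grids):
    """Build one record per cell-year for cells still undeveloped at t-1.

    yearly_grids: dict mapping year -> 2D list of 0/1 (hypothetical input).
    The predictor is taken from the previous year, the outcome from the
    current year.
    """
    records = []
    years = sorted(yearly_grids)
    for prev, cur in zip(years, years[1:]):
        g_prev, g_cur = yearly_grids[prev], yearly_grids[cur]
        for r, row in enumerate(g_prev):
            for c, dev_prev in enumerate(row):
                if dev_prev:
                    continue  # already developed, no transition left to model
                records.append({
                    "year": cur,
                    "row": r,
                    "col": c,
                    "lag_neighbors": developed_neighbors(g_prev, r, c),
                    "developed": g_cur[r][c],
                })
    return records
```

Each record could then be fed to whatever binary-response model you prefer, with `developed` as the outcome and `lag_neighbors` alongside slope and the other covariates.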
However, since you do not have the data for 1986-1989, the problem with any approach you use is the simple fact that you (I assume) have no points of verification other than the 1985 and 1990 points.
What I am getting at is that, for example, there is no way to confirm the difference in these two temporal development patterns:
No more than you can use the data to tell the difference in a spatial area between:
A. Slow steady growth of effective radius by 10 miles each year
B. Rapid expansion of effective radius of 60 miles in 1987 followed by no more growth.
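A tiny sketch of why A and B cannot be told apart from the end points alone. It assumes growth is tallied over the six calendar years 1985 through 1990, which is what makes the two totals agree. The variable names are made up for illustration.

```python
from itertools import accumulate

YEARS = [1985, 1986, 1987, 1988, 1989, 1990]

# Miles of radius added in each year under the two scenarios.
steady = [10, 10, 10, 10, 10, 10]   # scenario A
burst = [0, 0, 60, 0, 0, 0]         # scenario B

def cumulative(growth):
    """Cumulative radius added by the end of each year."""
    return dict(zip(YEARS, accumulate(growth)))

path_a = cumulative(steady)
path_b = cumulative(burst)

# The final totals agree, so an end-of-period image cannot separate A from B.
same_endpoint = path_a[1990] == path_b[1990]

# The years where the two histories disagree are the ones you cannot observe.
differing_years = [y for y in YEARS if path_a[y] != path_b[y]]
```

Both paths end at the same total, yet they disagree in every earlier year, which is exactly the information the two snapshots do not contain.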
I feel for you; this is a classic example of being asked to build a path-dependent analysis with no verification points on the path other than the start and the end. You could build any model you want, and no one could challenge its validity.
Which leads me to my next but related question: how are you going to know you are right in your analysis? You have no points of verification. For example, how could you possibly know if you way overestimated the probability in 1986 and way underestimated it in 1987? Since you are only looking at 1985 and 1990, you don't know how well you did in each year without some sort of external verification.
As for the constructive part of my response: you could augment the data with lagged variables and run correlations on them, if you had a means to verify.
Thanks so much for the thoughtful reply. One thing I neglected to mention:
I have land use data for the years 1985, 1990, 1995, and 2002. I mentioned only the first two because I was planning to use the 1985 to 1990 change to calibrate the model and the 1990 to 2002 development change as a validation check. Given your final feedback, I thought that knowing about the other images might change your reply. Also, are there any good introductions to path-dependent analysis? I've searched a bit online to no avail. Thanks again.
Markov chains don't really apply here. The present state doesn't capture all the information about the past states. You can't trend an expansion or growth process with Markov chains.
For example, if your population is 50,000 right now, you cannot use Markov analysis to predict the population next year. You have no idea if it has been growing by 10% each year, or declining by 50% each year.
The problem with this type of analysis is that you are being asked to predict the path given only the two ends, with nothing to verify in between.
Last week Monday I was in New York, this week Friday I am in London. Tell me where I was Wednesday of last week? Good luck.