I am analyzing counts of insurance claims paid, by the year in which the accident occurred and the year in which the claim was paid. For example, policyholder A may have had 5 claims in total, 3 for accidents in 2008 and 2 for accidents in 2009. Of the three claims for accidents occurring in 2008, one was paid in 2008, another in 2009, and the third in 2010. Of the two claims for accidents occurring in 2009, one was paid in 2009 and the other in 2010.
I am having trouble figuring out how to order these observations. Clearly the claim for an accident occurring in 2008 that was paid in 2008 should go first. Then we have 2 claims paid in 2009: one from an accident that occurred in 2008 and another from an accident that occurred in 2009. Which of those two claims should come next? Also, the lag between these two claims is less than the lag between each of them and the claim paid in 2008 for an accident that occurred in 2008. Would this cause a problem?
Perhaps a better question would be: is it necessary for the subject and withinsubject variables together to identify a unique observation? In the previous example, if I let the policyholder be the subject and the year the claim was paid be the withinsubject identifier, there would be cases with more than one observation having the same values for the subject and withinsubject variables. For example, there would be two observations with policy id A and 2009 as the year in which the claim was paid. Is that ok?
Never mind, I just realized the withinsubject variable must have as many levels as there are observations for the subject.
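For anyone checking the same thing on their own data, here is a minimal sketch in Python. The records, the function names and the sort order are all assumptions made for illustration: the data are the policyholder A example above, and breaking ties on paid year by accident year is an arbitrary choice, not a recommendation. It shows that paid year alone repeats within the subject, and that a running index gives one level per observation.

```python
from collections import Counter

# Hypothetical records for policyholder A: (policy_id, accident_year, paid_year)
claims = [
    ("A", 2008, 2008),
    ("A", 2008, 2009),
    ("A", 2008, 2010),
    ("A", 2009, 2009),
    ("A", 2009, 2010),
]

def duplicate_keys(rows, key):
    """Return the keys that occur on more than one row."""
    counts = Counter(key(r) for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)

def add_within_index(rows):
    """Number the observations within each subject: 1, 2, 3, ...

    Rows are ordered by paid year, then accident year. The tie-break
    is an assumption made only so the numbering is deterministic.
    """
    position = Counter()
    indexed = []
    for policy_id, accident_year, paid_year in sorted(
        rows, key=lambda r: (r[0], r[2], r[1])
    ):
        position[policy_id] += 1
        indexed.append((policy_id, position[policy_id], accident_year, paid_year))
    return indexed

# Paid year alone does not identify an observation within a policyholder.
by_paid_year = duplicate_keys(claims, key=lambda r: (r[0], r[2]))

# With a running index, every (subject, within-subject) pair is unique.
indexed_claims = add_within_index(claims)
by_index = duplicate_keys(indexed_claims, key=lambda r: (r[0], r[1]))

print("duplicates using paid year:", by_paid_year)
print("duplicates using running index:", by_index)
```

Whether the index should sort by paid year first or by accident year first depends on the correlation structure you want to assume, which this sketch does not settle.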