Is it possible to use a Random Forest on repeated measures? I have Medicare Part D claims data for the years 2013-2017, and I am interested in finding predictors of a specific type of prescription. The dataset includes every physician for those 5 years, with the variables measured each year. I want to cluster the data at the physician level (using NPI codes) and perhaps by state. If this isn't possible, I'll likely just choose my variables based on the literature. The goal is to use RF to choose my independent variables and then use a zero-inflated negative binomial (ZINB) regression or Poisson regression for the analysis/interpretation (likely ZINB, since the dependent variable has quite a few zeroes). Dependent variable: claim counts for a specific drug (count data).
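
To make concrete what I mean by clustering at the physician level, here is a minimal sketch (plain Python, no RF fitted) of splitting rows into cross-validation folds so that all yearly rows for one NPI stay in the same fold. The function names `grouped_folds` and `train_test_splits` and the toy NPI values are hypothetical, made up for illustration; this is only one possible way to respect the repeated measures when evaluating a model, not necessarily the right one.

```python
from collections import defaultdict


def grouped_folds(npis, n_folds=5):
    """Assign row indices to folds so rows sharing an NPI land in the same fold."""
    rows_by_npi = defaultdict(list)
    for row_idx, npi in enumerate(npis):
        rows_by_npi[npi].append(row_idx)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: place physicians with the most rows first,
    # each into whichever fold currently holds the fewest rows.
    for npi in sorted(rows_by_npi, key=lambda k: (-len(rows_by_npi[k]), k)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(rows_by_npi[npi])
    return folds


def train_test_splits(npis, n_folds=5):
    """Yield (train_idx, test_idx) pairs with no NPI shared between the two."""
    folds = grouped_folds(npis, n_folds)
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield sorted(train_idx), sorted(test_idx)


# Toy example (assumed data): 4 physicians, one row per year for 2013-2017.
npis = ["A"] * 5 + ["B"] * 5 + ["C"] * 5 + ["D"] * 5
for train_idx, test_idx in train_test_splits(npis, n_folds=2):
    train_npis = {npis[i] for i in train_idx}
    test_npis = {npis[i] for i in test_idx}
    # No physician appears on both sides of the split.
    assert train_npis.isdisjoint(test_npis)
```

As far as I can tell, scikit-learn's `GroupKFold` implements the same idea, so in practice I would probably pass the NPI column as `groups` there rather than roll my own.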
