I am working with a very large dataset relating fishing effort to spatial location. The sampling unit is the individual fishing boat; boats were surveyed on random days with the goal of sampling 20% of the population. For each surveyed boat, the variables recorded included the target fish species, how much was caught, how many anglers were aboard, how many days the boat was out fishing before returning to shore, and the "block" it was fishing in.

I want to look at summary statistics by block and species, but in numerous cases only one boat was recorded fishing in a given block. I am unsure whether I should drop blocks that have only one observation, or even all blocks with fewer than three observations.

On the one hand, it seems like those blocks should be dropped because of the small sample size and lack of replication. On the other hand, if the sampling unit is the boat and not the block, it seems like block is just another variable recorded for each boat, and those observations should not be dropped. Thank you in advance for any suggestions.
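To make the question concrete, here is a minimal sketch of the kind of summary I have in mind, in plain Python. Everything in it is an assumption for illustration: the field names, the made-up records, the use of catch per angler-day as the statistic, and the threshold of three boats. Rather than dropping sparse block/species cells up front, it keeps every boat and reports the number of boats alongside each mean, with a flag for cells below the threshold.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, one per surveyed boat (made-up values, not real data).
boats = [
    {"block": "A1", "species": "salmon", "catch": 12, "anglers": 3, "days": 1},
    {"block": "A1", "species": "salmon", "catch": 8, "anglers": 2, "days": 2},
    {"block": "A1", "species": "salmon", "catch": 6, "anglers": 2, "days": 1},
    {"block": "A1", "species": "halibut", "catch": 4, "anglers": 2, "days": 1},
    {"block": "A1", "species": "halibut", "catch": 9, "anglers": 3, "days": 1},
    {"block": "B2", "species": "salmon", "catch": 10, "anglers": 5, "days": 1},
]


def summarize(records, min_n=3):
    """Group boats by (block, species) and report n, mean rate, and a sparse flag."""
    groups = defaultdict(list)
    for r in records:
        # Assumed effort measure: angler-days for this boat.
        effort = r["anglers"] * r["days"]
        groups[(r["block"], r["species"])].append(r["catch"] / effort)

    summary = {}
    for key, rates in sorted(groups.items()):
        summary[key] = {
            "n": len(rates),              # number of boats in this cell
            "mean_rate": mean(rates),     # mean catch per angler-day
            "sparse": len(rates) < min_n, # flagged, not dropped
        }
    return summary


summary = summarize(boats)
for (block, species), stats in summary.items():
    print(block, species, stats)
```

With this layout the single-boat blocks stay in the table, and whether to exclude them becomes a choice made when reporting rather than when cleaning the data.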