I have a data set that contains cases and matched controls (most cases are matched 1:4 with controls). Each case-control group shares the same ID number, followed by "X" for the case and "A", "B", "C", or "D" for the controls. Example: GI001X, GI001A, GI001B, GI001C. I would like to make a new data set that contains only the giardia cases and their matching controls. Is there a way to subset my data based on ID number? Basically, I want to keep observations where "Giardia"=1 and "Case"=1 (this combination denotes giardia cases), as well as the matching controls, for which "Case"=0. The tricky part is that the ID numbers are not all the same length: they range from 2 to 4 digits, followed by X or A, B, C, D.
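To make the goal concrete, here is a rough sketch of the logic I have in mind, written in Python only for illustration. The sample rows are made up, and I am assuming that the matched-group key is simply the ID with its last character removed (which would handle the varying ID lengths) and that controls have "Giardia"=0. Only the variable names ID, Giardia, and Case come from my data.

```python
# Hypothetical sample rows; only the column names come from the real data.
rows = [
    {"ID": "GI001X", "Giardia": 1, "Case": 1},   # giardia case
    {"ID": "GI001A", "Giardia": 0, "Case": 0},   # its controls
    {"ID": "GI001B", "Giardia": 0, "Case": 0},
    {"ID": "GI12X", "Giardia": 0, "Case": 1},    # non-giardia case
    {"ID": "GI12A", "Giardia": 0, "Case": 0},
    {"ID": "GI0457X", "Giardia": 1, "Case": 1},  # giardia case, longer ID
    {"ID": "GI0457D", "Giardia": 0, "Case": 0},
]


def group_key(obs_id):
    """Assumed group key: the ID without its trailing X/A/B/C/D letter."""
    return obs_id[:-1]


# Step 1: collect the group keys of all giardia cases.
giardia_groups = {
    group_key(r["ID"]) for r in rows if r["Giardia"] == 1 and r["Case"] == 1
}

# Step 2: keep every observation (case or control) in one of those groups.
subset = [r for r in rows if group_key(r["ID"]) in giardia_groups]

for r in subset:
    print(r["ID"], r["Giardia"], r["Case"])
```

Stripping the last character rather than taking a fixed-width substring is what lets the same rule work for 2-, 3-, and 4-digit IDs.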