I have a data set that consists of test-day milk yield, butterfat (%) and protein (%) records in dairy cattle. To clarify, my data (variables) look as follows:

animal   days_in_milk  milk_yield  butterfat_%  protein_%
1234567  35            30.8        3.51         3.11
1234567  65            39.2        3.32         3.09
1234567  95            38.5        3.21         3.02
1234567  125           32.7        3.15         3.06
1234567  125           32.7        3.13         3.06
1234567  155           30.2        3.05         3.10
1234567  185           28.2        3.08         3.12

Where:
animal is the unique id number of the cow,
days_in_milk (dim) was calculated as test date minus calving date,
milk_yield is the yield on the applicable test day,
butterfat_% is the butterfat percentage of the milk sample on the applicable test day,
protein_% is the protein percentage of the milk sample on the applicable test day.

My problem is the two records at 125 days in milk. In my data a cow has multiple records, one for each test date (hence different days in milk, or dim). But there are mistakes, like these two records, where the same cow has two records for the same dim.

Clicking on "select distinct rows only" in the query builder does not solve the problem, because it treats the two records as different (while one is actually a duplicate of the other): only the butterfat % differs slightly (maybe because of a mistake?).

How do I remove ONE of these records using SAS Enterprise Guide 7.1 (or 6.1)? Unfortunately, I am out of my depth with programming. Please help! 🙂