John Doe        EGFR RS2R
John Doe        EGFR 5539
Jane Williams   BRCA1 6006
Jane Williams   BRCA1 2002
Tom Ford        BRCA1 4008
Tom Ford        BRCA1 2343
Tom Ford        BRCA1 2343
Tom Ford        EGFR 6382
Luis Mo         ALK1 8373
Luis Mo         EGFR 3378
Katie Lu        BRCA1 3873
katie Lu        EGFR 8739

This is a tiny example of what my dataset looks like. I have 2,000 different individuals, each with their own set of mutations. I want to find the frequency of each mutated gene (e.g. EGFR, BRCA1, etc.) in my total population, regardless of where on the gene the mutation is located (so I don't care about the numbers).

I am looking for an easy way to find the frequency of EGFR mutations by grouping all the "EGFR XXXX" values into one category (EGFR), all the "BRCA1 XXXX" values into another, and so on, without having to do it manually for 2,000 people. I would like my output to show the percentage of each mutation in my population, using the new variable I create.

I'm sorry if my explanation is poor. As you can see, I am definitely a SAS beginner and have not mastered much of the data management side yet. Thank you!
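
To make the request concrete, here is a rough sketch of the grouping described above. It assumes the gene name and position are stored together in one character variable, with the gene name as the first space-separated word. The dataset names (`mutations`, `genes`) and variable names (`name`, `mutation`, `gene`) are hypothetical, and only a few of the sample rows are typed in.

```sas
/* Sketch only: dataset and variable names are assumptions, not the real data. */
data mutations;
  infile datalines dlm=',' truncover;
  input name :$20. mutation :$20.;
  datalines;
John Doe,EGFR 5539
Jane Williams,BRCA1 6006
Tom Ford,BRCA1 4008
Tom Ford,EGFR 6382
Luis Mo,ALK1 8373
;
run;

/* Keep only the first word of the mutation string (the gene name),
   upper-cased so that differently cased values fall into one category. */
data genes;
  set mutations;
  length gene $10;
  gene = upcase(scan(mutation, 1, ' '));
run;

/* One-way frequency table: count and percent for each gene. */
proc freq data=genes;
  tables gene;
run;
```

The percentages from this PROC FREQ step are out of all mutation records, not out of individuals, so a person with several mutations in the same gene is counted more than once. If the percentage should be per person, the records would need to be reduced to one row per person and gene first, and the denominator set to the number of individuals.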