I am facing a dilemma with fixing a data set. The raw data I have received is in a SQL file that I have no capability of changing (or at least they are not letting me change it, just to see how I can manage things with SAS). My data set looks like this, with around 700,000 observations:

PRIMARY KEY | ADDRESS         | SEX | AGE | ZIP   | DISEASE | EVENT DATE
AA12345     | 1 SAMPLE ST, NY | M   | 20  | 00000 | x       | 03/01/2008
AA12345     | 1 SAMPLE ST, NY | M   | 20  | 00000 | x       | 03/01/2008
AA12345     | 1 SAMPLE ST, NY | M   | 20  | 00000 | x       | 03/01/2008
            | 2 SAMPLE ST, FL | F   | 21  | 12345 | Y       | 04/01/2008
            | 2 SAMPLE ST, FL | F   | 21  | 12345 | Y       | 04/01/2008
            | 3 SAMPLE ST, CA | F   | 22  | 22222 | Y       | 05/01/2008
BD6789      |                 | M   | 19  | 33333 | Z       | 06/01/2008
AA12345     | 1 SAMPLE ST, NY | M   | 20  | 00000 | x       | 03/01/2008
            | 3 SAMPLE ST, CA | F   | 22  | 22222 | Y       | 05/01/2008

The data has a lot of duplicates and I want to get rid of them. As you can see, some records have primary keys and some do not. Some have addresses and some do not. The only fields that every record has are sex, age, zip, disease and event date.

I tried sorting the data with the code below, but to no avail. Please help me, the SAS newbie.

PROC SORT; DATA = NEW NODUPRECS;
BY _ALL_;
RUN;
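
For reference, here is a minimal sketch of what the step above may have been meant to do. It is not a confirmed fix. In PROC SORT, the DATA= and NODUPRECS options belong on the PROC SORT statement itself, before its first semicolon. The output data set names NEW_NODUP and NEW_BYSHARED and the variable names (in particular EVENT_DATE) are hypothetical, since the real column names are not shown. The second step is only one possible approach: it keeps one row per combination of the five fields that every record has, which would also merge genuinely different people who happen to share all five values.

```sas
/* Step 1: drop rows that are identical in every column.          */
/* NEW_NODUP is a hypothetical output name; NEW is left untouched. */
proc sort data=new out=new_nodup noduprecs;
    by _all_;
run;

/* Step 2 (optional, assumption): keep one row per combination of */
/* the fields every record has. Variable names are assumed.       */
proc sort data=new_nodup out=new_byshared nodupkey;
    by sex age zip disease event_date;
run;
```

Writing to a separate OUT= data set keeps the original 700,000 observations available for comparing row counts before and after.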