02-07-2014 08:45 AM
OK, bear with me here. I have only been using SAS for 2 weeks, so I realise some of my sentences will sound like someone trying to learn a new language.
But here goes.
I have gathered data from our kennel club: one data set containing veterinary information (whether the dog has a specific disease, PL = patellar luxation, and if so, which degree, 0-3), and another 14 data sets, one for each of 14 breeds, containing names, registration numbers, lineage, etc. I ended up with a data set containing 250 000 observations and 14 variables.
I have merged these files by breed code, registration number, birth date and sex.
I have done some other ifs and formats to get what I want. So far so good.
I proceeded to see how many examinations were done by each veterinarian (variable called clinic). I also wanted to see the frequencies of degrees (variable = degree) for each veterinarian.
So I wrote the following:
proc sort data=plallbreeds;
by clinic degree;
run;
proc freq data=plallbreeds order=freq;
by clinic;
tables degree;
title 'bla bla bla';
run;
And again, so far so good! The only problem now is that I have 390 veterinarians in my end result, and many of these have done fewer than 10 exams.
I would like to be able to exclude these from my output so I can get a better overview of the results.
I have searched and searched but I just can't get the code right to do this.
Can someone here help me?
02-07-2014 09:23 AM
If you just want the basic frequency count, you could use SQL:
proc sql;
select clinic, degree, count(*) as Freq
from plallbreeds
group by clinic, degree
order by clinic, degree;
quit;
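To also leave out the veterinarians with fewer than 10 exams in total, one option is a subquery with HAVING. This is only a sketch, assuming the merged data set is called plallbreeds as in the question and that "fewer than 10 exams" means fewer than 10 rows per clinic:

```sas
proc sql;
  /* Keep only clinics that have at least 10 rows overall, */
  /* then count the rows per clinic and degree.            */
  select clinic, degree, count(*) as Freq
  from plallbreeds
  where clinic in
    (select clinic
     from plallbreeds
     group by clinic
     having count(*) >= 10)
  group by clinic, degree
  order by clinic, degree;
quit;
```

The threshold of 10 comes from the question; change it in the HAVING clause if a different cut-off is wanted.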
02-07-2014 10:51 AM
If you are happy with what you have so far, except that you want to eliminate the low counts, here is a way to modify the approach. Instead of printing the report with PROC FREQ, create a data set holding the results. The modification would be:
tables degree / noprint out=clinic_counts;
Then use PROC PRINT to print selected observations. For example:
proc print data=clinic_counts;
where count >= 10;
*by, var, title statements as appropriate;
run;
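Note that the WHERE above filters on the count for each degree, not on the total per clinic. If the aim is instead to drop whole clinics with fewer than 10 exams, something along these lines might work. It is a rough sketch: clinic_totals and pl_busy are made-up data set names, and plallbreeds is assumed to be the merged data set from the question.

```sas
/* Step 1: total number of exams per clinic (COUNT is created by OUT=) */
proc freq data=plallbreeds noprint;
  tables clinic / out=clinic_totals;
run;

/* Step 2: keep only rows from clinics with at least 10 exams */
proc sql;
  create table pl_busy as
  select a.*
  from plallbreeds as a
  where a.clinic in
    (select clinic from clinic_totals where count >= 10);
quit;

/* Step 3: the usual report, now on the reduced data set */
proc freq data=pl_busy order=freq;
  tables clinic*degree;
run;
```

By default PROC FREQ leaves missing values of clinic out of the OUT= data set, so rows with a missing clinic would be dropped in step 2 as well.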