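When you use a format with PROC FREQ, it collapses the levels based on the formatted value, but the value it actually stores in the output dataset is one of the raw values from the data (here 'C' for the combined C/X group).
If you don't want that behavior, use the PUT() function to convert the values into a new variable.
It might be faster to convert AFTER collapsing with PROC FREQ, since PUT() then runs once per output row instead of once per input row.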
proc format;
value $site_desc
'C','X' = 'X'
;
run;
data have ;
input bygrp $ site $ ;
cards;
1 A
1 C
1 X
2 C
3 X
4 A
;
proc freq data=have;
by bygrp;
tables site / out=want noprint;
format site $site_desc. ;
run;
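* Printed with the format still attached so site shows the collapsed value;
proc print data=want;
run;
* Formats removed so site shows the raw value that PROC FREQ stored;
proc print data=want;
format _all_;
run;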
data want2;
set want ;
new_site = put(site,$site_desc.);
format _all_;
run;
proc print data=want2;
run;
Result of the last PROC PRINT (WANT2):

Obs    bygrp    site    COUNT    PERCENT    new_site
 1       1       A        1       33.333       A
 2       1       C        2       66.667       X
 3       2       C        1      100.000       X
 4       3       X        1      100.000       X
 5       4       A        1      100.000       A
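
For comparison, here is a minimal sketch of the other order: convert with PUT() first, then let PROC FREQ count the new variable. It assumes the HAVE dataset and the $SITE_DESC format defined above; the names HAVE2 and WANT3 are made up for illustration.

```sas
/* Sketch only: HAVE2 and WANT3 are hypothetical names.            */
/* Assumes HAVE and the $SITE_DESC format from the code above.     */
data have2;
  set have;
  /* Store the collapsed value itself in a new character variable */
  new_site = put(site, $site_desc.);
run;

/* HAVE is already in BYGRP order, so no PROC SORT is needed here */
proc freq data=have2;
  by bygrp;
  tables new_site / out=want3 noprint;
run;

proc print data=want3;
run;
```

Because NEW_SITE already holds the collapsed value, the output dataset stores X rather than C for the combined group, at the cost of running PUT() on every input row.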