Hello,
I need some help summarizing a relatively small dataset by the state code that is in the data. I have locations both inside and outside of the U.S. and I'd like to summarize this so that I can see how many records are in the U.S. and how many are in different countries.
Sample:
ID | STATE |
1 | CA |
2 | FC |
3 | |
4 | GB |
5 | AB |
6 | FL |
7 | AB |
I'd like to have something output that shows the state and the frequency, but ONLY if the frequency is over 25.
Then I'd like to have it grouped by in U.S. vs. Outside of U.S. Is there a way to do this without individually calling out every state/territory in a data step?
PROC FREQ with FORMAT.
Create a user format that maps each state to US and everything else to a different code. Then when you do the PROC FREQ you can apply that format and have it displayed as desired.
https://github.com/statgeek/SAS-Tutorials/blob/master/proc_format_example.sas
EDIT: If you're having trouble creating the look up table, SASHELP/SASMAPS or MAPSGSFK may have the data you need and you can create a format from one of those data sets.
Create a format to define which states are internal and which external:
proc format lib=work;
value $statetyp
'CA' = 'EX'
'FC' = 'US'
'GB' = 'EX'
... /* correct mistakes */
otherwise = 'er' /* as error to be checked */
; run;
then summarize using the format:
proc means data=have;
class state;
format state statetyp. ;
var <list of variables to sum> ;
output out=want sum=;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.