I have a data set and I have 36 counties and 15 CCOs. Would like to use proc freq efficiently. I want a count of covid contacts by county and for the requisite CCO (Coordinated Care Organization). Several counties utilize the same CCO so that is why there are fewer CCOs than counties. This is how I renamed the generic CCO1.....CCO15 fields
Proc Sql noprint; CREATE TABLE SASCDC_2.Arias_skinny_dataset_for_viz AS SELECT DISTINCT Contact_Person_ID, Monitoring_Status, Status, ZIP_Code, County_, Designation, CCO_1 AS Advanced_Health label='Advanced_Health', CCO_2 AS Allcare_CCO label='Allcare_CCO', CCO_3 AS Cascade_Health_Alliance label='Cascade_Health_Alliance', CCO_4 AS Columbia_Pacific_CCO label='Columbia_Pacific_CCO', CCO_5 AS EOCCO_ label='EOCCO_', CCO_6 AS Health_Share_Oregon label='Health_Share_Oregon', CCO_7 AS Intercommunity_Health_Network label='Intercommunity_Health_Network', CCO_8 AS Jackson_Care_Connect label='Jackson_Care_Connect', CCO_9 AS Pac_Source_Central_OR label='Pac_Source_Central_OR', CCO_10 AS Pac_Source_Columbia_Gorge label='Pac_Source_Columbia_Gorge', CCO_11 AS Pac_Source_Lane label='Pac_Source_Lane', CCO_12 AS Pac_Source_Marion_Polk label='Pac_Source_Marion_Polk', CCO_13 AS Trillium_Comm_Health_Plan label='Trillium_Comm_Health_Plan', CCO_14 AS Umpqua_Health_Alliance label='Umpqua_Health_Alliance', CCO_15 AS Yamhill_Comm_Care_Org label='Yamhill_Comm_Care_Org' FROM SASCDC_2.Arias_contact_zip_cco_zip WHERE Monitoring_Status ne "Marked for deduplication"; quit;
So the proc freq looks like this (with only a few CCos listed)
proc freq Data = SASCDC_2.Arias_skinny_dataset_for_viz order=data;; Tables County_ * Advanced_Health Allcare_CCO Cascade_Health_Alliance Columbia_Pacific_CCO EOCCO_; run;
But the problem is I don't want a table for each CCO. I want the count of contacts occurring in a specific county:CCO cell. So ideally I would like my resulting table to look like (illustrative)
Advanced Health CCO …………………. Health Share Oregon …. Yamhill Comm Health
Baker 23 0 0
Hood River 0 100 0
Multnomah 0 1200 0
Polk 0 0 200
….
Washington 0 1550 0
How can I create that table using proc freq? Or can I. This table is a beta version so I don't want to spent a lot of time
(this time around - like proc report or tabulate). This is to give leadership a new slice of information and to assess how much
value there is.
Thank you.
wlierman
Text without program means I don't have your data. Only enough records to get two or three of your "cco" would be needed. I am guessing that your data continues to have the text of the name. So perhaps:
Proc Sql noprint; CREATE TABLE SASCDC_2.Arias_skinny_dataset_for_viz AS SELECT DISTINCT Contact_Person_ID, Monitoring_Status, Status, ZIP_Code, County_, Designation, Coalesce (CCO_1,CCO_2,CCO_3,CCO_4,CCO_5,CCO_6,CCO_7 ,CCO_8,CCO_9,CCO_10,CCO_11,CCO_12,CCO_13 ,CCO_14,CCO_15) as CCO FROM SASCDC_2.Arias_contact_zip_cco_zip WHERE Monitoring_Status ne "Marked for deduplication"; quit; proc report data=SASCDC_2.Arias_skinny_dataset_for_viz ; columns county_ CC0; define county_ / group; define cco / across; run;
Will give a chance to see what I mean. You could add a nice label for the CCO simply by adding a quoted string after the across such as "CCO Name" or similar.
There are a number of approaches to addressing the way missing values in the body will appear. Easiest might be to add the line: Options missing='0'; just before the proc report and then reset the missing charact to . with Options missing='.'; after. Or if you prefer a blank instead of zero Options missing=' ';
We are getting a default N statistic with this approach.
Some example data would be nice.
One issue is that your data is poorly structured. You would be better off with a single variable like 'CCO' with a VALUE of the text, or similar, that you are assigning to variable names and a value like "count" (may not be needed if you have one record per contact). It is often a bad idea to store actual data in the name of a variable.
Data structured as I suggest would be easy with either Proc Report or Proc Tabulate.
Maybe something like:
Proc report data=SASCDC_2.Arias_skinny_dataset_for_viz;
columns county Advanced_health Allcare_cco <list all those cco>;
define county /group;
run;
If those CCO are numeric values then you will get a sum per county.
If they are character and one per conact then you will get a count. If something else, then we need to know more of what your actual data looks like.
Thank you for the reply. I think I know what you mean. If I had done something like go by field name for each CCO and substitute a 1 each time the name appeared then I could use your suggestion with Proc report. Would that have worked?
I want to be sure that I understand your technique and suggestion, since this is going to come up again.
I have attached 15K rows of a csv file that contains the dataset that I am working with. I would appreciate seeing your steps to clean the data to get it ready for the Proc report code that you suggest.
Thank you for your help.
wlierman
Text without program means I don't have your data. Only enough records to get two or three of your "cco" would be needed. I am guessing that your data continues to have the text of the name. So perhaps:
Proc Sql noprint; CREATE TABLE SASCDC_2.Arias_skinny_dataset_for_viz AS SELECT DISTINCT Contact_Person_ID, Monitoring_Status, Status, ZIP_Code, County_, Designation, Coalesce (CCO_1,CCO_2,CCO_3,CCO_4,CCO_5,CCO_6,CCO_7 ,CCO_8,CCO_9,CCO_10,CCO_11,CCO_12,CCO_13 ,CCO_14,CCO_15) as CCO FROM SASCDC_2.Arias_contact_zip_cco_zip WHERE Monitoring_Status ne "Marked for deduplication"; quit; proc report data=SASCDC_2.Arias_skinny_dataset_for_viz ; columns county_ CC0; define county_ / group; define cco / across; run;
Will give a chance to see what I mean. You could add a nice label for the CCO simply by adding a quoted string after the across such as "CCO Name" or similar.
There are a number of approaches to addressing the way missing values in the body will appear. Easiest might be to add the line: Options missing='0'; just before the proc report and then reset the missing charact to . with Options missing='.'; after. Or if you prefer a blank instead of zero Options missing=' ';
We are getting a default N statistic with this approach.
Thank you for the guidance.
The coalesce function did work. I checked everything and the CCO service areas mapped correctly to county and zipcodes.
Again thank you both ballardw and Reeza for your help.
wlierman
@wklierman wrote:
Thank you for the guidance.
The coalesce function did work. I checked everything and the CCO service areas mapped correctly to county and zipcodes.
Again thank you both ballardw and Reeza for your help.
wlierman
The Coalesce function returns the First value of the list that is populated. You may want to double check to see if the result is correct for any record that had 2 CCO.
My data I would probably have transposed to have one record with the text value as CCO originally. Which would then get a more accurate count if there are two or more CCO per record.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.