BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
wklierman
Obsidian | Level 7

I have a data set and I have 36 counties and 15 CCOs.  Would like to use proc freq efficiently.  I want a count of covid contacts by county and for the requisite CCO (Coordinated Care Organization).  Several counties utilize the same CCO so that is why there are fewer CCOs than counties. This is how I renamed the generic CCO1.....CCO15 fields

Proc Sql noprint;
CREATE TABLE SASCDC_2.Arias_skinny_dataset_for_viz AS
    SELECT DISTINCT Contact_Person_ID,
	                Monitoring_Status,
					Status,
					ZIP_Code,
					County_,
					Designation,
					CCO_1 AS Advanced_Health label='Advanced_Health',
					CCO_2 AS Allcare_CCO label='Allcare_CCO',
					CCO_3 AS Cascade_Health_Alliance label='Cascade_Health_Alliance',
					CCO_4 AS Columbia_Pacific_CCO label='Columbia_Pacific_CCO',
					CCO_5 AS EOCCO_ label='EOCCO_',
                    CCO_6 AS Health_Share_Oregon label='Health_Share_Oregon',
					CCO_7 AS Intercommunity_Health_Network label='Intercommunity_Health_Network',
					CCO_8 AS Jackson_Care_Connect label='Jackson_Care_Connect',
					CCO_9 AS Pac_Source_Central_OR label='Pac_Source_Central_OR',
					CCO_10 AS Pac_Source_Columbia_Gorge label='Pac_Source_Columbia_Gorge',
					CCO_11 AS Pac_Source_Lane label='Pac_Source_Lane',
					CCO_12 AS Pac_Source_Marion_Polk label='Pac_Source_Marion_Polk',
					CCO_13 AS Trillium_Comm_Health_Plan label='Trillium_Comm_Health_Plan',
					CCO_14 AS Umpqua_Health_Alliance label='Umpqua_Health_Alliance',
					CCO_15 AS Yamhill_Comm_Care_Org label='Yamhill_Comm_Care_Org'
	FROM SASCDC_2.Arias_contact_zip_cco_zip
	WHERE Monitoring_Status ne "Marked for deduplication";
quit;

So the proc freq looks like this (with only a few CCos listed)

 

proc freq  Data = SASCDC_2.Arias_skinny_dataset_for_viz order=data;;
   Tables County_ * Advanced_Health Allcare_CCO Cascade_Health_Alliance      Columbia_Pacific_CCO EOCCO_;
run;

But the problem is I don't want a table for each CCO.  I want the count of contacts occurring in a specific county:CCO cell.  So ideally I would like my resulting table to look like (illustrative)

 

                                     Advanced Health CCO …………………. Health Share Oregon ….     Yamhill Comm Health

Baker                                     23                                                               0                                                     0

Hood River                              0                                                            100                                                    0

Multnomah                              0                                                          1200                                                    0

Polk                                         0                                                                0                                                200                          

 

….

 

Washington                            0                                                           1550                                                     0

 

 

How can I create that table using proc freq?  Or can I.  This table is a beta version so I don't want to spent a lot of time

(this time around - like proc report or tabulate).  This is to give leadership a new slice of information and to assess how much

value there is.

 

Thank you.

 

wlierman

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Text without program means I don't have your data. Only enough records to get two or three of your "cco" would be needed. I am guessing that your data continues to have the text of the name. So perhaps:

 

Proc Sql noprint;
CREATE TABLE SASCDC_2.Arias_skinny_dataset_for_viz AS
    SELECT DISTINCT Contact_Person_ID,
	                Monitoring_Status,
					Status,
					ZIP_Code,
					County_,
					Designation,
					Coalesce (CCO_1,CCO_2,CCO_3,CCO_4,CCO_5,CCO_6,CCO_7
                         ,CCO_8,CCO_9,CCO_10,CCO_11,CCO_12,CCO_13
                         ,CCO_14,CCO_15) as CCO
	FROM SASCDC_2.Arias_contact_zip_cco_zip
	WHERE Monitoring_Status ne "Marked for deduplication";
quit;

proc report data=SASCDC_2.Arias_skinny_dataset_for_viz ;
   columns county_ CC0;
   define county_ / group;
   define cco / across;
run;

Will give a chance to see what I mean. You could add a nice label for the CCO simply by adding a quoted string after the across such as "CCO Name" or similar.

 

There are a number of approaches to addressing the way missing values in the body will appear. Easiest might be to add the line: Options missing='0'; just before the proc report and then reset the missing charact to . with Options missing='.'; after. Or if you prefer a blank instead of zero Options missing=' ';

 

We are getting a default N statistic with this approach.

View solution in original post

8 REPLIES 8
ballardw
Super User

Some example data would be nice.

 

One issue is that your data is poorly structured. You would be better off with a single variable like 'CCO' with a VALUE of the text, or similar, that you are assigning to variable names and a value like "count" (may not be needed if you have one record per contact). It is often a bad idea to store actual data in the name of a variable.

 

Data structured as I suggest would be easy with either Proc Report or Proc Tabulate.

Maybe something like:

 

Proc report data=SASCDC_2.Arias_skinny_dataset_for_viz;
columns county Advanced_health Allcare_cco <list all those cco>;
define county /group;
run;

 If those CCO are numeric values then you will get a sum per county.

If they are character and one per conact then you will get a count. If something else, then we need to know more of what your actual data looks like.

wklierman
Obsidian | Level 7

Thank you for the reply.  I think I know what you mean.  If I had done something like go by field name for each CCO and substitute a 1 each time the name appeared then I could use your suggestion with Proc report. Would that have worked?

 

I want to be sure that I understand your technique and suggestion, since this is going to come up again.

I have attached 15K rows of a csv file that contains the dataset that I am working with.  I would appreciate seeing your steps to clean the data to get it ready for the Proc report code that you suggest.

 

Thank you for your help.

 

wlierman

ballardw
Super User

Text without program means I don't have your data. Only enough records to get two or three of your "cco" would be needed. I am guessing that your data continues to have the text of the name. So perhaps:

 

Proc Sql noprint;
CREATE TABLE SASCDC_2.Arias_skinny_dataset_for_viz AS
    SELECT DISTINCT Contact_Person_ID,
	                Monitoring_Status,
					Status,
					ZIP_Code,
					County_,
					Designation,
					Coalesce (CCO_1,CCO_2,CCO_3,CCO_4,CCO_5,CCO_6,CCO_7
                         ,CCO_8,CCO_9,CCO_10,CCO_11,CCO_12,CCO_13
                         ,CCO_14,CCO_15) as CCO
	FROM SASCDC_2.Arias_contact_zip_cco_zip
	WHERE Monitoring_Status ne "Marked for deduplication";
quit;

proc report data=SASCDC_2.Arias_skinny_dataset_for_viz ;
   columns county_ CC0;
   define county_ / group;
   define cco / across;
run;

Will give a chance to see what I mean. You could add a nice label for the CCO simply by adding a quoted string after the across such as "CCO Name" or similar.

 

There are a number of approaches to addressing the way missing values in the body will appear. Easiest might be to add the line: Options missing='0'; just before the proc report and then reset the missing charact to . with Options missing='.'; after. Or if you prefer a blank instead of zero Options missing=' ';

 

We are getting a default N statistic with this approach.

Reeza
Super User
COALESCE() won't work as it seems there can be multiple CCO per record. It needs to be a transpose and then sql or freq.
wklierman
Obsidian | Level 7

Thank you for the guidance.

 

The coalesce function did work.  I checked everything and the CCO service areas mapped correctly to county and zipcodes.

 

Again thank you both ballardw and Reeza for your help.

 

wlierman

ballardw
Super User

@wklierman wrote:

Thank you for the guidance.

 

The coalesce function did work.  I checked everything and the CCO service areas mapped correctly to county and zipcodes.

 

Again thank you both ballardw and Reeza for your help.

 

wlierman


The Coalesce function returns the First value of the list that is populated. You may want to double check to see if the result is correct for any record that had 2 CCO.

My data I would probably have transposed to have one record with the text value as CCO originally. Which would then get a more accurate count if there are two or more CCO per record.

Reeza
Super User
It will give you the answer if you only want to count the first CCO in each county. Try using CATX() instead and see if the results differ.

CREATE TABLE SASCDC_2.Arias_skinny_dataset_for_viz AS
SELECT DISTINCT Contact_Person_ID,
Monitoring_Status,
Status,
ZIP_Code,
County_,
Designation,
catx(" - ", CCO_1,CCO_2,CCO_3,CCO_4,CCO_5,CCO_6,CCO_7
,CCO_8,CCO_9,CCO_10,CCO_11,CCO_12,CCO_13
,CCO_14,CCO_15) as CCO
FROM SASCDC_2.Arias_contact_zip_cco_zip
WHERE Monitoring_Status ne "Marked for deduplication";
quit;
Reeza
Super User
Agreed with BallardW - sample data that shows your structure and expected output would be helpful and this is easier to do if you restructure your data. The benefit of that is that it will become a fully dynamic solution so even if you add CCO or counties it should still work.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 913 views
  • 4 likes
  • 3 in conversation