Solved: Re: Can you coalesce variable values across two different fields

wlierman · Posted 06-17-2021 05:48 PM

I have fifteen fields each representing a CCO (Medicaid health provider) something like this (the letters indicate that there is a name of the CCO showing up in that field. The coalesce column (Coalesce have) shows what values are shown. It is okay but in row 4 and 6 two CCOs are displayed. The reason is that the CCOs offer services across the same zipcodes (splitting the zipcode into respective service areas).

Is there a coding approach that will still perform the coalesce (like in column Coalesce have) but for rows with two CCOs show both names in the coalesce field (like in column Coalesce want)? Then when I do a proc freq by contacts and CCO I would like the double CCOs to show in the freq as C | D and H | I.

Thank you for your help again.

wklierman

Coalesce Coalesce

CCO1 CCO2 CCO3 . . . CCO9 CCO10 . . . CCO14 CCO15 have want

A A A

B B B

G G G

C D C C | D

J J J

G G G

H I H H | I

B B B

Reeza · Posted 06-17-2021 05:50 PM

CATX().
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lefunctionsref/n0p7wxtk0hvn83n1pveisbcp2ae9.h...

want = CATX("|", of CCO1-CCO15);

View solution in original post

Reeza · Posted 06-17-2021 05:50 PM

CATX().
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lefunctionsref/n0p7wxtk0hvn83n1pveisbcp2ae9.h...

want = CATX("|", of CCO1-CCO15);

wlierman · Posted 06-17-2021 07:13 PM

Thank you. The pipe was only an example. But I want the coalesce as here, but only in the few cases where there is a value in two CCOs for a particular contact would I like the "combined" coalesce value. Thank you

Proc Sql noprint;
CREATE TABLE SASCDC_2.Arias_skinny_contact_for_CCO  AS
   SELECT DISTINCT Contact_Person_ID AS Contact_Person_ID label='Contact_Person_ID',
                  /* Monitoring_Status,
				              Status,
							 County_1,
							 ZIP_Code,
						 Jurisdiction,
						  Designation,*/
      Coalesce (CCO_1, CCO_2, CCO_3, CCO_4, CCO_5, CCO_6, CCO_7,
                CCO_8, CCO_9, CCO_10, CCO_11, CCO_12, CCO_13,
			    CCO_14, CCO_15) AS CCO
   FROM SASCDC_2.Arias_contact_zip_cco_zip
   Order By Contact_Person_ID;
quit;

ballardw · Posted 06-17-2021 07:29 PM

@wlierman wrote:

Thank you. The pipe was only an example. But I want the coalesce as here, but only in the few cases where there is a value in two CCOs for a particular contact would I like the "combined" coalesce value. Thank you

What in the data tells us when you might want 2 (or more) values concatenated? Which "few cases"?

Sounds like a CASE statement would be required to do the conditional coalescec / catx.

Your example data strongly implies that most of those CCO variables are missing and you were only selecting the ones with values. If that is not the case then your example data should represent your data more clearly. Or if the case is such that the data actually is mostly missing the CATX function will return the only populated variable value. CATX strips trailing blanks from value for the concatenation, so if a variable is missing then it is completely stripped out. The delimited used only ever appears if there are two or more values (of if the single value of the variable happens to be the delimiter chosen).

wlierman · Posted 06-18-2021 12:50 AM

There are 165,000 contacts and most have one CCO that they are assigned to. There are a few (say 100) that have 2 CCOs because as I mentioned because a zipcode will be split between the CCOs. So of the 165K about 135K have a CCO - zipcode or county is missing. This is survey data so much of the data whether REALD or demographic will have some missing or incomplete responses.
So my question is if I use the Coalesce statement in the query do I code a subsetting "if"condition in a data step before the query? I want a condition that catches the 100 or so obs (contacts) that happen to have two CCOs.

Thank you for your help.

wlierman · Posted 06-18-2021 12:53 AM

That should read of 165K overe 135K have a CCO.

ballardw · Posted 06-18-2021 09:25 AM

@wlierman wrote:
There are 165,000 contacts and most have one CCO that they are assigned to. There are a few (say 100) that have 2 CCOs because as I mentioned because a zipcode will be split between the CCOs. So of the 165K about 135K have a CCO - zipcode or county is missing. This is survey data so much of the data whether REALD or demographic will have some missing or incomplete responses.
So my question is if I use the Coalesce statement in the query do I code a subsetting "if"condition in a data step before the query? I want a condition that catches the 100 or so obs (contacts) that happen to have two CCOs.

Thank you for your help.

Still have described one or more rules that tell us which "contact" might need the 2 CCO. Your example data does not show Zipcodes at all. And now you need to define exactly what is meant by "Zipcode will be split between CCO". Especially since you say Zipcdoes may be missing for up to 75% of the records. That tends to make it a poor decision variable.

Coalesce will never return two values. It returns the first non-missing in a list of values. So I don't get why you keep saying coalesce. The CATX function will do that. Have you tested it? You would need to provide more complete data as to why it is not appropriate.

wlierman · Posted 06-18-2021 09:24 AM

With a little bit of follow-up, the catx does the trick.

Thanks again.

wklierman

himself · Posted 06-17-2021 10:57 PM

I think you can have

If cmiss(cc:,have) gt 2 then want= catx("|", of cc:);
else want=coalesce(of cc:);

ChrisNZ · Posted 06-18-2021 01:43 AM

I don't understand how @Reeza 's suggestion of using catx() does not answer your need.

High-Performance SAS Coding - Third Edition

wlierman · Posted 06-18-2021 01:47 AM

My question is does that code go into the query or a data step before the query?

Classroom Training Available!