Data visualization with SAS programming

Contiguous CBSA Identification

Accepted Solution Solved
Reply
Contributor
Posts: 39
Accepted Solution

Contiguous CBSA Identification

[ Edited ]

Hello,

 

I am working on a map of the United States with CBSA lines set in SAS EG (I think v9?). This is really my first time trying to tweak with the code other than changing patterns/colors/titles etc. so I don't have much experience to work off of. Basically, I am trying to use the CBSA codes I have in combination with the CBSA locations that are drawn on the map to make a new dataset that has each CBSA code and ALL of the CBSAs that it is contiguous to (borders touch) in a horizontal dataset.

 

Example scenario. There are four CBSAs, 10000,10001,10002,10003. 10003 touches all the others, 10002 touches 10003 and 10001, 10000 only touches 10003. From this, I want a dataset that looks like this:

 

MAIN_CBSA      CONT_CBSA1   CONT_CBSA2   CONT_CBSA3

10000                 10003

10001                  10002                  10003

10002                  10001                  10003

10003                  10000                  10001                   10002

 

 

The on idea I had was to take the coordinates dataset I have and try to take observations that have the same x and y values and write wide the cbsa code for each but I don't know how to do this.

 

data that I have:

 

x          y             cbsa_name              cbsa_code

1          2               New York                 10001

1          2               Brooklyn                   10002

 

data I want:

x          y            cbsa_name               cbsa_code                 cont_cbsa1

1          2              New York                   10001                        10002

1          2              Brooklyn                     10002                        10001

 

 

I just don't know how to generate variables for each observation based off of the x and y.

 

The order of the cont_cbsa* variable doesn't matter. I just need them all in a row, because eventually I will be pulling a count of patients (for medicare) across each of the CBSAs and want to know whether a sum of the main_cbsa and any of the others equals 500+. If this isn't doable that's totally fine I just thought I'd give it a shot.

 

Thanks


Accepted Solutions
Solution
‎06-23-2016 09:27 AM
Contributor
Posts: 39

Re: Contiguous CBSA Identification

I think I basically figured it out actually. The x and y are the coordinate system used for each boundary. I just that in every case when a boundary shares two points the cbsas are contiguous.

 

data boundaries2;set boundaries;

keep x y cbsa_cbsa;

run;

proc sql; create table boundary_test as

select distinct a.cbsa_cbsa,b.cbsa_cbsa as cbsa2

from boundaries2 a

left outer join boundaries2 b on a.x=b.x and a.y=b.y;

quit;

proc transpose data=boundary_test prefix=cont_cbsa;

by cbsa_cbsa;

var cbsa2;

run;

View solution in original post


All Replies
Grand Advisor
Posts: 9,719

Re: Contiguous CBSA Identification

What will you actually do with the resulting data set in more detal? The type of structure you are proposing is often much harder to work with in the long run.

 

It sounds like you are looking to create "groups" but your "results" makes two different groups by name with the same components and that usually doesn't make sense.

 

Can you provide some dummy data of both inputs and the desired result?

 

 

Solution
‎06-23-2016 09:27 AM
Contributor
Posts: 39

Re: Contiguous CBSA Identification

I think I basically figured it out actually. The x and y are the coordinate system used for each boundary. I just that in every case when a boundary shares two points the cbsas are contiguous.

 

data boundaries2;set boundaries;

keep x y cbsa_cbsa;

run;

proc sql; create table boundary_test as

select distinct a.cbsa_cbsa,b.cbsa_cbsa as cbsa2

from boundaries2 a

left outer join boundaries2 b on a.x=b.x and a.y=b.y;

quit;

proc transpose data=boundary_test prefix=cont_cbsa;

by cbsa_cbsa;

var cbsa2;

run;

Grand Advisor
Posts: 9,719

Re: Contiguous CBSA Identification

If you are looking to combine these you could add a region (or similar idea) to your boundary data (make a NEW set in case of errors) based on the values of your CBSA.

 

The Proc Gremove can be used to remove internal boundaries and just show the region.

 

Be very careful processing map boundary data sets with SQL. The results often are not in the order required to draw a map afterward.

Post a Question
Discussion Stats
  • 3 replies
  • 173 views
  • 0 likes
  • 2 in conversation