
improve on a SAMPLE

Valued Guide
Posts: 2,177


-----> just for fun.


A really useful resource provided (and fully funded) by SAS Institute is the SAS Samples library.

One of the great ways to learn new ideas is to watch the RSS feed for interesting topics.

At the top of the Samples RSS feed I spotted:

Sample 22872: Determine which counties border a specific county

As with all Samples this is a tabbed webpage with the FullCode tab revealing code you can run yourself to try it out.

-----> just for fun.

It is a personal challenge of mine to see if I can add some value with an alternative approach, and this time I think I might have.

 

Rather than 6 steps and a proc print, here is one SQL step implementing the same search algorithm described in the SAMPLE

"Assuming that each bordering polygon shares at least one common point, determine the neighboring counties for a given county"

proc sql ;
create table adjacent_counties as
select distinct
        a.state     as state1
      , b.state     as state2
      , a.county    as county1
      , b.county    as county2
      , c.statename as stateN1
      , d.statename as stateN2
      , c.countynm  as countyN1
      , d.countynm  as countyN2
  from MAPS.USCOUNTY a
  join MAPS.USCOUNTY b              /* pair up points with identical coordinates */
    on a.x = b.x
   and a.y = b.y
  join ( select distinct statename, countynm, county, state
           from maps.cntyname)  c   /* names for the first county */
    on a.state   = c.state
   and a.county  = c.county
  join ( select distinct statename, countynm, county, state
           from maps.cntyname)  d   /* names for the second county */
    on b.state   = d.state
   and b.county  = d.county
 /* county codes repeat from state to state, so compare state and county together; */
 /* testing county alone would drop cross-border neighbours with the same code     */
 where a.state  ne b.state
    or a.county ne b.county
      ;
quit ;


With no STATE filters, it produces a table of about 18K rows of adjacent counties in about half a second.
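To get back to the Sample's original question (the neighbours of one specific county), the table above can simply be filtered. This is only a sketch: the FIPS codes used here (37 for North Carolina, 183 for Wake County) are my choice of example, not something taken from the Sample.

```sas
/* Sketch: list the neighbours of a single county from ADJACENT_COUNTIES. */
/* 37 / 183 are assumed example FIPS codes (North Carolina / Wake County). */
proc sql ;
  select county2
       , countyN2
       , stateN2
    from adjacent_counties
   where state1  = 37
     and county1 = 183
   order by stateN2, countyN2
  ;
quit ;
```

Because the table holds every pair in both directions, filtering on state1/county1 alone is enough.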

However, if we seek the regions which might not share boundary points but are still adjacent or overlapping, how much more complex does the query become?

I'm struggling with the coordinate maths for the polygons.

It is easy enough for a data step to provide a table with start and endpoint for each line in the polygons of the map.  Point X0/Y0 to point X/Y.

This code generates that set of line segments:

data stepf ;

keep   state county segment spot x0 x y0 y ;

retain state county segment spot x0 y0  ; ** just to control VAR order ;

do until( last.state) ;

do until( last.county ) ;

call missing(x0, y0 ) ;  ** start point ;

do spot = 0 by 1 until( last.segment ) ;

  set maps.uscounty end= eof ;

  by state county segment notsorted ;

* where state in(11,24) ;

  if spot then output ;

  else do ; * save starting point ;

           xs = x ;

           ys = y ;

       end ;

  x0 = x;

  y0 = y;

end;

  x = xs ; *** final point on segment connects to first;

  y = ys ;

  output ;

end;end;

if eof then stop ;

run;

But I'm struggling to define cross-over or collinearity between segment lines of adjacent counties in terms of X, Y, x0, y0.
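One possible way to express this, offered as a sketch rather than a worked-out solution, is the usual cross-product (orientation) test. For segment A from (ax0,ay0) to (ax,ay) and segment B from (bx0,by0) to (bx,by), the sign of a cross product says on which side of one line an endpoint of the other lies. The variable names, the tolerance EPS and the four test rows below are all assumptions made up for illustration; a real run would pair rows of STEPF for two different counties.

```sas
/* Sketch only: classify how two line segments relate, using cross products. */
/* Segment A runs (ax0,ay0)->(ax,ay); segment B runs (bx0,by0)->(bx,by).     */
/* EPS is an assumed tolerance and would need tuning to the map units.       */
data segment_test ;
  length relation $12 ;
  input ax0 ay0 ax ay bx0 by0 bx by ;
  eps = 1e-12 ;

  /* side of line A on which each end of B lies, and vice versa */
  d1 = (ax - ax0)*(by0 - ay0) - (ay - ay0)*(bx0 - ax0) ;
  d2 = (ax - ax0)*(by  - ay0) - (ay - ay0)*(bx  - ax0) ;
  d3 = (bx - bx0)*(ay0 - by0) - (by - by0)*(ax0 - bx0) ;
  d4 = (bx - bx0)*(ay  - by0) - (by - by0)*(ax  - bx0) ;

  if abs(d1) <= eps and abs(d2) <= eps then do ;
    /* same line: the segments overlap when their ranges overlap on both axes */
    if  max(min(ax0,ax), min(bx0,bx)) <= min(max(ax0,ax), max(bx0,bx))
    and max(min(ay0,ay), min(by0,by)) <= min(max(ay0,ay), max(by0,by))
      then relation = 'overlap' ;
      else relation = 'collinear' ;
  end ;
  /* the ends of each segment fall on opposite sides of (or on) the other */
  else if d1*d2 <= 0 and d3*d4 <= 0 then relation = 'cross' ;
  else relation = 'apart' ;
  datalines ;
0 0 10 0   3 0 7 0
0 0 10 0   5 -1 5 1
0 0 10 0   0 1 10 1
0 0 4 0    6 0 9 0
;
run ;
```

The first test row is a horizontal segment lying inside a longer one with no shared end points, which the point-matching join would miss. Zero-length segments are not handled here.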

-----> just for fun.

peterC

SAS Employee
Posts: 982

Re: improve on a SAMPLE

That appears to be an efficient, concise approach.

I took the SQL approach too.  But, similar to the SAS tech support example, I broke it up into several pieces.

I find it easier to troubleshoot, and easier for other people to understand/modify/enhance/maintain that way.

My example finds the counties adjacent to the county a given city is in, and also displays them on a map...

Counties bordering Cary, NC (SAS/Graph gmap) <-- sample output

http://robslink.com/SAS/democd26/adjacent_info.htm <-- link to the code

(I don't have an answer for finding "adjacent" counties that don't share common border points.)

Valued Guide
Posts: 2,177

Re: improve on a SAMPLE

Posted in reply to RobertAllison_SAS

Thank you Rob

Rounding by converting to a string is an interesting approach to matching points that are merely close. I'm not sure how much distance is implied beyond the fifth decimal place, but it can't be much.
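For comparison, here is a minimal sketch of the same idea using the ROUND function rather than the string conversion in Rob's code. The 1e-5 grid and the dataset and variable names (COUNTY_POINTS, XR, YR, ADJACENT_ROUNDED) are assumptions for illustration only.

```sas
/* Sketch: snap coordinates to a grid before matching points.           */
/* ROUND() stands in for the string conversion; 1e-5 is an assumed grid. */
data county_points ;
  /* skip any missing coordinates so they cannot match each other */
  set maps.uscounty(keep=state county x y where=(x ne . and y ne .)) ;
  xr = round(x, 1e-5) ;
  yr = round(y, 1e-5) ;
run ;

proc sql ;
  create table adjacent_rounded as
  select distinct
          a.state  as state1
        , a.county as county1
        , b.state  as state2
        , b.county as county2
    from county_points a
    join county_points b
      on a.xr = b.xr
     and a.yr = b.yr
   where a.state  ne b.state
      or a.county ne b.county
  ;
quit ;
```

Two points that sit very close together but on opposite sides of a rounding boundary will still fail to match, so this only narrows the gap rather than closing it.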

Line approach and intersection and overlap problems:

1

For example, with no common map points, the lines A (Ax1,Ay1 to Ax2,Ay2) and B (Bx1,By1 to Bx2,By2) can still overlap along a horizontal line, indicating a shared boundary:

..................Ax1...........Bx1......................Bx2....................Ax2.........  (the Ay and By are not shown because for this simple example all are equal.)

Because no points match, such a pair might not feature among any of our adjacent counties.

2

Unless the polygons of the maps are created like a honeycomb, with all adjacent boundaries sharing points, I expect there will be places where polygons intersect or cross over, at corners for example. Would the mapping procedures reject these maps as invalid?


Peter

SAS Employee
Posts: 982

Re: improve on a SAMPLE

Some mapping & GIS software cares whether the map areas are "topologically correct" (no overlapping polygons, shared borders that use exactly the same points, etc.).

But SAS/Graph Proc Gmap does not care, and draws whatever you give it (I sometimes create overlapping polygons intentionally, for special effects, or when I'm lazy).
