BookmarkSubscribeRSS Feed
tc
Lapis Lazuli | Level 10 tc
Lapis Lazuli | Level 10

The Largest Employer in Every StateThe Largest Employer in Every State

 

Leveraged some old code and scraped some data from a 24/7 Wall St. story that ran earlier this year to cobble together a hexagon tile map of the largest employers in each state. Happy Labor Day, all!

 

*  Fun With SAS ODS Graphics, Labor Day Edition: The Largest Employer in Every State
   Data source: The Largest Employer in Every State 247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/;

*==> Scrape state info from HTML for pages 2-12 of 24/7 Wall St. story by Grant Suneson;
  
filename in url 'https://247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/' lrecl=1000;
data states;
length fvi state employer employeesx industry $ 255;
do pg=2 to 12;                                                    * Data starts on page 2 of 12;
  fvi='https://247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/'||compress(put(pg,2.))||'/';
  do while(not eof); 
    infile in url filevar=fvi end=eof;                            * Use file variable for URL (vary page from 2 to 12);
    input;                                                        * Grab state name?;
    i=index(_infile_,'<p><span class="title-bullet"><strong>');   
    if i>0 then do; 
      state=strip(scan(substr(_infile_,39),1,'<'));
      input;                                                      * Grab largest employer;
      i=index(_infile_,'<span class="value-bullet"><strong>> Largest employer:</strong>');
      employer=strip(scan(substr(_infile_,64),1,'<'));
      input;                                                      * Grab # employees;
      i=index(_infile_,'<span class="value-bullet"><strong>> Employees:</strong>');
      employeesX=strip(scan(substr(_infile_,57),1,'<'));
      employees=input(compress(employeesX,' ,'),15.);
      input;                                                      * Grab industry;
      i=index(_infile_,'<span class="value-bullet"><strong>> Industry:</strong>');
      industry=strip(scan(substr(_infile_,56),1,'<')); 
      output;
      end;
  end;
  eof=0;                                                          * Reset EOF for next file;
end;
stop;                                                             * Stop after pages read;
run;

proc sql;                                                         * Find 2-character state abbreviations;
create table statescodes as 
select z.statecode, s.* from states s 
left join (select distinct statename, statecode  from sashelp.zipcode) z on upcase(s.state)=upcase(z.statename);

*==> Create a US State Map of Hexagons (United Polygons of America!)
     US tile map coordinates from Matt Chambers drive.google.com/file/d/0B24QRdfcy_JJZFRKLUZzMHZYbG8/view;
     
data StateGrid;
input statecode : $2. row column@@;
row=-row;       * Flip map's y-coordinates (FL at bottom, ME at top!);
row=.85*row;    * Tighten up line spacing;
cards;
AK 0 0.5 ME 0 11.5        
VT 1 10 NH 1 11        
WA 2 1.5 MT 2 2.5 ND 2 3.5 MN 2 4.5 WI 2 5.5 MI 2 7.5 NY 2 9.5 MA 2 10.5 RI 2 11.5 
ID 3 2 WY 3 3 SD 3 4 IA 3 5 IL 3 6 IN 3 7 OH 3 8 PA 3 9 NJ 3 10 CT 3 11
OR 4 1.5 NV 4 2.5 CO 4 3.5 NE 4 4.5 MO 4 5.5 KY 4 6.5 WV 4 7.5 MD 4 8.5 DE 4 9.5 
CA 5 2 AZ 5 3 UT 5 4 KS 5 5 AR 5 6 TN 5 7 VA 5 8 NC 5 9 DC 5 12 
NM 6 3.5 OK 6 4.5 LA 6 5.5 MS 6 6.5 AL 6 7.5 SC 6 8.5    
TX 7 4 GA 7 8        
HI 8 0.5 FL 8 8.5 
;
*==> Calc x, y points of hexagons for each state
     (stackoverflow.com/questions/3436453/calculate-coordinates-of-a-regular-polygons-vertices);

data StatePolygons;
set StateGrid;
radius=.54;
id+1;
angle = 2 * constant("PI") / 6;
do i = 0 to 6;
  x = column + radius * sin(i * angle);
  y = row + radius * cos(i * angle);
  output;
end;

*==> Merge map state hexagon coordinates and employer info, categorize as Walmart/Amazon/Other;

proc sql;
create table statesEmpl as 
select sp.*, employer, employees, 
       trim(employer)||' '||strip(put(employees,comma7.)) as employerEmployees, 
       case when employer='Walmart' then 2 when employer='Amazon' then 1 else 0 end as walmartamznother 
from StatePolygons sp, statescodes sc where sp.statecode=sc.statecode
order by sp.id, sp.i;

*==> Draw the Map - Polygon and Text Plots;

ods graphics / antialias width=14in height=9.5in;
proc template;
define statgraph ustemplate;
  begingraph;
    layout overlayequated / border=false xaxisopts=(display=none) yaxisopts=(display=none) walldisplay=none;
      entry "The Largest Employer in Every State" / valign=top textattrs=(color=black size=24pt);
      polygonplot id=id x=x y=y / colorresponse=walmartamznother colormodel=(red green blue) fillattrs=(transparency=.5) dataskin=matte display=(FILL);
      textplot x=column y=row text=statecode / textattrs=(color=white size=32pt weight=bold);
      textplot x=column y=row text=employerEmployees / textattrs=(color=black size=8pt) splitchar=' ' splitpolicy=split;
      entry 'Source: "The Largest Employer in Every State", 24/7 Wall St., April 2021' / valign=bottom textattrs=(color=black);
    endlayout;
  endgraph;
end;
run;

proc sgrender data=statesEmpl template=ustemplate;
run;
4 REPLIES 4
himself
Quartz | Level 8
Wow love this, just curious on the development of the code, like how did you start?
tc
Lapis Lazuli | Level 10 tc
Lapis Lazuli | Level 10

Like many, I suppose, I get ideas from time misspent surfing the web, co-workers, and an occasional book. 😀 I recalled seeing more traditional maps people created in past years from the 24/7 Wall Street data, so thought I'd try to create a hexagon map for Labor Day using updated data. Here, it helped that I had some existing data scraping and hexagon map SAS code I could stitch together. Always easier to copy code than create it from scratch! 😀

Rick_SAS
SAS Super FREQ

Interesting. Thanks. It looks like the largest employer in many of the "other states" are either healthcare or education. It would be nice to use different colors for those groups.

tc
Lapis Lazuli | Level 10 tc
Lapis Lazuli | Level 10

LaborDay

 

Good idea! Combined this with another suggestion to provide a larger image (note: big image seems to be downscaled for SAS Communities - original file here).

 

*  Fun With SAS ODS Graphics, Labor Day Edition: The Largest Employer in Every State
   Data source: The Largest Employer in Every State 247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/;

*==> Scrape state info from HTML for pages 2-12 of 24/7 Wall St. story by Grant Suneson;
  
filename in url 'https://247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/' lrecl=1000;
data states;
length fvi state employer employeesx industry $ 255;
do pg=2 to 12;                                                    * Data starts on page 2 of 12;
  fvi='https://247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/'||compress(put(pg,2.))||'/';
  do while(not eof); 
    infile in url filevar=fvi end=eof;                            * Use file variable for URL (vary page from 2 to 12);
    input;                                                        * Grab state name?;
    i=index(_infile_,'<p><span class="title-bullet"><strong>');   
    if i>0 then do; 
      state=strip(scan(substr(_infile_,39),1,'<'));
      input;                                                      * Grab largest employer;
      i=index(_infile_,'<span class="value-bullet"><strong>> Largest employer:</strong>');
      employer=strip(scan(substr(_infile_,64),1,'<'));
      input;                                                      * Grab # employees;
      i=index(_infile_,'<span class="value-bullet"><strong>> Employees:</strong>');
      employeesX=strip(scan(substr(_infile_,57),1,'<'));
      employees=input(compress(employeesX,' ,'),15.);
      input;                                                      * Grab industry;
      i=index(_infile_,'<span class="value-bullet"><strong>> Industry:</strong>');
      industry=strip(scan(substr(_infile_,56),1,'<')); 
      output;
      end;
  end;
  eof=0;                                                          * Reset EOF for next file;
end;
stop;                                                             * Stop after pages read;
run;
 
proc sql;                                                         * Find 2-character state abbreviations;
create table statescodes as 
select z.statecode, s.* from states s 
left join (select distinct statename, statecode  from sashelp.zipcode) z on upcase(s.state)=upcase(z.statename);

*==> Create a US State Map of Hexagons (United Polygons of America!)
     US tile map coordinates from Matt Chambers drive.google.com/file/d/0B24QRdfcy_JJZFRKLUZzMHZYbG8/view;
     
data StateGrid;
input statecode : $2. row column@@;
row=-row;       * Flip map's y-coordinates (FL at bottom, ME at top!);
row=.85*row;    * Tighten up line spacing;
cards;
AK 0 0.5 ME 0 11.5        
VT 1 10 NH 1 11        
WA 2 1.5 MT 2 2.5 ND 2 3.5 MN 2 4.5 WI 2 5.5 MI 2 7.5 NY 2 9.5 MA 2 10.5 RI 2 11.5 
ID 3 2 WY 3 3 SD 3 4 IA 3 5 IL 3 6 IN 3 7 OH 3 8 PA 3 9 NJ 3 10 CT 3 11
OR 4 1.5 NV 4 2.5 CO 4 3.5 NE 4 4.5 MO 4 5.5 KY 4 6.5 WV 4 7.5 MD 4 8.5 DE 4 9.5 
CA 5 2 AZ 5 3 UT 5 4 KS 5 5 AR 5 6 TN 5 7 VA 5 8 NC 5 9 DC 5 12 
NM 6 3.5 OK 6 4.5 LA 6 5.5 MS 6 6.5 AL 6 7.5 SC 6 8.5    
TX 7 4 GA 7 8        
HI 8 0.5 FL 8 8.5 
;
*==> Calc x, y points of hexagons for each state
     (stackoverflow.com/questions/3436453/calculate-coordinates-of-a-regular-polygons-vertices);

data StatePolygons;
set StateGrid;
radius=.54;
id+1;
angle = 2 * constant("PI") / 6;
do i = 0 to 6;
  x = column + radius * sin(i * angle);
  y = row + radius * cos(i * angle);
  output;
end;

*==> Merge map state hexagon coordinates and employer info, categorize as Walmart/Amazon/Other;

proc sql;
create table statesEmpl as 
select sp.*, employer, employees, 
       trim(tranwrd(translate(employer,' ','-'),'New York City','NYC'))||' '||strip(put(employees,comma7.)) as employerEmployees, 
       case when employer='Walmart' then 4 when employer='Amazon' then 3  
            when industry='Education' then 2 when industry='Health care' then 1 else 0 end as employerColor 
from StatePolygons sp, statescodes sc where sp.statecode=sc.statecode
order by sp.id, sp.i;

*==> Draw the Map - Polygon and Text Plots;

ods listing image_dpi=200 gpath='~/';
ods graphics / reset antialias width=26in height=18in imagename='LaborDay';
proc template;
define statgraph ustemplate;
  begingraph;
    layout overlayequated / border=false xaxisopts=(display=none) yaxisopts=(display=none) walldisplay=none;
      entry "The Largest Employer in Every State" / valign=top textattrs=(color=black size=48pt);
      polygonplot id=id x=x y=y / colorresponse=employerColor colormodel=(red deeppink green indigo blue) fillattrs=(transparency=.5) dataskin=matte display=(FILL);
      textplot x=column y=row text=statecode / textattrs=(color=white size=52pt weight=bold);
      textplot x=column y=row text=employerEmployees / textattrs=(color=black size=14pt) splitchar=' ' splitpolicy=split;
      entry 'Source: "The Largest Employer in Every State", 24/7 Wall St., April 2021' / valign=bottom textattrs=(size=16pt color=black);
    endlayout;
  endgraph;
end;
run;

proc sgrender data=statesEmpl template=ustemplate;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1095 views
  • 9 likes
  • 3 in conversation