The Largest Employer in Every State
Leveraged some old code and scraped some data from a 24/7 Wall St. story that ran earlier this year to cobble together a hexagon tile map of the largest employers in each state. Happy Labor Day, all!
* Fun With SAS ODS Graphics, Labor Day Edition: The Largest Employer in Every State
Data source: The Largest Employer in Every State 247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/;
*==> Scrape state info from HTML for pages 2-12 of 24/7 Wall St. story by Grant Suneson;
filename in url 'https://247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/' lrecl=1000;
data states;
length fvi state employer employeesx industry $ 255;
do pg=2 to 12; * Data starts on page 2 of 12;
fvi='https://247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/'||compress(put(pg,2.))||'/';
do while(not eof);
infile in url filevar=fvi end=eof; * Use file variable for URL (vary page from 2 to 12);
input; * Grab state name?;
i=index(_infile_,'<p><span class="title-bullet"><strong>');
if i>0 then do;
state=strip(scan(substr(_infile_,39),1,'<'));
input; * Grab largest employer;
i=index(_infile_,'<span class="value-bullet"><strong>> Largest employer:</strong>');
employer=strip(scan(substr(_infile_,64),1,'<'));
input; * Grab # employees;
i=index(_infile_,'<span class="value-bullet"><strong>> Employees:</strong>');
employeesX=strip(scan(substr(_infile_,57),1,'<'));
employees=input(compress(employeesX,' ,'),15.);
input; * Grab industry;
i=index(_infile_,'<span class="value-bullet"><strong>> Industry:</strong>');
industry=strip(scan(substr(_infile_,56),1,'<'));
output;
end;
end;
eof=0; * Reset EOF for next file;
end;
stop; * Stop after pages read;
run;
proc sql; * Find 2-character state abbreviations;
create table statescodes as
select z.statecode, s.* from states s
left join (select distinct statename, statecode from sashelp.zipcode) z on upcase(s.state)=upcase(z.statename);
*==> Create a US State Map of Hexagons (United Polygons of America!)
US tile map coordinates from Matt Chambers drive.google.com/file/d/0B24QRdfcy_JJZFRKLUZzMHZYbG8/view;
data StateGrid;
input statecode : $2. row column@@;
row=-row; * Flip map's y-coordinates (FL at bottom, ME at top!);
row=.85*row; * Tighten up line spacing;
cards;
AK 0 0.5 ME 0 11.5
VT 1 10 NH 1 11
WA 2 1.5 MT 2 2.5 ND 2 3.5 MN 2 4.5 WI 2 5.5 MI 2 7.5 NY 2 9.5 MA 2 10.5 RI 2 11.5
ID 3 2 WY 3 3 SD 3 4 IA 3 5 IL 3 6 IN 3 7 OH 3 8 PA 3 9 NJ 3 10 CT 3 11
OR 4 1.5 NV 4 2.5 CO 4 3.5 NE 4 4.5 MO 4 5.5 KY 4 6.5 WV 4 7.5 MD 4 8.5 DE 4 9.5
CA 5 2 AZ 5 3 UT 5 4 KS 5 5 AR 5 6 TN 5 7 VA 5 8 NC 5 9 DC 5 12
NM 6 3.5 OK 6 4.5 LA 6 5.5 MS 6 6.5 AL 6 7.5 SC 6 8.5
TX 7 4 GA 7 8
HI 8 0.5 FL 8 8.5
;
*==> Calc x, y points of hexagons for each state
(stackoverflow.com/questions/3436453/calculate-coordinates-of-a-regular-polygons-vertices);
data StatePolygons;
set StateGrid;
radius=.54;
id+1;
angle = 2 * constant("PI") / 6;
do i = 0 to 6;
x = column + radius * sin(i * angle);
y = row + radius * cos(i * angle);
output;
end;
*==> Merge map state hexagon coordinates and employer info, categorize as Walmart/Amazon/Other;
proc sql;
create table statesEmpl as
select sp.*, employer, employees,
trim(employer)||' '||strip(put(employees,comma7.)) as employerEmployees,
case when employer='Walmart' then 2 when employer='Amazon' then 1 else 0 end as walmartamznother
from StatePolygons sp, statescodes sc where sp.statecode=sc.statecode
order by sp.id, sp.i;
*==> Draw the Map - Polygon and Text Plots;
ods graphics / antialias width=14in height=9.5in;
proc template;
define statgraph ustemplate;
begingraph;
layout overlayequated / border=false xaxisopts=(display=none) yaxisopts=(display=none) walldisplay=none;
entry "The Largest Employer in Every State" / valign=top textattrs=(color=black size=24pt);
polygonplot id=id x=x y=y / colorresponse=walmartamznother colormodel=(red green blue) fillattrs=(transparency=.5) dataskin=matte display=(FILL);
textplot x=column y=row text=statecode / textattrs=(color=white size=32pt weight=bold);
textplot x=column y=row text=employerEmployees / textattrs=(color=black size=8pt) splitchar=' ' splitpolicy=split;
entry 'Source: "The Largest Employer in Every State", 24/7 Wall St., April 2021' / valign=bottom textattrs=(color=black);
endlayout;
endgraph;
end;
run;
proc sgrender data=statesEmpl template=ustemplate;
run;
Like many, I suppose, I get ideas from time misspent surfing the web, co-workers, and an occasional book. 😀 I recalled seeing more traditional maps people created in past years from the 24/7 Wall Street data, so thought I'd try to create a hexagon map for Labor Day using updated data. Here, it helped that I had some existing data scraping and hexagon map SAS code I could stitch together. Always easier to copy code than create it from scratch! 😀
Interesting. Thanks. It looks like the largest employer in many of the "other states" are either healthcare or education. It would be nice to use different colors for those groups.
Good idea! Combined this with another suggestion to provide a larger image (note: big image seems to be downscaled for SAS Communities - original file here).
* Fun With SAS ODS Graphics, Labor Day Edition: The Largest Employer in Every State
Data source: The Largest Employer in Every State 247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/;
*==> Scrape state info from HTML for pages 2-12 of 24/7 Wall St. story by Grant Suneson;
filename in url 'https://247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/' lrecl=1000;
data states;
length fvi state employer employeesx industry $ 255;
do pg=2 to 12; * Data starts on page 2 of 12;
fvi='https://247wallst.com/special-report/2021/04/08/the-largest-employer-in-every-state-3/'||compress(put(pg,2.))||'/';
do while(not eof);
infile in url filevar=fvi end=eof; * Use file variable for URL (vary page from 2 to 12);
input; * Grab state name?;
i=index(_infile_,'<p><span class="title-bullet"><strong>');
if i>0 then do;
state=strip(scan(substr(_infile_,39),1,'<'));
input; * Grab largest employer;
i=index(_infile_,'<span class="value-bullet"><strong>> Largest employer:</strong>');
employer=strip(scan(substr(_infile_,64),1,'<'));
input; * Grab # employees;
i=index(_infile_,'<span class="value-bullet"><strong>> Employees:</strong>');
employeesX=strip(scan(substr(_infile_,57),1,'<'));
employees=input(compress(employeesX,' ,'),15.);
input; * Grab industry;
i=index(_infile_,'<span class="value-bullet"><strong>> Industry:</strong>');
industry=strip(scan(substr(_infile_,56),1,'<'));
output;
end;
end;
eof=0; * Reset EOF for next file;
end;
stop; * Stop after pages read;
run;
proc sql; * Find 2-character state abbreviations;
create table statescodes as
select z.statecode, s.* from states s
left join (select distinct statename, statecode from sashelp.zipcode) z on upcase(s.state)=upcase(z.statename);
*==> Create a US State Map of Hexagons (United Polygons of America!)
US tile map coordinates from Matt Chambers drive.google.com/file/d/0B24QRdfcy_JJZFRKLUZzMHZYbG8/view;
data StateGrid;
input statecode : $2. row column@@;
row=-row; * Flip map's y-coordinates (FL at bottom, ME at top!);
row=.85*row; * Tighten up line spacing;
cards;
AK 0 0.5 ME 0 11.5
VT 1 10 NH 1 11
WA 2 1.5 MT 2 2.5 ND 2 3.5 MN 2 4.5 WI 2 5.5 MI 2 7.5 NY 2 9.5 MA 2 10.5 RI 2 11.5
ID 3 2 WY 3 3 SD 3 4 IA 3 5 IL 3 6 IN 3 7 OH 3 8 PA 3 9 NJ 3 10 CT 3 11
OR 4 1.5 NV 4 2.5 CO 4 3.5 NE 4 4.5 MO 4 5.5 KY 4 6.5 WV 4 7.5 MD 4 8.5 DE 4 9.5
CA 5 2 AZ 5 3 UT 5 4 KS 5 5 AR 5 6 TN 5 7 VA 5 8 NC 5 9 DC 5 12
NM 6 3.5 OK 6 4.5 LA 6 5.5 MS 6 6.5 AL 6 7.5 SC 6 8.5
TX 7 4 GA 7 8
HI 8 0.5 FL 8 8.5
;
*==> Calc x, y points of hexagons for each state
(stackoverflow.com/questions/3436453/calculate-coordinates-of-a-regular-polygons-vertices);
data StatePolygons;
set StateGrid;
radius=.54;
id+1;
angle = 2 * constant("PI") / 6;
do i = 0 to 6;
x = column + radius * sin(i * angle);
y = row + radius * cos(i * angle);
output;
end;
*==> Merge map state hexagon coordinates and employer info, categorize as Walmart/Amazon/Other;
proc sql;
create table statesEmpl as
select sp.*, employer, employees,
trim(tranwrd(translate(employer,' ','-'),'New York City','NYC'))||' '||strip(put(employees,comma7.)) as employerEmployees,
case when employer='Walmart' then 4 when employer='Amazon' then 3
when industry='Education' then 2 when industry='Health care' then 1 else 0 end as employerColor
from StatePolygons sp, statescodes sc where sp.statecode=sc.statecode
order by sp.id, sp.i;
*==> Draw the Map - Polygon and Text Plots;
ods listing image_dpi=200 gpath='~/';
ods graphics / reset antialias width=26in height=18in imagename='LaborDay';
proc template;
define statgraph ustemplate;
begingraph;
layout overlayequated / border=false xaxisopts=(display=none) yaxisopts=(display=none) walldisplay=none;
entry "The Largest Employer in Every State" / valign=top textattrs=(color=black size=48pt);
polygonplot id=id x=x y=y / colorresponse=employerColor colormodel=(red deeppink green indigo blue) fillattrs=(transparency=.5) dataskin=matte display=(FILL);
textplot x=column y=row text=statecode / textattrs=(color=white size=52pt weight=bold);
textplot x=column y=row text=employerEmployees / textattrs=(color=black size=14pt) splitchar=' ' splitpolicy=split;
entry 'Source: "The Largest Employer in Every State", 24/7 Wall St., April 2021' / valign=bottom textattrs=(size=16pt color=black);
endlayout;
endgraph;
end;
run;
proc sgrender data=statesEmpl template=ustemplate;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.