IMAGE (reduced to 25% size of original, resulting in loss of resolution)
DESCRIPTION
Whether you use data for good like Jake Porway, or you're just a nosy busybody like Gladys Kravitz, Open Data is likely to pique your interest.
Take the City of Chicago Salary Data, for example. Read it into SAS, and you can remix new and old visualization tips from Graphically Speaking bloggers Prashant Hebbar and Sanjay Matange to start gaining some insight into the data.
For example, crime may not pay, but it certainly costs. Big time. And while a cop's salary might look attractive at first glance, you can make even better coin as an IT staffer - without putting your life in jeopardy. And last, but not least, you're much better off financially being the Aviation Chief than the Mayor!
SAS CODE
filename chicago url 'http://data.cityofchicago.org/resource/tt4n-kn4t.csv';
data chicago; * Read salary data;
infile chicago dsd dlm=',' lrecl=2000 truncover firstobs=2;
input Department : $30. Salary Title : $30. Name : $30.;
format salary dollar8.;
run;
proc sql; * Highest paid in each department?;
create table maxavgsalary as
select department, max(salary) as maxsalary, avg(salary) as avgsalary
from chicago group by 1;
create table maxavgsalaryname as
select c.department, max(maxsalary) as maxsalary, max(avgsalary) as avgsalary,
max(name) as maxname
from chicago c, maxavgsalary m
where c.department=m.department and c.salary=m.maxsalary group by 1;
create table chicagomaxsalary as
select c.department, c.salary, 1 as Employees format=comma6., m.avgsalary,
m.maxname, m.maxsalary format=dollar8.
from chicago c, maxavgsalaryname m
where c.department=m.department and c.department<>""
order by 4 desc; * Column 4 is avgsalary;
quit;
* Create plot;
ods graphics on / reset imagename="ChicagoSalaries" border=off
width=14in height=8.5in antialias ANTIALIASMAX=32100 ;
ods listing gpath='/folders/myfolders' image_dpi=300;
proc sgplot data=chicagomaxsalary noautolegend;
title "City of Chicago: Annual Salary Distribution by Department";
scatter y=department x=salary;
hbox Salary / category=Department noFill noOutliers meanattrs=(color=red symbol=diamondfilled);
xaxis grid display=(nolabel) max=300000; * Salary is on the x-axis, so the cap belongs here;
yaxis discreteorder=data display=(nolabel);
yaxistable employees / stat=sum location=inside position=left label="#";
yaxistable salary / stat=mean location=inside position=left label="Avg Salary";
yaxistable maxname / location=inside position=right label="Highest Paid";
yaxistable maxsalary / stat=mean location=inside position=right label="$";
run;
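To check the description's claims about who earns the most (the Aviation Chief versus the Mayor, for instance), a quick listing of the top salaries can help. This is only a minimal sketch: it assumes the chicago data set read above is still in the WORK library (run it before the OT code below, which reuses that name), and the output data set name topsalaries and the 10-row cutoff are arbitrary choices that are not part of the original post.

```sas
* Sort a copy of the salary data from highest to lowest salary;
proc sort data=chicago out=topsalaries; * topsalaries is a hypothetical name;
  by descending salary;
run;

* List the ten highest-paid employees with their title and department;
proc print data=topsalaries(obs=10) noobs;
  var name title department salary;
  format salary dollar10.;
run;
```

Changing the obs= value widens or narrows the listing.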
OT OPPORTUNITIES
Btw, some departments also have opportunities for Overtime and Supplemental Earnings.
CODE
filename chicago url 'http://data.cityofchicago.org/resource/wvhw-8n2j.csv';
data chicago; * Read OT/Supplemental $;
infile chicago dsd dlm=',' lrecl=2000 truncover firstobs=2;
input aprOT augOT decOT Department : $40. Name : $40. febOT janOT julOT junOT
marOT mayOT novOT octOT sepOT Title : $40. Overtime;
format Overtime dollar8.;
run;
proc sql; * Highest paid OT in each department?;
create table maxavgovertime as
select department, max(overtime) as maxovertime, avg(overtime) as avgovertime
from chicago group by 1;
create table maxavgovertimename as
select c.department, max(maxovertime) as maxovertime, max(avgovertime) as avgovertime,
max(title) as maxtitle
from chicago c, maxavgovertime m
where c.department=m.department and c.overtime=m.maxovertime group by 1;
create table chicagomaxovertime as
select c.department, c.overtime, 1 as Employees format=comma6., m.avgovertime,
m.maxtitle, m.maxovertime format=dollar8.
from chicago c, maxavgovertimename m
where c.department=m.department and c.department<>""
order by 4 desc; * Column 4 is avgovertime;
quit;
* Create plot;
ods graphics on / reset imagename="ChicagoOT" border=off
width=14in height=8.5in antialias ANTIALIASMAX=32100 ;
ods listing gpath='/folders/myfolders' image_dpi=300;
proc sgplot data=chicagomaxovertime noautolegend;
title "City of Chicago: 2015 Overtime/Supplemental Pay by Department";
scatter y=department x=overtime;
hbox overtime / category=Department noFill noOutliers meanattrs=(color=red symbol=diamondfilled);
xaxis grid display=(nolabel);
yaxis discreteorder=data display=(nolabel);
yaxistable employees / stat=sum location=inside position=left label="#";
yaxistable overtime / stat=mean location=inside position=left label="Avg $";
yaxistable overtime / stat=sum location=inside position=left label="Tot $";
yaxistable maxtitle/ location=inside position=right label="Max OT";
yaxistable maxovertime / stat=mean location=inside position=right label="$";
run;
Very nice. With SAS 9.4, you can now use the y-axis option FitPolicy=split to split long y-axis tick values across multiple lines.
proc sgplot data=sashelp.heart;
hbar deathcause / response=systolic stat=mean;
yaxis fitpolicy=split;
run;
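The same option could presumably be applied to the department axis of the overtime plot above. The following is a minimal sketch rather than a tested recipe: it assumes the chicagomaxovertime table from the OT code exists, and that the department names contain blanks at which SAS can split the tick values (an assumption about the data, not something the post confirms).

```sas
* Sketch: overtime box plot with long department names split across lines;
proc sgplot data=chicagomaxovertime noautolegend;
  hbox overtime / category=department nofill nooutliers;
  xaxis grid display=(nolabel);
  * fitpolicy=split asks SAS to split tick values only where they do not fit;
  yaxis discreteorder=data display=(nolabel) fitpolicy=split;
run;
```

If the names need to break at some other character, the axis SPLITCHAR= option is the place to look.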