Hello,
I was looking at the distribution of CEO salaries in my histogram:
*Histogram;
title "Figure 1. CEO Salary";
title2 "2010-2017";
PROC UNIVARIATE data=paper.ceo_firm;
var salary;
histogram salary/kernel;
label salary="Salary (thousands)";
run;
and noticed that there appear to be negative values, which obviously doesn't make much sense. To investigate further, I used:
*find min;
proc sql;
select * from paper.ceo_firm
having salary = min(salary);
quit;
But I found that the minimum salary is 0. Why would this be?
I will attach a screenshot of the histogram and data, if you don't mind.
Any ideas? Thanks!
Your PROC SQL does not seem right. It is missing a GROUP BY clause.
What do you expect the BY statement would be? I just want to see the lowest salaries in my data set.
by coname
I tried the following, but it never stopped running.
*find min;
proc sql;
select * from paper.ceo_firm
group by coname
having salary = min(salary);
quit;
If you do not want all the other columns, just do:
proc sql;
select coname, min(salary) as salary from paper.ceo_firm
group by coname;
quit;
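Note that grouping by coname returns one minimum per company. If the goal is instead the rows with the lowest salary in the whole data set, as asked earlier in the thread, one possible sketch uses a subquery. This is an illustrative alternative, not something posted by the participants; it only reuses the paper.ceo_firm table and salary column from the original question.

```sas
*rows holding the overall minimum salary;
proc sql;
  select *
    from paper.ceo_firm
    /* the subquery computes the single smallest salary in the table */
    where salary = (select min(salary) from paper.ceo_firm);
quit;
```

Because the minimum is computed once in the subquery, no GROUP BY or HAVING clause is needed.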
@sastuck wrote:
I will attach a screenshot of the histogram and data, if you don't mind.
Any ideas? Thanks!
The histogram is NOT showing values less than 0. It is showing an axis that goes less than zero, which you can change. The zero values are in the leftmost bar, which straddles 0.
Oh, problem solved then. I'll accept your post as a solution. However, would you mind showing me how to truncate the x axis so that it stops at 0?
I don't believe you have that kind of control over the x-axis levels from PROC UNIVARIATE. You can however control the x-axis of a bar chart using either PROC GCHART or PROC SGPLOT.
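As a rough sketch of the PROC SGPLOT route mentioned above: the bin width of 250 is an assumption (chosen only because the default axis showed a tick at -250), so adjust it to suit the data. BINSTART= gives the midpoint of the first bin, so 125 with a width of 250 is intended to make the first bar span 0 to 250, and XAXIS MIN=0 is intended to start the axis at zero.

```sas
title "Figure 1. CEO Salary";
title2 "2010-2017";
proc sgplot data=paper.ceo_firm;
  /* assumed bin width of 250; binstart= is the midpoint of the first bin */
  histogram salary / binstart=125 binwidth=250;
  /* kernel density overlay, similar to the KERNEL option in PROC UNIVARIATE */
  density salary / type=kernel;
  /* ask for the axis to begin at 0 rather than at a negative tick */
  xaxis min=0 label="Salary (thousands)";
run;
```

The kernel curve can still have some mass below zero, since the estimator does not know that salary is non-negative; the axis setting only affects what is displayed.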
I believe we already had one brief discussion about values of zero in your salary data. Why they are there remains a mystery to me.
PROC UNIVARIATE should also have produced a table with the extreme observations. The graph generates interval tick marks based on the values of your data. Why it chose to include -250 relates to the default algorithm used to distribute tick marks for the type of graph displayed.
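To look at just that extremes table, a minimal sketch is to restrict the ODS output before running the procedure. ExtremeObs is the ODS table name PROC UNIVARIATE uses for its extreme observations; the data set and variable are the ones from the original post.

```sas
/* keep only the extreme observations table from the next procedure */
ods select ExtremeObs;
proc univariate data=paper.ceo_firm;
  var salary;
run;
```

By default the table lists the five lowest and five highest values with their observation numbers, which should confirm whether 0 really is the smallest salary.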