Hello,
I was looking at the distribution of CEO salaries in my histogram:
*Histogram;
title "Figure 1. CEO Salary";
title2 "2010-2017";
PROC UNIVARIATE data=paper.ceo_firm;
var salary;
histogram salary/kernel;
label salary="Salary (thousands)";
run;
and noticed that there appear to be negative values, which obviously doesn't make much sense. To investigate further, I used:
*find min;
proc sql;
select * from paper.ceo_firm
having salary = min(salary);
quit;
But found that the minimum salary is 0. Why would this be?
I will attach a screenshot of the histogram and data, if you don't mind.
Any ideas? Thanks!
Your PROC SQL does not look right. It is missing a GROUP BY clause.
What do you expect the BY statement would be? I just want to see the lowest salaries in my data set.
by coname
I tried the following, but it never stopped running.
*find min;
proc sql;
select * from paper.ceo_firm
group by coname
having salary = min(salary);
quit;
If you do not want all the other columns, just do:
proc sql;
select coname, min(salary) as salary from paper.ceo_firm
group by coname;
quit;
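If the goal is only to see the lowest salaries across the whole data set, rather than one minimum per company, a sorted and truncated listing may be enough. This is a minimal sketch: the limit of 10 rows is an arbitrary choice for illustration, and it assumes coname and salary are the columns of interest.

```sas
/* List the 10 lowest non-missing salaries in the whole table.       */
/* OUTOBS=10 is an illustrative limit; SAS writes a warning to the   */
/* log when it truncates the output, which is expected here.         */
proc sql outobs=10;
  select coname, salary
    from paper.ceo_firm
    where salary is not missing   /* missing values would otherwise sort first */
    order by salary;
quit;
```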
@sastuck wrote:
I will attach a screenshot of the histogram and data, if you don't mind.
Any ideas? Thanks!
The histogram is NOT showing values less than 0. It is showing an axis that goes less than zero, which you can change. The zero values are in the leftmost bar, which straddles 0.
Oh, problem solved then. I'll accept your post as a solution. However, would you mind showing me how to truncate the x axis so that it stops at 0?
I don't believe you have that kind of control over the x-axis levels from PROC UNIVARIATE. You can however control the x-axis of a bar chart using either PROC GCHART or PROC SGPLOT.
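A rough sketch of the PROC SGPLOT route, assuming a bin width of 250 (a hypothetical value, pick whatever suits the data). In SGPLOT, BINSTART= gives the midpoint of the first bin, so setting it to half the bin width puts the left edge of the first bar at 0 instead of letting the bar straddle 0.

```sas
title "Figure 1. CEO Salary";
title2 "2010-2017";
proc sgplot data=paper.ceo_firm;
  /* binwidth=250 is an assumed width; binstart=125 is its half,   */
  /* so the first bin covers 0 to 250 rather than -125 to 125.     */
  histogram salary / binwidth=250 binstart=125;
  /* kernel density overlay, comparable to the KERNEL option in PROC UNIVARIATE */
  density salary / type=kernel;
  /* start the axis at 0; anything drawn below 0 is not shown */
  xaxis min=0 label="Salary (thousands)";
run;
title;  /* clear the titles afterwards */
```

The HISTOGRAM statement in PROC UNIVARIATE also accepts an ENDPOINTS= option for setting bin boundaries, which may be worth trying if you would rather stay in that procedure.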
I believe we already had one brief discussion about values of zero in your salary data. Why they are there remains a mystery to me.
PROC UNIVARIATE should also have produced a table of the extreme observations. The graph generates its interval tick marks from the values in your data. Why it chose to include -250 comes down to the default algorithm used to distribute tick marks for the type of graph displayed.
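To look at only that extremes table, something along these lines should work. It is a sketch: NEXTROBS=10 is an arbitrary choice (the default is 5), and using coname on the ID statement assumes that is the variable you want shown beside each extreme value.

```sas
/* Restrict the output to the extreme-observations table only. */
ods select ExtremeObs;
proc univariate data=paper.ceo_firm nextrobs=10;
  var salary;
  id coname;   /* label each extreme value with its company name */
run;
```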