Hello SAS community,
I am working on my descriptive statistics section of a term paper (undergrad) and I am looking for different creative ways to look at my data and learn more about it using proc sgplot, and proc univariate, and whatever other tools SAS offers. (So far I have summary statistics, a few histograms, and a scatterplot. Nothing too special.) One thing specifically I would like to do is be able to see an outlier in, say, a scatter plot, and then go to the dataset and find that observation. The only thing is, I don't want to manipulate the data itself for fear of accidentally altering it in some way. Is there a way to find these max or min values and view the data itself? All in all, I just want to get to know my data better and have a more thorough descriptive statistics section. Any help is appreciated. In case this is helpful, one of my data sets is firm performance (annual return, volume, etc.) and the other is CEO performance (salary, bonus, stock options, etc.) I am estimating the effect of changes in firm performance on CEO salary.
Thanks for your time!
-SAStuck
PROC UNIVARIATE with and ID statement should pretty well for a quick list.
proc univariate data=sashelp.class ;
id name ;
var height ;
run;
The UNIVARIATE Procedure Variable: Height Extreme Observations ----------Lowest--------- ---------Highest--------- Value Name Obs Value Name Obs 51.3 Joyce 11 66.5 Mary 14 56.3 Louise 13 66.5 William 19 56.5 Alice 2 67.0 Ronald 17 57.3 James 6 69.0 Alfred 1 57.5 Thomas 18 72.0 Philip 15
Finding a max is easy in proc sql:
proc sql;
select * from company
having volume = max(volume);
quit;
Hey,
Thanks for the help!
What am I missing here?
*find max;
proc sql data=paper.ceo_firm2;
select * from ticker
having vol = max(vol);
quit;
here's the log:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 70 71 proc sql data=paper.ceo_firm2; ____ 22 76 ERROR 22-322: Syntax error, expecting one of the following: ;, CHECK, CODEGEN, CONSTDATETIME, DICTDIAG, DOUBLE, DQUOTE, ERRORSTOP, EXEC, EXITCODE, FEEDBACK, FLOW, INOBS, IPASSTHRU, LOOPS, NOCHECK, NOCODEGEN, NOCONSTDATETIME, NODICTDIAG, NODOUBLE, NOERRORSTOP, NOEXEC, NOFEEDBACK, NOFLOW, NOIPASSTHRU, NONUMBER, NOPRINT, NOPROMPT, NOREMERGE, NOSORTMSG, NOSTIMER, NOSTOPONTRUNC, NOTHREADS, NOWARNRECURS, NUMBER, OUTOBS, PRINT, PROMPT, REDUCEPUT, REDUCEPUTOBS, REDUCEPUTVALUES, REMERGE, SORTMSG, SORTSEQ, STIMER, STOPONTRUNC, THREADS, UBUFSIZE, UNDO_POLICY, WARNRECURS. ERROR 76-322: Syntax error, statement will be ignored. NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements. 72 select * from ticker 73 having vol = max(vol); ERROR: File WORK.TICKER.DATA does not exist. 74 quit; NOTE: The SAS System stopped processing this step because of errors. NOTE: PROCEDURE SQL used (Total process time): real time 0.00 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 57.43k OS Memory 26276.00k Timestamp 04/18/2018 01:14:05 PM Step Count 29 Switch Count 0 Page Faults 0 Page Reclaims 70 Page Swaps 0 Voluntary Context Switches 0 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 8 75 76 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 88
Thanks!
If you want to create a new dataset with proc sql, you need the create table statement:
proc sql;
create table paper.ceo_firm2 as
select * from ticker
having vol = max(vol);
quit;
What if I don't want to create a new table but just want to look at the table I already have? I only included that
paper.ceo_firm2
because I assumed I needed to give SAS a place to look for the data I wanted to look at. Because I don't see the need to create a new table for this purpose.
Then use your existing dataset in the "from" part of the SQL select, and view the result in the Results window. And omit the create table .. as, of course.
Thanks! This is the code that worked:
*find max;
proc sql;
select * from paper.ceo_firm2
having vol = max(vol);
quit;
Welcome to the wonderful world of SAS SQL!
A google search for "sas sql summary functions" will reveal lots of other useful functions that can be used along with max().
PROC UNIVARIATE with and ID statement should pretty well for a quick list.
proc univariate data=sashelp.class ;
id name ;
var height ;
run;
The UNIVARIATE Procedure Variable: Height Extreme Observations ----------Lowest--------- ---------Highest--------- Value Name Obs Value Name Obs 51.3 Joyce 11 66.5 Mary 14 56.3 Louise 13 66.5 William 19 56.5 Alice 2 67.0 Ronald 17 57.3 James 6 69.0 Alfred 1 57.5 Thomas 18 72.0 Philip 15
Wow, this works great! Thanks!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.