Hello SAS community,
I am working on my descriptive statistics section of a term paper (undergrad) and I am looking for different creative ways to look at my data and learn more about it using proc sgplot, and proc univariate, and whatever other tools SAS offers. (So far I have summary statistics, a few histograms, and a scatterplot. Nothing too special.) One thing specifically I would like to do is be able to see an outlier in, say, a scatter plot, and then go to the dataset and find that observation. The only thing is, I don't want to manipulate the data itself for fear of accidentally altering it in some way. Is there a way to find these max or min values and view the data itself? All in all, I just want to get to know my data better and have a more thorough descriptive statistics section. Any help is appreciated. In case this is helpful, one of my data sets is firm performance (annual return, volume, etc.) and the other is CEO performance (salary, bonus, stock options, etc.) I am estimating the effect of changes in firm performance on CEO salary.
Thanks for your time!
-SAStuck
PROC UNIVARIATE with and ID statement should pretty well for a quick list.
proc univariate data=sashelp.class ;
id name ;
var height ;
run;
The UNIVARIATE Procedure
Variable: Height
Extreme Observations
----------Lowest--------- ---------Highest---------
Value Name Obs Value Name Obs
51.3 Joyce 11 66.5 Mary 14
56.3 Louise 13 66.5 William 19
56.5 Alice 2 67.0 Ronald 17
57.3 James 6 69.0 Alfred 1
57.5 Thomas 18 72.0 Philip 15
Finding a max is easy in proc sql:
proc sql;
select * from company
having volume = max(volume);
quit;
Hey,
Thanks for the help!
What am I missing here?
*find max;
proc sql data=paper.ceo_firm2;
select * from ticker
having vol = max(vol);
quit;here's the log:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
70
71 proc sql data=paper.ceo_firm2;
____
22
76
ERROR 22-322: Syntax error, expecting one of the following: ;, CHECK, CODEGEN, CONSTDATETIME, DICTDIAG, DOUBLE, DQUOTE, ERRORSTOP,
EXEC, EXITCODE, FEEDBACK, FLOW, INOBS, IPASSTHRU, LOOPS, NOCHECK, NOCODEGEN, NOCONSTDATETIME, NODICTDIAG, NODOUBLE,
NOERRORSTOP, NOEXEC, NOFEEDBACK, NOFLOW, NOIPASSTHRU, NONUMBER, NOPRINT, NOPROMPT, NOREMERGE, NOSORTMSG, NOSTIMER,
NOSTOPONTRUNC, NOTHREADS, NOWARNRECURS, NUMBER, OUTOBS, PRINT, PROMPT, REDUCEPUT, REDUCEPUTOBS, REDUCEPUTVALUES,
REMERGE, SORTMSG, SORTSEQ, STIMER, STOPONTRUNC, THREADS, UBUFSIZE, UNDO_POLICY, WARNRECURS.
ERROR 76-322: Syntax error, statement will be ignored.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
72 select * from ticker
73 having vol = max(vol);
ERROR: File WORK.TICKER.DATA does not exist.
74 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 57.43k
OS Memory 26276.00k
Timestamp 04/18/2018 01:14:05 PM
Step Count 29 Switch Count 0
Page Faults 0
Page Reclaims 70
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8
75
76 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
88 Thanks!
If you want to create a new dataset with proc sql, you need the create table statement:
proc sql;
create table paper.ceo_firm2 as
select * from ticker
having vol = max(vol);
quit;
What if I don't want to create a new table but just want to look at the table I already have? I only included that
paper.ceo_firm2because I assumed I needed to give SAS a place to look for the data I wanted to look at. Because I don't see the need to create a new table for this purpose.
Then use your existing dataset in the "from" part of the SQL select, and view the result in the Results window. And omit the create table .. as, of course.
Thanks! This is the code that worked:
*find max;
proc sql;
select * from paper.ceo_firm2
having vol = max(vol);
quit;
Welcome to the wonderful world of SAS SQL!
A google search for "sas sql summary functions" will reveal lots of other useful functions that can be used along with max().
PROC UNIVARIATE with and ID statement should pretty well for a quick list.
proc univariate data=sashelp.class ;
id name ;
var height ;
run;
The UNIVARIATE Procedure
Variable: Height
Extreme Observations
----------Lowest--------- ---------Highest---------
Value Name Obs Value Name Obs
51.3 Joyce 11 66.5 Mary 14
56.3 Louise 13 66.5 William 19
56.5 Alice 2 67.0 Ronald 17
57.3 James 6 69.0 Alfred 1
57.5 Thomas 18 72.0 Philip 15
Wow, this works great! Thanks!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Get started using SAS Studio to write, run and debug your SAS programs.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.