My workplace has recently switched over to SAS from SPSS. I am really confused about the utility of all the packages in SAS. We need to be able to do quite a bit of data manipulation, calculations (with constants), produce (pretty, which seems to be a problem in SAS) graphs and tables from our data sets as well as adding in comparative data into the chart/graph that is not in the data set.
One feature I loved in SPSS was the abilty to paste syntax, is this available in SAS enterprise guide? What about enterprise reporter or jmp?
Please help, I am so confused...
SAS is like a Swiss Army Knife. However, SAS is not just one product, but a collection of "products" or modules:
--Base SAS (sometimes called Foundation SAS)
--SAS/STAT (the collection of statistical procedures)
--SAS/Graph (the collection of graphic procedures)
--SAS/ETS (the collection of Econometric Time Series procedures)
--SAS/QC (the collection of Quality Control procedures)
--SAS/SHARE (the utility collection for sharing SAS files)
--SAS/ACCESS (a collection of "access" engines for connecting to different file sources, such as Adabas, Oracle, Sybase, DB2, Excel, IDMS, etc)
--there are many more, but these list the "biggies"
Then, there are some SAS-related applications
--Enterprise Guide is a GUI front end to SAS (SAS either sits on a local machine or on a server machine)
--JMP --a stand-alone product, that does visual data analysis and statistics (data are stored in JMP format datasets, but you can also read SAS dataset with JMP).
--Enterprise Reporter is a component of the Data Warehousing/Warehouse Administrator Solution
--etc, etc (many more "applications" like the Anti-Money Laundering Solution, Customer Relationship Management Solution, SAS Data Integration Studio...)
You can buy MANY different kinds of Swiss Army knives: one with a knife, tweezers and can opener; another with a knife, can opener and phillips screwdriver; a third with a knife, tweezers, toothpick and scissors; a fourth with a knife, scissors, can opener and screwdriver; etc, etc.
Well, you can get a license for MANY different assemblages of SAS products (most of the "solutions" have Base SAS included in the license):
--Base SAS only
--Base SAS and SAS/Stat
--Base SAS and SAS/QC
--Base SAS and SAS/Stat and SAS/Graph
--EG and Base SAS
--EG as part of the BI Platform with the whole client-server environment
--Data Warehouse Solution, including Warehouse Administrator and Enterprise Reporter
--BI Platform with Add-in for Microsoft Office, EG, Web Report Studio
--BI Platform with Add-in for Microsoft Office, EG, Web Report Studio, OLAP Cube Studio
So, let's not focus on product packages, but instead focus on what you say you need to do:
--Data Manipulation: Use PROC SQL; Use DATA step program;
--Calculations (with constants): Use PROC SQL; Use DATA step program; you can also perform calculations in PROC REPORT.
--produce graphs and tables: Use SAS/Graph for the graphs and other procedures for the tables (TABULATE, REPORT, PRINT, MEANS, FREQ, REG, GLM, etc, etc)
--add information into the chart/graph: To add information to a graph, you would use the ANNOTATE capability of SAS/Graph
--add information into a table: how you would add the information into a table would depend on your procedure of choice
--paste syntax: Almost all of the GUI front ends (SAS/Assist, Enterprise Guide, SAS Data Integration Studio) will allow you to capture and save syntax statements to accomplish a task. Most allow you to save the syntax as a .SAS file -- then you would cut and paste or modify the .SAS file to continue working.
--paste syntax in EG: In EG, you open a code window and you can cut and paste syntax into the code window.
--paste syntax in JMP: I am not sure about the ability to cut and paste syntax in JMP.
Since all of the solutions have some form of Base SAS included, then, let's talk about how Base SAS operates. There are 2 main components to Base SAS -- the "canned" procedures and the programming language facility. These are referred to as the SAS PROCs (short for procedures) and DATA Step (for the way you start a program).
SAS PROCS read -in- SAS data sets (a proprietary data storage format); perform some kind of processing (will differ based on the procedure; produce output tables (and/or graphs).
For example, in the SASHELP library, there is a dataset called SASHELP.SHOES. I can run these "canned" procedures on the dataset:[pre]
proc freq data=SASHELP.SHOES;
tables region product;
title 'Region and Product Information';
proc means data=SASHELP.SHOES maxdec=2 sum mean;
title 'Total and Average Sales for Each Region';
The first procedure will give me a list of each region and the count of rows (observations) for each region in the data set SASHELP.SHOES; and the second procedure will calculate summary statistics (sum and mean) for the SALES column (variable) for each region. I've just shown you the code way to run these 2 procedures.
But, if you were using Enterprise Guide, you could perform these same tasks by using point and click selections. Behind the scenes, EG would be building this same code -- which you could export, look at, cut and paste, use in other projects.
But, these are "canned" procedures. Neither PROC FREQ nor PROC MEANS would let you calculate a new column. Let's say that for some reason, you want to assume that shoe sales will increase by 25%-50% because of the new bouncy rubber soles that you are going to introduce. So to calculate projected sales, you need to multiply the current sales column/variable by 1.25 and 1.50 in order to come up with the projected numbers. Essentially, you want to create a copy of the input dataset and add 2 new calculated columns. There are not many procedures that would allow you to create new columns or variables. So, you either have to turn to PROC SQL to calculate your new variable or use a DATA step program. Here's what the code would look like in SQL procedure and in the programming language:[pre]
create table newshoesSQ as
sales * 1.25 as sales_at_25 ,
sales * 1.50 as sales_at_50
order by region;
sales_at_25 = sales * 1.25;
sales_at_50 = sales * 1.50;
Both of the above steps are doing the same thing, just in different ways. Both are reading in a SAS data set ("from sashelp.shoes" vs "set sashelp.shoes;" ); both are creating a NEW SAS dataset ("create table newshoesSQ" vs "data newshoesDT"); both are calculating a new variable/column (SQL uses the AS vs the Data step using an "assignment statement").
There are some differences between the 2 steps. The SQL procedure contains one long statement that ends with just 1 semi-colon, after "order by region;", while the data step program is composed of separate statements and each statement ends with a semi-colon.
If you were using EG, you could run a query that created a new column or columns using point and click selections and EG would build the PROC SQL code behind the scenes. If you preferred the Data step approach, then you could write your code in an EG code node and submit it to build you data set with the projected numbers.
If all the code looked pretty incomprehensible to you, then you might be better off starting with JMP or EG to do your data manipulation, data analysis and data presentation. If, on the other hand, the code didn't look that bad, then you might start out with learning the code.