Andy_R
Calcite | Level 5

Hi, I am JUST getting started in learning to use SAS OnDemand. I am trying to understand why I should use SAS data steps and procs to create subsets of my data. As someone who already knows how to use Excel, doesn't it make more sense for me to just make csv files for each and every relationship I want to test, and then make a SAS data table for each one of those? I'm not doing anything super complicated: just running two-factor ANOVAs. Or am I missing something in my understanding of how to use SAS?

 

Thanks very much,

 

Andy_R

5 REPLIES
ballardw
Super User

One of the very powerful tools in SAS is BY group processing. You can do the analysis for each of your subgroups with the data in one dataset. Sort the data by the grouping variable(s) and add a BY statement referencing those variables in the ANOVA code.

For instance, suppose you are grouping data by State and County for your analysis. With separate files, that would mean creating one csv file for each of literally hundreds, if not thousands, of counties. In SAS, add a BY statement:

 

proc (whatever) data=dataset;
    by State County;
    <analysis statements>;
run;

 

You get separate output for each existing combination of state and county.
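
For the two-factor ANOVA case from the original question, the template might be filled in roughly as follows. This is only a sketch: the dataset work.trials and the variables Treatment, Dose, and Yield are made-up names for illustration, and PROC GLM is used here because it also handles unbalanced designs (PROC ANOVA is an alternative when the design is balanced).

```sas
/* Hypothetical dataset and variable names; substitute your own. */

/* BY-group processing requires the data to be sorted by the BY variables. */
proc sort data=work.trials out=work.trials_sorted;
    by State County;
run;

/* One two-factor ANOVA per State/County combination. */
proc glm data=work.trials_sorted;
    by State County;
    class Treatment Dose;                          /* the two factors */
    model Yield = Treatment Dose Treatment*Dose;   /* main effects plus interaction */
run;
quit;
```

The single sorted dataset replaces the per-county csv files; adding or removing a grouping variable is a one-line change to the two BY statements.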

 

I learned SAS before Excel existed and always feel crippled when trying to do anything in Excel, often because of just this behavior.

 

Andy_R
Calcite | Level 5

Thanks for your response!

 

So you feel it makes more sense to convert the giant data set (which had been entered in Excel) into one giant csv, import it as one giant table, and then run procs on that table from within SAS, using a BY statement for each separate analysis?

ballardw
Super User

Generally yes. If your data fits into a single Excel worksheet then it isn't considered "giant" by SAS. SAS can handle many more rows and columns than Excel.

 

A second thing that comes in handy is the WHERE statement (or the WHERE= dataset option), which selects records with specific characteristics.

Suppose after looking at the first analysis I realize that it included a bunch of products that weren't pertinent, or I want to concentrate on a selection of characteristics. In many procedures you could add a statement like:

Where productcode in ('ProductId1' 'ProductId2' 'ProductId3') AND Sales> 5000;

 

to the previous code. This again generates a lot of potential output, but restricted to the specified products and to records where the variable Sales is larger than 5000.
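
As a rough illustration of both forms, the sketch below uses PROC MEANS on a made-up dataset work.sales_sorted, assumed to be already sorted by State and County. The dataset name and the choice of PROC MEANS are assumptions for the example; the filter itself is the one quoted above.

```sas
/* Hypothetical dataset; assumed already sorted by State and County. */

/* Form 1: WHERE statement inside the procedure. */
proc means data=work.sales_sorted n mean std;
    where productcode in ('ProductId1' 'ProductId2' 'ProductId3') and Sales > 5000;
    by State County;
    var Sales;
run;

/* Form 2: the same filter written as a WHERE= dataset option. */
proc means data=work.sales_sorted(where=(productcode in ('ProductId1' 'ProductId2' 'ProductId3')
                                         and Sales > 5000))
           n mean std;
    by State County;
    var Sales;
run;
```

Either way the source table is left untouched, so there is no need to build and maintain a separate subset file for each question.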

Kurt_Bremser
Super User

Any data entered in Excel is at best "puny" from a SAS perspective.

Try entering 150 million rows with 2000 columns in Excel.

 

Subsetting/splitting data is much easier in SAS than in Excel. And, since you have code and logs, it is easier to document and easier to control.

 

Don't confuse a spreadsheet calculator with BI software.

 

Reeza
Super User
SAS should be able to import directly from the Excel file as well.
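
A minimal sketch of such an import with PROC IMPORT is shown below. The file path, sheet name, and output dataset name are placeholders, not values from this thread; in SAS OnDemand the workbook would first need to be uploaded to your home folder.

```sas
/* Placeholder path, sheet, and dataset names; adjust to your own files. */
proc import datafile="/home/your_user_id/mydata.xlsx"
    out=work.mydata
    dbms=xlsx
    replace;              /* overwrite work.mydata if it already exists */
    sheet="Sheet1";       /* worksheet to read */
    getnames=yes;         /* use the first row as variable names */
run;
```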
