One big data set or multiple data sets

Ronein · Posted 08-06-2018 06:05 AM

Hello

I want to ask a question about efficiency.

In my work every month a data set (raw data) of Loans balances is created.

What is more efficiency way:

option1- Every month create another raw data data set

option2- Every month add new rows to the previous month data set.

What do you think is a better way to do it?

Why?

You need to know that lots of queries will be written based on these raw data tables.

Thanks

Joe

Kurt_Bremser · Posted 08-06-2018 06:11 AM

If your queries/reports are mainly run for single months, it makes sense (performancewise) to keep the datasets separate.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

RW9 · Posted 08-06-2018 06:13 AM

Splitting same data is rarely a good idea. Think of it this way, if you have multiple datasets split by date you:

1) have lots more files with their own header block - takes more space

2) have to write messy code to process lots of different dataset names - takes more time an resources to do so and is less stable

3) have code which is opening, closing, writing etc. data lots of times, inefficient for processing power

Hence the conclusion I would come to is no, don't split the data out into sections, have one single table with a fixed simple structure which is very easy to program against, query etc. Just look at how the big databases do it, Oracle has a few tables with simple structures to hold lots of data.

As a bit of a disclaimer, this is general, if your looking at something more specific like big data, or disparate data then there are other models for such a thing. But in general, for most tasks, keep it simple, keep it generic.

One big data set or multiple data sets

Re: One big data set or multiple data sets

Re: One big data set or multiple data sets

Catch up on SAS Innovate 2026

One big data set or multiple data sets

Re: One big data set or multiple data sets

Re: One big data set or multiple data sets

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away