_Altons_
Calcite | Level 5
Hi all,

I have to create an analysis data mart based on a complex query and retrieve the last 20 months of data. The information also has to be split by month.

So my question is whether to create 20 different data sets i.e. dsnameyyyymm or a single one with 19 versions using the gennum option.

I am keen to do the gennum one, but since I am working with EG and the V9 engine I am not sure whether this is the best option (I heard it'll be phased out at some point, but I'm not sure if this applies to V9 engines).

Any suggestions, or a different approach?

Thanks,

Alberto
Doc_Duke
Rhodochrosite | Level 12
Store the data in a single dataset with a variable that indicates the month. Then you can use a parameter (e.g. a macro variable) to subset for the particular analysis set that you want. That is much easier than managing 20 datasets.
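
A minimal sketch of that single-dataset approach, assuming a numeric month variable named yyyymm and a macro variable named month. Both names, the WORK library, and the toy data are hypothetical and only there to make the example runnable:

```sas
/* Toy stand-in for the data mart: one row set per month (hypothetical data) */
data work.mart;
  do yyyymm = 202401 to 202403;
    do id = 1 to 2;
      amount = id * 10;
      output;
    end;
  end;
run;

/* Parameter that selects the month to analyse */
%let month = 202402;

/* Subset the single dataset for that month */
data work.analysis;
  set work.mart;
  where yyyymm = &month;
run;
```

With around 10 million observations per month, an index on the month variable may be worth testing, since a WHERE clause can use it to avoid reading the whole table.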
sbb
Lapis Lazuli | Level 10
Within SAS Base, I humbly disagree, especially considering a possible data/observation count scaling issue. You may want to consider using SAS PROC DATASETS and its AGE processing to create and maintain a defined set of "cycles". Of course, if the cycle location of an observation (for a given MONTH or whatever selection you may use) is undetermined, then AGE / cycles will likely not work effectively, and a single file with a WHERE statement for filtering will be more applicable.

Regardless, this type of "master" file/cycle approach hopefully warrants a "frequent" backup copy, in case a recovery/restore is needed?
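
As a rough illustration of AGE processing, here is a sketch with three cycles instead of twenty. The WORK library and the names dst_new and dst01-dst03 are hypothetical, and the one-row datasets are placeholders for the real monthly extracts:

```sas
/* Existing cycles: dst01 is the most recent month, dst03 the oldest */
data work.dst01; month = 202403; run;
data work.dst02; month = 202402; run;
data work.dst03; month = 202401; run;

/* Newly built month that should become the first cycle */
data work.dst_new; month = 202404; run;

proc datasets library=work nolist;
  /* dst_new -> dst01, dst01 -> dst02, dst02 -> dst03; the old dst03 is deleted */
  age dst_new dst01-dst03;
quit;
```

AGE only renames the members and deletes the last one, so no observations are copied, which matters when each cycle holds millions of rows.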

Scott Barry
SBBWorks, Inc.
_Altons_
Calcite | Level 5
Thanks for the replies!

Scott,

The AGE statement does the same kind of job as gennum, but rather than having #001 to #020 I will have dst01, dst02, ... dst20. The AGE statement shifts the data sets so that I keep the last 20 months (in my case): dst20, which would otherwise be holding the 21st month of data, is deleted, all data sets are shifted by one, and dst01 now holds the most recent month while dst20 holds the 20th month.

Am I right?

I forgot to mention that each data set has 10 million observations on average, which is why I was thinking of the gennum option.

Are there any SUGI/SGF papers you would recommend?
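
For comparison, a minimal sketch of the generation data set route. Note that the data set option that requests versions is GENMAX=, while GENNUM= is used to reference a particular version. The WORK library, the name mart, and the one-row data are hypothetical:

```sas
/* First creation: keep up to 20 versions (the base plus 19 historical) */
data work.mart(genmax=20);
  month = 202401;
run;

/* Each later replacement of the data set creates a new generation */
data work.mart;
  month = 202402;
run;

/* Current version */
proc print data=work.mart;
run;

/* Previous version, referenced relative to the current one */
proc print data=work.mart(gennum=-1);
run;
```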

Thanks,

Alberto
sbb
Lapis Lazuli | Level 10
It's totally under your control: a given cycle, 00 through "nn", will contain whatever data you decide. In my experience the 01 cycle contains the prior month's data, as explained.

Scott Barry
SBBWorks, Inc.
