
Hi,

I think it would be beneficial to have a Generation Data Group (GDG) style of dataset in SAS, similar to the one used on the mainframe.

  • Generation Data Groups (GDGs) are groups of datasets related to each other by a common name.
  • The common name is referred to as the GDG base, and each dataset associated with the base is called a GDG version (generation).
  • You can set a limit on the number of related files (generations).
  • All generations of a dataset are easy to keep track of.
  • Any particular generation can be referenced easily.

We use a lot of datasets that are regenerated daily, monthly, yearly, etc.

 

Suppose I need to create daily transaction data for the month of May.

If Generation Data Groups were available, I would define the base as "Tran_May2024".

Then I would simply create a dataset each day as shown below.

 

data Tran_May2024(+1);   /* proposed syntax: creates the next available version */
  set work_tran_table;
run;

 

If I need to point to the current or an earlier version, I would reference the dataset as follows.

Tran_May2024(0) - Latest Version 

Tran_May2024(-1) - Previous Version 

Tran_May2024(-2) - 2 versions back

 

If I need the whole month's data, I would just refer to the base.

 

data work_tran_may;
  set Tran_May2024;   /* proposed syntax: reads all versions available in the base */
run;

 

There could be options like the ones below.

 

LIMIT – To limit the maximum number of generations.

 

NOEMPTY – Uncatalog only the oldest generation in the GDG when the limit is reached.
EMPTY – Uncatalog all generations when the limit is reached.

SCRATCH – Physically delete the dataset (generation) that is uncataloged.
NOSCRATCH – Do not physically delete the dataset (generation) that is uncataloged.

 

This would help a lot when we create many datasets in a repeatable fashion.

 

Thanks,

Ravi

2 Comments
ballardw
Super User

SAS has generation data sets.

Search the online help for the phrase: "Generation Data Sets"

 

The syntax to use them may differ, but they are there.
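
As a rough illustration of how the request above might map onto SAS generation data sets, here is a minimal sketch using the GENMAX= and GENNUM= data set options. It assumes a permanent library named tranlib is already assigned (a hypothetical name) and that work.work_tran_table exists, as in the original post. The limit of 31 is only an example value.

```sas
/* Assumption: libref TRANLIB is already assigned; the name is hypothetical. */

/* First creation: GENMAX= turns on generations and caps how many versions are kept. */
data tranlib.tran_may2024 (genmax=31);
    set work.work_tran_table;
run;

/* Each later replacement makes the new data the base version
   and moves the previous copy into the historical versions. */
data tranlib.tran_may2024;
    set work.work_tran_table;
run;

/* GENNUM=0 is the current (base) version; negative values count back from it. */
data work.tran_previous;
    set tranlib.tran_may2024 (gennum=-1);
run;

/* Reading the base name alone returns only the current version,
   so combining versions means listing each one explicitly. */
data work.tran_last_three;
    set tranlib.tran_may2024 (gennum=-2)
        tranlib.tran_may2024 (gennum=-1)
        tranlib.tran_may2024;
run;
```

Note the difference from the mainframe-style proposal: there is no (+1) reference, because replacing the data set creates the new generation, and referring to the base name does not concatenate all generations.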

Patrick
Opal | Level 21

See here: Understanding Generation Data Sets

 

I have never had a use case where I considered generation datasets advantageous. I find it much easier to work with tables that have a date component in the name, like base_name_<yyyymmdd>.

If you then need all the data for a month you just use syntax like: set Tran_May_202405:;
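
A minimal sketch of that naming approach, assuming daily tables in WORK with a yyyymmdd suffix (the table names and dates here are hypothetical) and the work.work_tran_table input from the original post:

```sas
/* Hypothetical daily loads: one table per day, date embedded in the name. */
data work.tran_may_20240501;
    set work.work_tran_table;
run;

data work.tran_may_20240502;
    set work.work_tran_table;
run;

/* The trailing colon is a name-prefix list: it reads every data set
   in the library whose name starts with TRAN_MAY_202405. */
data work.tran_month_all;
    length source_table $41;
    set work.tran_may_202405: indsname=src;
    source_table = src;   /* record which daily table each row came from */
run;
```

The output table is deliberately given a name that does not start with the same prefix, so a later run of the prefix list does not pick it up. Reloading a single day is then just a matter of re-creating that one table.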

 

I consider working with generation datasets especially cumbersome when there is a need for a reload, which is absolutely no problem with table names that have a date component in them.