BookmarkSubscribeRSS Feed
TurnTheBacon
Fluorite | Level 6

Hello,

I'm fairly new to using SAS software, and would appreciate some general directions on a very basic problem.

Imagine that you have a cube with a star schema that stores data on customer relations, and that you're asked to provide end-users with history/snapshots of this data that they can analyze by months and by quarters. The program must provide snapshots of the past 10 months or so, and capture a new snapshot by the end of each future month. All snapshots must be from the 30th/31st, whichever is the last day of the month.

How would you typically approach such a problem? And would you make two programs: one for capturing historical data 10 months back, and one for automatically capturing future data as it becomes available? I guess those would be two separate data flows whose tables would be parsed, in the same ETL job.

I'd especially appreciate advice on what kind of code I might use for the WHERE statement of the first Extract transformation in such a situation, in order to get the correct monthly snapshots from the source table.

Definitely need some more experience in order to wrap my head around this. In any case, thanks for your attention.

4 REPLIES 4
Astounding
PROC Star

One of the keys will be understanding how the INTNX function works.  It returns the first day of a time period.  If we were dealing with SAS data sets, and variables stored as SAS dates, this would be a possibility:

data want;

   set have;

   where (intnx('MONTH', "&sysdate9"d, -12) <= snapshot_date < intnx('MONTH', "&sysdate9"d, 0));

run;

In this example, &SYSDATE9 is an automatic macro variable (text string) representing the date the program began, and "&SYSDATE9"d translates that value into SAS's date scale.  The INTNX function identifies the first day of the month from 12 months ago, and the second INTNX function identifies the first day of the current month.

This will have to be adapted to tables (which typically contain values that are not on the SAS date scale (either date times or text strings as snapshot dates).  And rather than using the current date you may want to let the user enter a date.  But either way, this concept would be a good place to start.

Good luck.

TurnTheBacon
Fluorite | Level 6

A very helpful answer. INTNX definitely seems like the key to solving the problem. Thank you. Smiley Happy

TurnTheBacon
Fluorite | Level 6

I've set up a job that successfully creates a snapshot from the end of a single month, using the INTNX function. The month is determined by means of a parameter in the job properties. If the parameter is set to -1, the job creates a snapshot of the previous month (August); if set to -2 it creates a snapshot of July, etc.

I have a problem deciding what to do next, though. I figure I should create an iterative job that places my job inside a loop and runs it until there's no more historical data available (February 2009), and that I should end up with a data set where each monthly snapshot has been appended together. Does that sound correct?

TomKari: Thanks, that makes sense, though if I understand correctly someone else will handle things like hierarchies and MDX after I've finished this part. I'm working exclusively in DI Studio now.

TomKari
Onyx | Level 15

Typically, cube / OLAP environments are used for exploratory analysis, in other words the person requiring information uses a tool such as the OLAP viewer in Enterprise Guide or Excel to dynamically drill around the cube looking for information. The language used to do this is MDX.

If I understand your request correctly, you would like to extract data on a regular basis, using criteria such as "extract monthly summaries for dates between x and y". I would suggest the following approach:

1. Using Excel or Enterprise Guide, explore your cube to get exactly the information on the screen that you would like in your extract.

2. Capture the MDX code that retrieves that information. In EG, you "create a slice". If you examine the properties of the slice, you'll see the MDX code.

3. Create a new library for your cube, and send your MDX code to it as a "pass-through" query. That should return your desired data in the form of a SAS dataset, which can then be analyzed, reported on, distributed, etc.

4. Once all of this works, change the variable parts of your query (such as start and end dates) into parameters / macro variables, that can then be created as part of your scheduled ETL process.

Tom

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 2229 views
  • 8 likes
  • 3 in conversation