SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

"View" Index after Union Join?

Reply
Occasional Contributor
Posts: 17

"View" Index after Union Join?

Hello,

Would anybody know if possible to keep the indexes on a "view" resulted from a union join of 2 tables which have 3 indexes?

I have to create the view immediatley after the join without creating any intermediary tables (the table would be aprox 9Gb ). The two tables I'm union are smaller parts of the same table which have been split to avoid data problems.

Do you think that using append instead of union would keep the indexes?

I'm using SAS DI 3.4.

Thank you,

Madalin 

Respected Advisor
Posts: 4,919

Re: "View" Index after Union Join?

I don't think that you can create an index on a view. Proc Append would keep the index, see

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000235213.htm

PG

PG
Super User
Posts: 5,424

Re: "View" Index after Union Join?

Yes, the index will be maintained in the target table (that is the table that you append data to). So it would be most efficient to append the smaller table to the larger one.

I'm not sure what you wish to accomplish. Please describe a little a bit more about the original table, and the data problems you have (and why you can union/append them anyway.....?).

And what about updates - how often do you need to recreate those tables? Are there any other performance implications?

Data never sleeps
Occasional Contributor
Posts: 17

Re: "View" Index after Union Join?

More info:

We have a large daily dataset that is updated every day which store about 5 years data (the file is aprox 7Gb while the index is about 1Gb). Based on this table we build two views: one weekly and one monthly. Because recently the dataset has start making problems and restoring the dataset from backups takes about 6 hours we have decided to split the dataset in two tables: a rolling 12 month file and a history file. In the current process we use 3 indexes. Because the views use all the data we need to union or append the two tables created and build the views. In the current process the views get the indexes from the parent dataset but the problem I have now is that after union join the 2 tables the views are not indexed.

Is there a way to "transfer"  the indexes to the views after the union join?

Super User
Posts: 5,424

Re: "View" Index after Union Join?

I presume we are talking about Base SAS tables, right?

I don't think I can/want to interfere with your current strategy on handling your history/R12 table - even it would be tempting to propose to fix your update/maintenance problem instead... Smiley Wink

As stated before, Base SAS cannot maintain indexes for views.

In some case though, index in an underlying table can be used anyway. I think the term is that where clause is getting merged between the qury and the view. I think this could be tested quite easily, just see what response time you will get asking the view something that will result in an indexed based where-clause if queried directly to the underlying table.

Fun fact: if your data should reside in SAS SPD Server, or an external RDBMS, such as Oracle, you could probably use somthing called Materialized Views.

Data never sleeps
Occasional Contributor
Posts: 17

Re: "View" Index after Union Join?

In fact I'm using SAS Data Integration Studio v 3.4.

Would this make any difference?

Super User
Posts: 5,424

Re: "View" Index after Union Join?

No, DIS is just a GUI for defining your tables, views and jobs.

And I don't think there are any major differences between Base SAS versions (9.1.3 - 9.3) in this area (index handling).

Data never sleeps
Respected Advisor
Posts: 4,173

Re: "View" Index after Union Join?

Using DIS3.4 means you're having SAS 9.1.3

It's not the DIS version but what foundation SAS 9.1.3 is able to provide. For your problem I don't believe anything substantial changed in later versions.

An index is something "physical" and you can't create an index for a view in SAS - it's tied to the physical table.

So it's all about organising your physical tables in a way that they support optimally your performance requirements. I assume this is about querying tables.

One way to go: You could store your tables as SPDE data sets. This should give you better performance. Ideally you're using more than one physical location (folder) on more than one disk to store the SPDE "fragments". This would give you increased I/O performance.

There are some restrictions on how you can load/modify SPDE datasets - but using DI Studio the code generated will be SQL so everything will be fine.

I believe using SPDE could resolve your performance issues. You still can create views over these SPDE dataset - but as it is only a single SPDE dataset the views could actually utilise the indexes when executed.

Ask a Question
Discussion stats
  • 7 replies
  • 393 views
  • 0 likes
  • 4 in conversation