Indexing – BI Insight

Oracle Indexing

Indexing is one of the most frequent approaches when resolving query performance issues raised within a database (though not necessary the right approach, but we can pick this up later).

However, in order to better use indexing strategies, we should go trough the process on understanding index types and their functionality.

First of all, please keep in mind that an index is logically and physically independent of the data they represent. This implies that modifying the index will not affect the data consistency within the table the index is associated with.

It can be created on one or more columns of a table to enable queries to retrieve a small set of randomly distributed rows while reducing the cost of that operation, by reducing the IO associated with the alternative full table scan.

The general considerations for creating an index would be

Unique indexes for candidate unique/pk columns, to enable naming the index when creating the associate constraint on the table;
a referential constraint column
columns used in frequent queries with high selectivity (columns on which the filters applied would enable the return of a small percentage of the rows in the table).

! Note: Primary and unique keys automatically have indexes, but you might want to create an index on a foreign key.

Indexes are automatically maintained by the database with no additional action required by the user. This however, does not imply that an index comes without costs. Always, indexes will improve query performance, but decrease performance on data manipulations. That is due to the fact that any insert/update/delete will have to maintain both objects: the table the DML is submitted on as well as the index update.

When testing an indexing strategy, a developer can take advantage of the following properties of the indexes:

usability
visibility

Usability: Indexes are by default usable. An unusable index will both not be maintained by the DML operations, nor will it be used by the optimizer. This property can help improve performance on bulk loads. Instead of dropping and recreating the index, we can easily make it unusable and then rebuild.

! Note: Unusable indexes and index partitions do not consume space. When you make a usable index unusable, the database drops its index segment.

Syntax:

to set an index to unused:

alter index test_idx unusable;

unusable

to rebuild index:

alter index test_idx rebuild;

valid after rebuild

Visibility: Indexes are by default visible. An invisible index will still be maintained by DML operations but will not be used by the optimizer. Invisible indexes are especially useful for testing the removal of an index before dropping it or using indexes temporarily without affecting the overall application.

alter index test_idx invisible;

invisible

alter index test_idx visible;

to restore the index.

Index types (based on column number):

single key index
composite index

Index types (based on data content):

unique
nonunique

Nonunique indexes permit duplicates values in the indexed column or columns. For a nonunique index, the rowid is included in the key in sorted order, so nonunique indexes are sorted by the index key and rowid (ascending).

!Note: Oracle Database does not index table rows in which all key columns are null, except for bitmap indexes or when the cluster key column value is null.

Index types (based on structure of the index):

B-tree (balanced tree index) (standard type)
- Index Organized Table (IOT)
- Reverse key index
- Descending index
- B-tree cluster index
bitmap and bitmap join index
function based index

B-tree:

excellent for PK and highly-selective indexing
data retrieved sorted by the indexed columns

By associating a key with a row or range of rows, B-trees provide excellent retrieval performance for a wide range of queries, including exact match and range searches.

IOT – this table differs from classical (heap-organized) table by the fact the data is in the index itself.

For more details on IOTs, please see related article here.

B-tree cluster indexes – is used to index a table cluster key. Instead of pointing to a row, the key points to the block that contains rows related to the cluster key.

In a bitmap index, an index entry uses a bitmap to point to multiple rows. In contrast, a B-tree index entry points to a single row. A bitmap join index is a bitmap index for the join of two or more tables.

Oracle Index Organized Table(IOT)

What is an IOT?

An IOT implies having your entire table indexed, within the same B-Tree structure. Should be used for larger scale tables or dimensions, highly used within the database.

How to?

Accessing the data is done via the primary key, much faster as the key and the data reside in the same structure. Also, lack of duplicaiton on key columns imply the total storage requirements are reduced.

To create an index organized table you must:

Specify the primary key using a column or table constraint.
Use the ORGANIZATION INDEX.

In addition you can use INCLUDING to define which non-key columns are stored with the key columns in the head piece, should overflow be necessary.

Example:

CREATE TABLE test_iot
 (id NUMBER(10)
, description VARCHAR2(50) NOT NULL
, comments varchar2(4000) 
, CONSTRAINT pk_test_tb PRIMARY KEY (id) ) 
ORGANIZATION INDEX 
TABLESPACE iot_tablespace 
INCLUDING description 
OVERFLOW TABLESPACE overflow_tablespace;

What to keep in mind when using IOT

Having a B-Tree index behaviour, the IOTs can become fragmented, or there might be other reasons you will need to rebuild the index. With no overflow it can be rebuild offline or online, while when it has overflow it can only be rebuit offline.

ALTER TABLE test_iot MOVE INITRANS 10;
--online rebuild
ALTER TABLE test_iot MOVE ONLINE INITRANS 10;
--offline with OVERFLOW
ALTER TABLE test_iot MOVE TABLESPACE iot_tablespace OVERFLOW TABLESPACE overflow_tablespace;

Oracle Function Based Index

What is a Function-Based Index?

A function-based index is an index created on the result of a function or expression.

This type of index is more used for situations with atypical searches on the specified columns (e.g. search on an user email address ignoring sensitivity can be optimised using a function based index; rounded up amounts.)

Frequent use cases?

This type of index is not behaving like our documented bitmap index, therefore the use cases will actually differ a lot. It can easily be used in cases of high cardinality distinct values, where the searches follow a particular generic pattern, but the data input can vary to a high degree.

Example

Consider the following table as a subscription table for email notifications on your particular site.

create table test_tb
( row_id number
, email_address varchar2(100)
, status char(6)
, gender varchar2(20)
, age number
, education varchar2(100)
, user_preferences varchar2(500)
);

create index test_bidx on test_tb(email_address);
create index test_fbidx on test_tb(upper(email_address));
create index test_fbidx_g on test_tb(upper(gender));

Now, if you are a marketing company wanting to analyse the data from muliple sites to asses users preferences for particular companies, you will want to be able to match the data for same user, so you can see the correlation between various data gathered. This is wherethe index will come in handy.

select count(*) as no_entries, upper(gender)
from test_tb
group by upper(gender);

Also, if you want to do an analysis based on email this can be very useful as a query like the one bellow will not use the test_bidx index, but the test_fbidx one:

select * from test_tb where upper(email_address) = 'no-reply@domain.com';

Oracle Bitmap Index

Common usage of Bitmap Indexes is a data warehousing environment. This implies large amounts of data, high level of ad-hoc queries but a low level of concurrent DLM transactions.

Why to use?

reduced response time
reduced storage requirements compared to other indexing techniques
efficient maintenance during parallel DML and load

Bitmap indexes are typically only a fraction of the size of the indexed data in the table.

How it works?

An index provides pointers to the rows in a table that contain a given key value. A regular index stores a list of rowids for each key corresponding to the rows with that key value. In a bitmap index, a bitmap for each key value replaces a list of rowids.

Each bit in the bitmap corresponds to a possible rowid, and if the bit is set, it means that the row with the corresponding rowid contains the key value. A mapping function converts the bit position to an actual rowid, so that the bitmap index provides the same functionality as a regular index. Bitmap indexes store the bitmaps in a compressed way. If the number of distinct key values is small, bitmap indexes compress better and the space saving benefit compared to a B-tree index becomes even better.

Best use case?

From my opinion, bitmaps are most effective on queries with multiple where clauses. As the bitmaps from bitmap indexes can be combined quickly, it is usually best to use single-column bitmap indexes. This is why the DW environment is the “home” for this type of indexing.

Frequent use cases?

The advantage of bitmap indexes is higher on columns where degree of cardinality (number of distinct values for the indexed column versus the total number of rows in the table) is small. Columns like Status, gender are optimal examples.

However, the datawarehouse environments can also benefit from bitmaps on columns with higher cardinality. This is also mostly due to the combining of bitmaps for quick filtering.

This is due to the fact that AND and OR conditions in the where clause can we resolved faster by performing Boolean operations directly on the bitmaps before converting them to rowids for data retrieval.

Using this bitmap merge methodology, Oracle can provide sub-second response time when working against multiple low-cardinality columns.

Unlike most other indexing, bitmap indexes include rows that have the null values.

Bitmap and Partitioning

You can create bitmap indexes on partitioned tables but they must be local to the partitioned table—they cannot be global indexes.

Example

create table test_tb
( row_id number
, text varchar2(100)
, status char(6));

create bitmap index test_bmpidx on test_tb(status);

On partitioned tables:

create table 
 t2
(c1 char(3) not null
, c2 date not null
, c3 number
, c4 varchar2(100))
partition by range(c2)
interval (numtodsinterval (1,'day'))
 (
 partition empty values less than (to_Date ('03-OCT-2016', 'dd-mon-yyyy'))
 )
;

create bitmap index test_bmpidx on test_tb(status);

Restrictions

You cannot create a bitmap join index on a temporary table.

Buffer Busy Waits – Reverse Key Index – Demo

We discussed earlier about indexing, and specifically reverse key indexing.

I mentioned that is a solution for buffer busy waits on numerical consecutive inserted keys (like the sequence based generated ones) .

This post will test that solution.

For regular index test, I’ve created the following table, sequence and associated index:

create table test_tb
( row_id number
, text varchar2(100));

create sequence test_seq start with 1 increment by 1 cache 10000;

create index test_nidx on test_tb(row_id);

and a simple procedure that will insert the rows:

create or replace procedure insert_n (p_rows number default 10000) as 
begin
 
 for i in 1..p_rows loop
 insert into test_tb
 select test_seq.nextval, 'test row '||test_seq.currval from dual;
 
 end loop;
 commit;
end;
/

For the reverse key index test:

create table test_tb2
( row_id number
, text varchar2(100));

create sequence test_seq2 start with 1 increment by 1 cache 10000;

create index test_ridx on test_tb2(row_id);

and a simple procedure that will insert the rows:

create or replace procedure insert_r (p_rows number default 10000) as 
begin
 
 for i in 1..p_rows loop
 insert into test_tb2
 select test_seq2.nextval, 'test row '||test_seq2.currval from dual;
 
 end loop;
 commit;
end;
/

To test the effect of concurrent inserts, I’ve executed both procedures from 3 sessions (please note in my case number of sessions is small, but will still show us the impact of the two types of indexes):

execute insert_n(100000);

execute insert_n(100000);

Now looking at the buffer busy waits we had we see a reasonable benefit already of the reverse key index:

buffer busy waits - SQL

select object_name, value, statistic_name,object_type
from v$segment_statistics
where owner='C##INSIGHT'
and object_type='INDEX'
and statistic_name='buffer busy waits'
and object_name like 'TEST%'
;

Oracle Reverse Index

This is one of the least used Oracle indexes I’ve encountered in a data-warehousing environment. However, a very good instrument in performance problems on OLTPs.

To understand a reverse key index, we must first look at the regular b-tree index on a numerical key, and its storage in the database memory. The index will store, in ascending sort order, all it’s keys on disk. If we consider a key generated by a sequence, then any new value for that key will therefore be inserted in the last database block.

On a high frequency insert setup system in this type of case, you will encounter the buffer busy waits event one too many times.

To see the test on this events, please see post on Buffer Busy Waits – Reverse Key Index – Demo

Now, on considering a reverse key index, the key stored is actually the reversed value. For instance, for values 12, 13, 14 the stored keys will be 21, 31 and 41. This type of index will scatter its consecutive key inserts into multiple blocks, removing the hot-block problem regular keys have.

!!! Please note there is a downside to this approach, meaning the index will no longer allow range scans.

How to create a reverse key index:

create index INDEX_NAME on TABLE_NAME(INDEX_KEY) reverse;

It is also possible to change an existing index, making it reverse:

alter index INDEX_NAME rebuild reverse;

reversed btree

And also to revert a reverse key index into a regular index:

alter index INDEX_NAME rebuild noreverse;

regular btree

Oracle: Partitioning and Indexes

As I’ve been presenting in the last couple of posts on partitioning, one of the major benefits of this database option is basically the selectivity when filtering on the partition key, what we currently call partition pruning.

Now, what we’ve been experimenting with so far are simple based table. However, in the data warehousing world, things are not so simple.
One of the most common “performance fixes” that we see in DW is indexing. However, indexing is not always good, and combined with partitioning, it might prove to be a very unfortunate combination, if not done properly.

Partitioning world gives us two options in indexing:

Local indexing
Global indexing

Basically, a local index will behave similar to a partition table, being split into partitions and best performing within partition pruning cases.
The global index, on the other side, will behave like a regular index, with one particularity: when used, it will disable the partition pruning.

Basically, from my tests so far, global indexes and partition pruning don’t mix.

How and Which

The following section will provide you a simple tested guideline on how to mix these two database performance hacks to reap the highest value out of your implementation.

Unique indexes should be global – when we use unique indexes, the selectivity provided by them should be much higher than the one from partition selectivity

Non-unique indexes – local indexes – for more skewed data across partitions. This is perfect in combination with partition pruning.

Scenario 1: Global non-unique index

Use the table T1 created as part of post Oracle: Partition by List – Example

Create index on the date column:

SQL:

create index t1_dt on t1(c2);

Now looking at regular SQL for selecting a partition and then filtering on the date column:

select * from t1
where 1=1
and c1 in ('ACD')
and c2 between trunc(sysdate) and trunc(sysdate+1);

global-non-unique-index

Create local index on the date column (drop the prior created index):

SQL:

drop index t1_dt;
create index t1_dt on t1(c2) local;

Now looking at same SQL for selecting a partition and then filtering on the date column:

select * from t1
where 1=1
and c1 in ('ACD')
and c2 between trunc(sysdate) and trunc(sysdate+1);

local-non-unique-index

	idatcu on Oracle: Partition by List Sub-…
	Nazrul on Oracle: Partition by List Sub-…
	Oracle Indexing \| BI… on Oracle Function Based Ind…
	Oracle Indexing \| BI… on Oracle Index Organized Ta…
	Sam on Undo drills and view prom…