Tuesday, April 13, 2010

DBMS_STATS versus ANALYZE

The ANALYZE command and the DBMS_STATS package offer two different ways to collect statistics about the objects in your database. Oracle’s query optimizer uses these statistics when determining the most efficient way to perform a query


important differences, advantages with DBMS_STATS over ANALYZE are
  1. The DBMS_STATS package was introduced in Oracle 8i, and for the last few years Oracle Corporation has been strongly advising customers to use DBMS_STATS instead of ANALYZE so dbms_stats is the stated, preferred method of collecting statisttics
  2. DBMS_STATS has parallel statistics collection
  3. We can import/export/set statistics directly with dbms_stats
  4. Dbms_stats can analyze external tables, analyze cannot.
  5. The DBMS_STATS package can gather global statistics at multiple levels as specified by the granularity parameter.(partition-level and subpartition-level). In contrast, the legacy ANALYZE command collects statistics only at the lowest level and derives higher level statistics by aggregation. These aggregated statistics are sometimes less accurate than the global statistics since it is not possible to precisely determine the overlap of values across partitions. Statistics for the number of distinct values and density are particularly susceptible to inaccuracy.

    The optimizer uses global statistics as the basis of access path generation unless query predicates narrow the query to a single partition. Since most queries are unlikely to be this restrictive, it is important to have accurate global statistics. Gathering global statistics with the DBMS_STATS package is highly recommended.
  6. dbms_stats (in 9i) can gather system stats 
  7. DBMS_STATS gathering statistics only when existing statistics are stale
  8. it is easier to automate with dbms_stats (it is procedural, analyze is just a command)
  9. DBMS_STATS gathers statistics only for cost-based optimization; it does not gather other statistics. For example, the table statistics gathered by DBMS_STATS include the number of rows, number of blocks currently containing data, and average row length but not the number of chained rows, average free space, or number of unused data blocks.
  10. Most importantly, in the future, ANALYZE will not collect statistics needed by the cost-based optimizer.  

No comments: