CS848 Presentation of
:
►The general definition of the problem:
»Given: a workload and space constraint (B)
»Output: the set of phy. str. (a configuration) fits in B
►The paper criticizes previous work, e.g. papers #1,3,4.
»Too many assumptions and heuristics
»Loose interaction with the optimizer in candidate selection
»Independence between enumeration and merging
►Assumptions (made in related work):
»What-if API (hypothetical physical structures)
»Dependence of optimizer
»Search framework
►An optimizer, after getting a logical query
»Issues requests for indexes and materialized views (via access path generation module)
-All sargable predicates - Additional columns
-Sort columns - SPJG sub-queries
►Each time optimizer issues a request
»Suspend the optimization (only in tuning mode)
»Analyze the request
»Simulate the physical structure
q No guessing
q Don’t miss candidates that might not be obvious at the final plan
q Don’t propose not exploitable candidates
How to get the best configuration-2
►Union of all optimal structures is the best configuration
► Number of requests made by optimizer is not overwhelming
Advantages of knowing the best configuration
► A configuration is transformed into a smaller but less efficient new configurations
»Re-optimization of a relaxed configuration is more efficient
► More useful information to DBA:
Transformations
► C -> C’, a subset of structures in C are replaced
► Indexes
»Merging: I’=(K1,(S1 U K2 U S2)-K1)
»Splitting: re-arrange overlapping columns of wider indexes
Ic=( , ), Ir1=(K1-KC,I1-Ic), Ir2=(K2-KC,I2-Ic)
»Prefixing: I1’=(K1’, ).
»Promotion to clustered
»Removal
► Views
»V1,V2 -> Vm
»Removal
► For each valid tr,
»C -> Ctr
»Decrease in space:
»Increase in cost:
“ ”
►Relax C into Cnew using the transformation tr which minimizes the penaltytr.
► The configuration with minimum cost
»Impractical, too long to converge
► Instead,
»Use greedy approach till we satisfy the space constraint
“remember time limitation”
»At the end, we have a chain of relaxed configurations
-Pick the configuration resulted in the largest penalty when relaxed
“Correct the transformation”
►How to find cost of a workload and space of a config.
»Materialize to find space consumed by a configuration
»Re-optimize (simulate) the workload under configuration
“It’s too expensive to call the optimizer”►
Estimating Cost and Space of a configuration
Approximate values for space and cost
►Space
»Index: use width of keys and size of table, estimate B-tree size
»
Views: Index + estimate from “cardinality optimizer” info
query
►Previous discussion ignores update queries (UPDATE, DELETE,INSERT queries)
»Update queries affect indexes and views
►Adding more index and views would reduce execution cost is not a true assumption
»Index-view update overhead
»Each update query implicitly references all indexes over its referenced table (not just 1 index)
»Cascading effect ?
►Now queries which don’t use the indexes of an update query are affected indirectly.
►An update query is divided into two,
»A pure SELECT query
»A small UPDATE shell
►Remember the goal is to estimate the cost!!!
►For each index-I and update query-q
»Find the cost of updating I for q
»While estimating a configuration
-Use just SELECT portion (treat as a normal query)
-Then estimate the cost of indexes referenced in the query
0 comments:
Post a Comment