Given that indexing is so important as your data set increases in size, can someone explain how indexing works at a database-agnostic level? For information on queries to index a field, check out "How do I index a database column".


Binary search can be done when the data is unique, am I right? Although you mentioned that minimum cardinality is important, the algorithm wouldn't be a simple binary search. How would this approximation (~log2 n) affect the processing time?

June 25, 2019, 40m 10s

AbhishekShivkumar: Great question! I think the index table will have as many rows as there are in the data table. Since this field has only 2 values (a boolean, true/false), say you want a record with the value true: you can only halve the result set in the first pass. In the second pass all the remaining records have the value true, so there is no basis to differentiate them, and you have to search the data table linearly. Hence he said cardinality should be considered when deciding which column to index. In this case, it's worthless to index such a column. Hope I'm correct :)

June 25, 2019, 40m 10s
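The cardinality point in the comment above can be made concrete with a small sketch. It is only an illustration on made-up data: `flags`, `ids`, `rows_matching`, and `selectivity` are hypothetical names, not anything from the original answer or a real database API.

```python
def rows_matching(column, value):
    """Count the rows an equality lookup on `value` would still have to fetch."""
    return sum(1 for v in column if v == value)

def selectivity(column):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    return len(set(column)) / len(column)

# Assumed sample data: a boolean column and a unique id column, 1000 rows each.
flags = [i % 2 == 0 for i in range(1000)]
ids = list(range(1000))

low_card_hits = rows_matching(flags, True)   # half the table still matches
high_card_hits = rows_matching(ids, 42)      # exactly one row matches
```

With the boolean column, an index narrows the search to half the table at best, so most of the work remains. With the unique column, the lookup ends at a single row.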

Shouldn't the number of block accesses in the average case be (N+1)/2? If we sum the number of block accesses for all possible cases and divide by the number of cases, we have N*(N+1)/(2*N), which comes out to (N+1)/2.

June 26, 2019, 40m 10s
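The arithmetic in the comment above can be checked with a short sketch. It assumes the searched key is unique, is present, and is equally likely to sit in any of the N blocks; `average_linear_accesses` is a hypothetical helper name.

```python
from fractions import Fraction

def average_linear_accesses(n_blocks):
    """Average blocks read by a linear scan when the target is equally
    likely to be in block 1, 2, ..., n_blocks."""
    total = sum(range(1, n_blocks + 1))  # 1 + 2 + ... + N = N(N+1)/2
    return Fraction(total, n_blocks)     # divide by the N possible cases

avg_small = average_linear_accesses(9)
avg_large = average_linear_accesses(1_000_000)
```

For large N the difference between (N+1)/2 and N/2 is half a block access, so either form gives the same picture in practice.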

I think there are a few typos in this answer. For example, in the sentence "a far cry from the 277,778 block accesses required by the non-indexed table", doesn't the author mean 1,000,000 block accesses? 277,778 is the number of blocks required by the index itself. There seem to be a couple of other inaccuracies too :(

June 25, 2019, 40m 10s

jcm: He explained it in the "What is indexing" section: "Indexing is a way of sorting a number of records on multiple fields. Creating an index on a field in a table creates another data structure which holds the field value, and pointer to the record it relates to. This index structure is then sorted, allowing Binary Searches to be performed on it."

June 25, 2019, 40m 10s
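The structure quoted above (field value plus a pointer to the record, kept sorted and binary-searched) can be sketched in a few lines. This is a toy in-memory model, not how any particular database implements indexes (real engines typically use B-trees on disk). `build_index`, `lookup`, and the sample `records` are hypothetical.

```python
import bisect

def build_index(records, field):
    """Build a sorted list of (field value, record position) pairs.
    The position plays the role of the pointer to the record."""
    return sorted((rec[field], pos) for pos, rec in enumerate(records))

def lookup(index, records, value):
    """Binary-search the index for `value`, then follow the pointers."""
    # (value,) sorts before every (value, pos), so this finds the first match.
    i = bisect.bisect_left(index, (value,))
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(records[index[i][1]])
        i += 1
    return matches

# Assumed sample table, stored in insertion order (unsorted by last_name).
records = [
    {"id": 1, "last_name": "Smith"},
    {"id": 2, "last_name": "Adams"},
    {"id": 3, "last_name": "Jones"},
    {"id": 4, "last_name": "Adams"},
]
index = build_index(records, "last_name")
adams_ids = [rec["id"] for rec in lookup(index, records, "Adams")]
```

Since log2(1,000,000) is about 19.93, a binary search over a million sorted entries needs roughly 20 probes, which is the ~log2 n figure raised earlier in the thread.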

I think these indexing issues can be resolved by maintaining two different databases, as master and slave. The master can be used to insert or update records, without indexing, and the slave can be used for reads, with proper indexing. Right?

June 25, 2019, 40m 10s

No, wrong, sorry. Not just the content of the tables must be updated, but also the index structure and content (B-tree, nodes). Your concept of master and slave makes no sense here. What can be feasible, though, is replicating or mirroring to a second database on which analytics take place, to take that workload away from the first database. That second database would hold copies of the data and indexes on that data.

June 25, 2019, 40m 10s

Ya...! Try to read my comment and understand it properly. I said the same thing: I referred to master and slave (whatever) as "replicating or mirroring to a second database on which analytics take place to take that workload away from the first database. that second database would hold copies of data and indexes on that data".

June 25, 2019, 40m 10s

The second database, the slave to which mirroring or replicating is done, would experience all the data manipulation that the first one does. With each DML operation, the indexes on that second database would experience "these indexing issues". I don't see the gain in that; wherever the indexes are needed and built for quick analysis, they need to be kept up to date.

June 25, 2019, 40m 10s

+1 times a million for this answer, as I found this listing while trying to find a simple explanation of what indexing essentially is.

June 26, 2019, 40m 10s

Really nice analogy! Funny, I didn't make the connection between a book index and a DB index.

June 25, 2019, 40m 10s

This makes me think of a library or a grocery store. Could you imagine not having an index at a grocery store? Where's the beef?!? Oh, it's next to the restrooms, a mop, and the makeup.

June 26, 2019, 40m 10s

"But with an index page at the beginning, you are there." What does "you are there" mean?

June 26, 2019, 40m 10s

"a database index does not store the values in the other columns" -- not true.

June 26, 2019, 40m 10s

mustaccio: An index stores a reference to the row along with the indexed columns only (as far as I know). I might be wrong. Do you have any reference which says an index stores other columns' values?

June 26, 2019, 40m 10s

To downvoters: can you explain what's wrong so that I can improve?

June 25, 2019, 40m 10s

Check, for example, SQL Server clustered indexes or DB2's CREATE INDEX ... INCLUDE clause. You have too many generalizations in your answer, in my view.

June 25, 2019, 40m 10s

mustaccio: So by default CREATE INDEX does not include the other columns, and why should it? If we stored all the other columns in the index, it would be just like creating another copy of the entire table, which would take up far too much space and be very inefficient. This is the more generalized version of indexes; CREATE INDEX ... INCLUDE is a newer version that considers other columns. The post I wrote considers the more generalized version. How indexes work would fill a book if we considered all the databases, wouldn't it? Do you think the answer deserves a downvote?

June 26, 2019, 40m 10s
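The disagreement above, about whether an index can hold values from columns beyond the search key, can be explored with a small experiment. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in, since it needs no server. SQLite has no INCLUDE clause, so a composite index is used here instead, which is a different mechanism from the SQL Server and DB2 features mentioned in the thread. The `people` table, the `idx_name` index, and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT, city TEXT)"
)
# Composite index: first_name is stored in the index next to last_name.
conn.execute("CREATE INDEX idx_name ON people (last_name, first_name)")

def plan(sql):
    """Return SQLite's query plan description for `sql` as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)  # 4th column holds the detail text

# first_name is available from the index itself, so no table lookup is needed.
covered = plan("SELECT first_name FROM people WHERE last_name = 'Smith'")
# city is not in the index, so each match needs a trip to the table row.
not_covered = plan("SELECT city FROM people WHERE last_name = 'Smith'")
```

In this setup the first plan reports a covering index and the second does not, which illustrates both sides of the exchange: an index holds only the columns it was told to hold, but those can include more than the column being searched.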

An index doesn't imply sorting order on the column

June 25, 2019, 40m 10s

Thanks. This helped my understanding. So basically an index is a replica of the column data that has been sorted. Normally the column data is just in the order the data was inserted.

June 26, 2019, 40m 10s

This is a comment, not an answer.

June 26, 2019, 40m 10s

It's more visible and thus more helpful this way, as it is a general remark. Which answer should this have been added to as a comment?

June 25, 2019, 40m 10s