본문 바로가기
푸닥거리

M201 MongoDB Performance

by [김경민]™ ┌(  ̄∇ ̄)┘™ 2022. 6. 11.
728x90

 

 

Chapter 1: Introduction

Introduction to MongoDB Performance

 

Which of the following statements is/are true?

-> Indexes will be a major topic covered in this course.

 

 

Chapter 1: Introduction

Hardware Considerations & Configurations Part 1

 

Chapter 1: Introduction

Hardware Considerations & Configurations Part 2

 

Regarding the performance implications, which of the following statements are correct?

- Disk RAID architecture can impact the performance of your MongoDB deployment.

- CPU availability impacts the performance of MongoDB, especially if using the WiredTiger storage engine.

 

 

Chapter 1: Introduction

Lab 1.1: Install Course Tools and Datasets

 

mongodb+srv://<username>:<password>@cluster0-jxeqq.mongodb.net/test

 

 

https://www.mongodb.com/docs/database-tools/?_ga=2.176235042.688064568.1654342381-173819729.1653806116 

 

The MongoDB Database Tools Documentation — MongoDB Database Tools

Docs Home → MongoDB Database ToolsThe MongoDB Database Tools are a collection of command-line utilities for working with a MongoDB deployment. The Database Tools include the following binaries:Starting with MongoDB 4.4, the MongoDB Database Tools are now

www.mongodb.com

 

 

.\mongoimport.exe --drop -c people --uri mongodb+srv://formin:password@cluster0.tpsxlft.mongodb.net/test people.json

 

 

 

 

 

mongodb+srv://<username>:<password>@cluster0-jxeqq.mongodb.net/test

 

$ # Example for my Atlas cluster, you will need to change the

$ # user, password and Atlas cluster name

 

$ mongoimport --drop -c people --uri mongodb+srv://theusername:thepassword@m201-nhtgi.mongodb.net/m201 people.json

 

$ mongoimport --drop -c restaurants --uri mongodb+srv://theusername:thepassword@m201-nhtgi.mongodb.net/m201 restaurants.json

 

mongodb://localhost:27017

 

$ # Example for my local cluster running on port 27017

 

$ mongoimport --drop -c people --uri mongodb://localhost:27017/m201 people.json

 

$ mongoimport --drop -c restaurants --uri mongodb://localhost:27017/m201 restaurants.json

 

> use m201

> db.people.count({ "email" : {"$exists": 1} })

 

 

 

 

 

Chapter 2: MongoDB Indexes

Introduction to Indexes

 

https://www.mongodb.com/docs/manual/indexes/?jmp=university 

 

Indexes — MongoDB Manual

Docs Home → MongoDB ManualIndexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an ap

www.mongodb.com

 

 

 

Which of the following statements regarding indexes are true?

- Indexes are used to increase the speed of our queries.

- The _id field is automatically indexed on all collections.

- Indexes reduce the number of documents MongoDB needs to examine to satisfy a query.

- Indexes can decrease write, update, and delete performance.

 

 

 

 

Chapter 2: MongoDB Indexes

How Data is Stored on Disk

 

 

 

 

https://www.mongodb.com/docs/manual/faq/storage/?jmp=university 

 

FAQ: MongoDB Storage — MongoDB Manual

Docs Home → MongoDB ManualThis document addresses common questions regarding MongoDB's storage system.A storage engine is the part of a database that is responsible for managing how data is stored, both in memory and on disk. Many databases support multi

www.mongodb.com

 

 

Chapter 2: MongoDB Indexes

Single Field Indexes Part 2

 

https://www.mongodb.com/docs/manual/core/index-single/?jmp=university 

 

Single Field Indexes — MongoDB Manual

Docs Home → MongoDB ManualMongoDB provides complete support for indexes on any field in a collection of documents. By default, all collections have an index on the _id field, and applications and users may add additional indexes to support important quer

www.mongodb.com

 

 

 

 

 

Which of the following queries can use an index on the zip field?

- db.addresses.find( { zip : 55555 } )

 

 

Chapter 2: MongoDB Indexes

Understanding Explain

 

If you observe the following lines in the output of explain('executionStats'), what can you deduce?

 

"executionStats" : {

"executionSuccess" : true,

"nReturned" : 23217,

"executionTimeMillis" : 91,

"totalKeysExamined" : 23217,

"totalDocsExamined" : 23217,

"executionStages" : {

"stage" : "SORT",

"nReturned" : 23217,

"executionTimeMillisEstimate" : 26,

"works" : 46437,

"advanced" : 23217,

"needTime" : 23219,

"needYield" : 0,

"saveState" : 363,

"restoreState" : 363,

"isEOF" : 1,

"sortPattern" : {

"stars" : 1

},

"memUsage" : 32522511,

"memLimit" : 33554432,

 

- The index selected for the query was not useful for the sort part of the query

- An in-memory sort was performed

- The system came very close to throwing an exception instead of returning results

 

 

 

Chapter 2: MongoDB Indexes

Understanding Explain for Sharded Clusters

 

 

 

With the output of an explain command, what can you deduce?

- The index used by the chosen plan

- If a sort was performed by walking the index or done in memory

- All the different stages the query needs to go through with details about the time it takes, the number of documents processed and returned to the next stage in the pipeline

 

 

 

Chapter 2: MongoDB Indexes

Sorting with Indexes

 

 

https://www.mongodb.com/docs/manual/tutorial/sort-results-with-indexes/?jmp=university 

 

Use Indexes to Sort Query Results — MongoDB Manual

Docs Home → MongoDB ManualSince indexes contain ordered records, MongoDB can obtain the results of a sort from an index that includes the sort fields. MongoDB may use multiple indexes to support a sort operation if the sort uses the same indexes as the q

www.mongodb.com

 

 

 

 

Given the following schema for the products collection:

 

{

"_id": ObjectId,

"product_name": String,

"product_id": String

}

 

And the following index on the products collection:

{ product_id: 1 }

 

Which of the following queries will use the given index to perform the sorting of the returned documents?

 

db.products.find({ product_id: '57d7a1' }).sort({ product_id: -1 })

db.products.find({ product_name: 'Soap' }).sort({ product_id: 1 })

db.products.find({}).sort({ product_id: 1 })

db.products.find({}).sort({ product_id: -1 })

 

 

 

Chapter 2: MongoDB Indexes

Querying on Compound Indexes Part 1

 

Chapter 2: MongoDB Indexes

Querying on Compound Indexes Part 2

 

 

https://www.mongodb.com/docs/manual/core/index-compound/?jmp=university 

 

Compound Indexes — MongoDB Manual

Docs Home → MongoDB ManualMongoDB supports compound indexes, where a single index structure holds references to multiple fields within a collection's documents. The following diagram illustrates an example of a compound index on two fields:Compound index

www.mongodb.com

 

https://www.mongodb.com/docs/manual/tutorial/create-indexes-to-support-queries/?jmp=university 

 

Create Indexes to Support Your Queries — MongoDB Manual

Docs Home → MongoDB ManualAn index supports a query when the index contains all the fields scanned by the query. The query scans the index and not the collection. Creating indexes that support queries results in greatly increased query performance.This d

www.mongodb.com

 

 

 

Chapter 2: MongoDB Indexes

When you can sort with Indexes

 

 

 

https://www.mongodb.com/docs/manual/tutorial/sort-results-with-indexes/?jmp=university 

 

Use Indexes to Sort Query Results — MongoDB Manual

Docs Home → MongoDB ManualSince indexes contain ordered records, MongoDB can obtain the results of a sort from an index that includes the sort fields. MongoDB may use multiple indexes to support a sort operation if the sort uses the same indexes as the q

www.mongodb.com

 

 

 

 

When you can sort with Indexes

- Index prefixes can be used in query predicates to increase index utilization.

- Index prefixes can be used in sort predicates to prevent in-memory sorts.

- We can invert the keys of an index in our sort predicate to utilize an index by walking it backwards.

 

 

 

 

Chapter 2: MongoDB Indexes

Multikey Indexes

 

 

 

https://www.mongodb.com/docs/manual/core/index-multikey/?jmp=university 

 

Multikey Indexes — MongoDB Manual

Docs Home → MongoDB ManualTo index a field that holds an array value, MongoDB creates an index key for each element in the array. These multikey indexes support efficient queries against array fields. Multikey indexes can be constructed over arrays that

www.mongodb.com

 

 

Given the following index:

{ name: 1, emails: 1 }

When the following document is inserted, how many index entries will be created?

{

"name": "Beatrice McBride",

"age": 26,

"emails": [

"puovvid@wamaw.kp",

"todujufo@zoehed.mh",

"fakmir@cebfirvot.pm"

]

}

-> 3

 

 

Chapter 2: MongoDB Indexes

Partial Indexes

 

https://www.mongodb.com/docs/manual/core/index-partial/?jmp=university 

 

 

 

 

Which of the following is true regarding partial indexes?

- Partial indexes represent a superset of the functionality of sparse indexes.

- Partial indexes can be used to reduce the number of keys in an index.

- Partial indexes support compound indexes.

 

 

 

Chapter 2: MongoDB Indexes

Text Indexes

 

 

Which other type of index is mostly closely related to text indexes?

- Multi-key indexes

 

 

 

 

Chapter 2: MongoDB Indexes

Collations

 

https://www.mongodb.com/docs/manual/reference/collation/?jmp=university 

 

Collation — MongoDB Manual

Docs Home → MongoDB ManualCollation allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.You can specify collation for a collection or a view, an index, or specific operations that support c

www.mongodb.com

 

 

Which of the following statements are true regarding collations on indexes?

- Collations allow the creation of case insensitive indexes

- We can define specific collations in an index

 

 

 

 

 

Chapter 2: MongoDB Indexes

Wildcard Index Type: Part 1

 

 

 

 

Chapter 2: MongoDB Indexes

Wildcard Index Type: Part 2

 

 

 

Using the wildcardProjection flag with Wildcard Indexes, we can:

- exclude a set of fields from the Wildcard Index.

- include a set of fields in the Wildcard Index.

 

 

 

 

 

Chapter 2: MongoDB Indexes

Wildcard Index Use Cases

 

 

 

 

Which of the following are good reasons to use a Wildcard Index?

- The query pattern on documents of a collection is unpredictable.

- An application consistently queries against document fields that use the Attribute Pattern.

 

 

 

 

Chapter 2: MongoDB Indexes

Lab 2.1: Using Indexes to Sort

 

In this lab you're going to determine which queries are able to successfully use a given index for both filtering and sorting.

Given the following index:

 

{ "first_name": 1, "address.state": -1, "address.city": -1, "ssn": 1 }

 

Which of the following queries are able to use it for both filtering and sorting?

- db.people.find({ "first_name": "Jessica" }).sort({ "address.state": 1, "address.city": 1 })

- db.people.find({ "first_name": "Jessica", "address.state": { $lt: "S"} }).sort({ "address.state": 1 })

- db.people.find({ "address.state": "South Dakota", "first_name": "Jessica" }).sort({ "address.city": -1 })

 

 

 

 

 

Chapter 2: MongoDB Indexes

Lab 2.2: Optimizing Compound Indexes

 

In this lab you're going to examine several example queries and determine which compound index will best service them.

 

> db.people.find({

"address.state": "Nebraska",

"last_name": /^G/,

"job": "Police officer"

})

 

> db.people.find({

"job": /^P/,

"first_name": /^C/,

"address.state": "Indiana"

}).sort({ "last_name": 1 })

 

> db.people.find({

"address.state": "Connecticut",

"birthday": {

"$gte": ISODate("2010-01-01T00:00:00.000Z"),

"$lt": ISODate("2011-01-01T00:00:00.000Z")

}

})

 

If you had to build one index on the people collection, which of the following indexes would best service all 3 queries?

-> { "address.state": 1, "last_name": 1, "job": 1 }

 

 

 

 

 

 

Chapter 3: Index Operations

Building Indexes

 

 

https://www.mongodb.com/docs/manual/core/index-creation/?jmp=university 

 

Index Builds on Populated Collections — MongoDB Manual

Docs Home → MongoDB ManualStarting in MongoDB 4.2, index builds use an optimized build process that holds an exclusive lock on the collection at the beginning and end of the index build. The rest of the build process yields to interleaving read and write

www.mongodb.com

 

 

Which of the following are true of index build operations?

- MongoDB now only has one index buid type available

 

 

 

 

Chapter 3: Index Operations

Query Plans

 

https://www.mongodb.com/docs/manual/core/query-plans/?jmp=university 

 

Query Plans — MongoDB Manual

Docs Home → MongoDB ManualFor a query, the MongoDB query optimizer chooses and caches the most efficient query plan given the available indexes. The evaluation of the most efficient query plan is based on the number of "work units" (works) performed by t

www.mongodb.com

 

 

Which of the following is/are true concerning query plans?

- Query plans are cached so that plans do not need to be generated and compared against each other every time a query is executed.

 

 

 

 

Chapter 3: Index Operations

Forcing Indexes with Hint

 

 

What is the method that forces MongoDB to use a particular index?

- hint()

 

 

 

 

 

Chapter 3: Index Operations

Resource Allocation for Indexes Part 1

 

Chapter 3: Index Operations

Resource Allocation for Indexes Part 2

 

 

 

Chapter 3: Index Operations

Resource Allocation for Indexes Part 3

 

 

 

https://en.wikipedia.org/wiki/B-tree

 

B-tree - Wikipedia

A self-balancing, tree-based data structure, that allows read/write access in logarithmic time In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletion

en.wikipedia.org

 

 

Which of the following statements apply to index resource allocation?

- For the fastest processing, we should ensure that our indexes fit entirely in RAM

- Indexes are not required to be entirely placed in RAM, however performance will be affected by constant disk access to retrieve index information.

 

 

 

 

 

Chapter 3: Index Operations

Basic Benchmarking Part 1

 

 

 

 

Chapter 3: Index Operations

Basic Benchmarking Part 2

 

What type of strategy and tools should we be using to performance benchmark a MongoDB installation?

- Publicly available tools, including correct database variations

 

 

 

In this lab you're going to determine which index was used to satisfy a query given its explain output.

 

> var exp = db.restaurants.explain("executionStats")

> exp.find({ "address.state": "NY", stars: { $gt: 3, $lt: 4 } }).sort({ name: 1 }).hint(REDACTED)

 

Which resulted in the following output:

 

{

"queryPlanner": {

"plannerVersion": 1,

"namespace": "m201.restaurants",

"indexFilterSet": false,

"parsedQuery": "REDACTED",

"winningPlan": {

"stage": "SORT",

"sortPattern": {

"name": 1

},

"inputStage": {

"stage": "SORT_KEY_GENERATOR",

"inputStage": {

"stage": "FETCH",

"inputStage": {

"stage": "IXSCAN",

"keyPattern": "REDACTED",

"indexName": "REDACTED",

"isMultiKey": false,

"isUnique": false,

"isSparse": false,

"isPartial": false,

"indexVersion": 1,

"direction": "forward",

"indexBounds": "REDACTED"

}

}

}

},

"rejectedPlans": [ ]

},

"executionStats": {

"executionSuccess": true,

"nReturned": 3335,

"executionTimeMillis": 20,

"totalKeysExamined": 3335,

"totalDocsExamined": 3335,

"executionStages": "REDACTED"

},

"serverInfo": "REDACTED",

"ok": 1

}

 

Given the redacted explain output above, select the index that was passed to hint.

- { "address.state": 1, "stars": 1, "name": 1 }

 

 

 

 

Chapter 4: CRUD Optimization

Optimizing your CRUD Operations

 

 

https://www.mongodb.com/docs/manual/tutorial/create-queries-that-ensure-selectivity/?jmp=university 

 

Create Queries that Ensure Selectivity — MongoDB Manual

Docs Home → MongoDB ManualSelectivity is the ability of a query to narrow results using the index. Effective indexes are more selective and allow MongoDB to use the index for a larger portion of the work associated with fulfilling the query.To ensure sel

www.mongodb.com

 

https://www.mongodb.com/docs/manual/tutorial/sort-results-with-indexes/?jmp=university 

 

Use Indexes to Sort Query Results — MongoDB Manual

Docs Home → MongoDB ManualSince indexes contain ordered records, MongoDB can obtain the results of a sort from an index that includes the sort fields. MongoDB may use multiple indexes to support a sort operation if the sort uses the same indexes as the q

www.mongodb.com

https://www.mongodb.com/docs/manual/tutorial/create-indexes-to-support-queries/?jmp=university 

 

Create Indexes to Support Your Queries — MongoDB Manual

Docs Home → MongoDB ManualAn index supports a query when the index contains all the fields scanned by the query. The query scans the index and not the collection. Creating indexes that support queries results in greatly increased query performance.This d

www.mongodb.com

 

 

 

 

When building indexes to service your queries, which of the following is the general rule of thumb you should keep when ordering your index keys?

Note, use the following definitions to for this question:

- equality: indexed fields on which our queries will do equality matching

- range: indexed fields on which our queries will have a range condition

- sort: indexed fields on which our queries will sort on

 

-> equality, sort, range

 

 

 

 

 

 

Chapter 4: CRUD Optimization

Covered Queries

 

https://www.mongodb.com/docs/manual/core/query-optimization/?jmp=university 

 

Query Optimization — MongoDB Manual

Docs Home → MongoDB ManualIndexes improve the efficiency of read operations by reducing the amount of data that query operations need to process. This simplifies the work associated with fulfilling queries within MongoDB.If your application queries a col

www.mongodb.com

 

 

 

 

 

Given the following indexes:

 

{ _id: 1 }

{ name: 1, dob: 1 }

{ hair: 1, name: 1 }

 

Which of the following queries could be covered by one of the given indexes?

- db.example.find( { name : { $in : [ "Bart", "Homer" ] } }, {_id : 0, dob : 1, name : 1} )

 

 

 

 

Chapter 4: CRUD Optimization

Regex Performance

 

Given the following index:

> db.products.createIndex({ productName: 1 })

And the following query:

> db.products.find({ productName: /^Craftsman/ })

-> The query will do an index scan.

 

 

 

 

 

 

Chapter 4: CRUD Optimization

Aggregation Performance

 

https://www.mongodb.com/docs/manual/core/query-plans/

 

 

 

With regards to aggregation performance, which of the following are true?

- When $limit and $sort are close together a very performant top-k sort can be performed

- Transforming data in a pipeline stage prevents us from using indexes in the stages that follow

 

 

 

 

Chapter 4: CRUD Optimization

Lab 4.1: Equality, Sort, Range

 

In this lab you're going to use the equality, sort, range rule to determine which index best supports a given query.

Given the following query:

db.accounts.find( { accountBalance : { $gte : NumberDecimal(100000.00) }, city: "New York" } )

.sort( { lastName: 1, firstName: 1 } )

Which of the following indexes best supports this query with regards to the equality, sort, range rule.

-> { city: 1, lastName: 1, firstName: 1, accountBalance: 1 }

 

 

 

 

Chapter 4: CRUD Optimization

Lab 4.2: Aggregation Performance

 

For this lab, you're going to create an index so that the following aggregation query can be executed successfully.

We assume you have imported the restaurants dataset in your cluster. If not, it is attached to this lesson, so you can import it following the instructions in the Install Course Tools and Datasets lesson in Chapter 1.

Before you work on your solution, delete all indexes for the restaurants dataset:

db.restaurants.dropIndexes()

Then, run the following query and you will receive an error.

db.restaurants.aggregate([

{ $match: { stars: { $gt: 2 } } },

{ $sort: { stars: 1 } },

{ $group: { _id: "$cuisine", count: { $sum: 1 } } }

])

{

"ok": 0,

"errmsg": "Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.",

"code": 16819,

"codeName": "Location16819"

}

The task of this lab is to identify why this error is occuring and build an index to resolve the issue.

Keep in mind that there might be several indexes that resolve this error, but we're looking for an index with the smallest number of fields that resolves the error and services this command.

In the text box below, submit the index that resolves the issue.

For example, if you ran:

db.restaurants.createIndex( { foobar: 1 } )

to create the index that fixes the error, then you would enter the following document which lists the fields indexed into the text box:

{ foobar: 1 }

-> { stars: 1 }

 

 

 

Chapter 5: Performance on Clusters

Performance Considerations in Distributed Systems Part 2

 

 

https://www.mongodb.com/docs/manual/core/distributed-queries/?jmp=university 

 

https://www.mongodb.com/docs/manual/core/sharded-cluster-query-router/#how-mongos-handles-query-modifiers

 

 

 

From a performance standpoint, when working with a distributed database it's important to consider...

- Latency

- Routed Queries

 

 

 

Chapter 5: Performance on Clusters

Increasing Write Performance with Sharding Part 1

 

 

Chapter 5: Performance on Clusters

Increasing Write Performance with Sharding Part 2

 

https://www.mongodb.com/docs/manual/core/distributed-queries/

 

https://www.mongodb.com/docs/manual/core/bulk-write-operations/?jmp=university 

 

 

 

Which of the following is/are true?

- Picking a good shard key is one of the most important parts of sharding.

 

 

 

Chapter 5: Performance on Clusters

Reading from Secondaries

 

https://www.mongodb.com/docs/manual/core/read-preference/?jmp=university 

 

 

When should you ever read from secondaries?

- To provide reads with lower latency.

- When doing ad-hoc queries and analytic jobs.

 

 

 

Chapter 5: Performance on Clusters

Replica Sets with Differing Indexes Part 1

 

 

 

Chapter 5: Performance on Clusters

Replica Sets with Differing Indexes Part 2

 

 

Chapter 5: Performance on Clusters

Replica Sets with Differing Indexes Part 3

 

 

Which of the following conditions apply when creating indexes on secondaries?

- A secondary should never be allowed to become primary

 

 

 

 

Chapter 5: Performance on Clusters

Aggregation Pipeline on a Sharded Cluster

 

 

https://www.mongodb.com/docs/manual/core/aggregation-pipeline-sharded-collections/?jmp=university

 

 

What operators will cause a merge stage on the primary shard for a database?

- $out

- $lookup

 

 

 

 

Final Exam

Question 1

 

 

Which of these statements is/are true?

- Creating an ascending index on a monotonically increasing value creates index keys on the right-hand side of the index tree.

 

 

 

Final Exam

Question 2

 

Which of the following statements is/are true?

- It's important to ensure that your shard key has high cardinality.

- Indexes can decrease insert throughput.

- It's important to ensure that secondaries with indexes that differ from the primary not be eligible to become primary.

- Indexes can be walked backwards by inverting their keys in a sort predicate.

- Partial indexes can be used to reduce the size requirements of the indexes.

 

 

 

Final Exam

Question 3

 

Which of the following statements is/are true?

- Collations can be used to create case insensitive indexes.

- By default, all MongoDB user-created collections have an _id index.

- It's common practice to co-locate your mongos on the same machine as your application to reduce latency.

 

 

 

Final Exam

Question 4

 

Which of the following statements is/are true?

- Indexes can solve the problem of slow queries.

- Indexes are fast to search because they're ordered such that you can find target values with few comparisons.

 

 

 

Final Exam

Question 5

 

Which of the following statements is/are true?

- Compound indexes can service queries that filter on a prefix of the index keys.

- Query plans are removed from the plan cache on index creation, destruction, or server restart.

- If no indexes can be used then a collection scan will be necessary.

 

 

 

Final Exam

Question 6

 

Which of the following statements is/are true?

- The ideal ratio between nReturned and totalKeysExamined is 1.

- You can use the --wiredTigerDirectoryForIndexes option to place your indexes on a different disk than your data.

- An index doesn't become multikey until a document is inserted that has an array value.

 

 

 

Final Exam

Question 7

 

Given the following indexes:

 

{ categories: 1, price: 1 }

{ in_stock: 1, price: 1, name: 1 }

The following documents:

 

{ price: 2.99, name: "Soap", in_stock: true, categories: ['Beauty', 'Personal Care'] }

{ price: 7.99, name: "Knife", in_stock: false, categories: ['Outdoors'] }

And the following queries:

 

db.products.find({ in_stock: true, price: { $gt: 1, $lt: 5 } }).sort({ name: 1 })

db.products.find({ in_stock: true })

db.products.find({ categories: 'Beauty' }).sort({ price: 1 })

Which of the following is/are true?

 

- Index #1 would provide a sort to query #3.

- Index #2 can be used by both query #1 and #2.

 

 

 

 

728x90

댓글