
MongoDB aggregate $lookup is slow: should I add an index to all documents?

I’ve got two collections in my MongoDB database, and I’m quite new to MongoDB in general. Each of my collections reports that it has 1 index. I come from Laravel and SQL databases, where I can improve performance by adding an index to a column with ->index() in a migration, so I assume there’s a way to do something similar for the key/value fields of my MongoDB documents.

I’ve got two collections:

  • data_source_one (# of documents: 5,300, total doc size: 1.2 MB)
  • data_source_two (# of documents: 6,800, total doc size: 139.8 MB)

I’m using $lookup (aggregation) to effectively join my two collections on a common field, but unlike in a traditional SQL database, the request takes well over 25 seconds to complete.

How can I improve performance by adding an index on my (custom) created_at key, and on other fields, for all of the documents in each collection?

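// Assumes the official `mongodb` driver and an ES module (for top-level await).
import { MongoClient } from 'mongodb'

const client = new MongoClient(process.env.DB_CONNECTION)
await client.connect()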

const results = await client.db().collection('data_source_one').aggregate([{
  $lookup: {
    from: 'data_source_two',
    localField: 'created_at',
    foreignField: 'created_at',
    as: 'combined_results'
  }
}]).toArray();


Answer

Yes, you can index specific fields to make execution more efficient. MongoDB uses indexes to query its collections efficiently. Without an index, MongoDB must perform a collection scan, i.e. scan every document in the collection, to select the documents that match the query. If an appropriate index exists for a query, MongoDB can use it to limit the number of documents it must inspect. The MongoDB documentation on indexes covers this in more detail.
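To see whether the join is currently falling back to collection scans, you can ask the server to explain the pipeline. The following is only a sketch: explainLookup is a hypothetical helper name, and db is assumed to be the handle returned by client.db() in the question. The exact shape of the explain output depends on the server version, so inspect it rather than relying on particular field names.

```javascript
// Sketch: ask MongoDB to explain the $lookup pipeline from the question.
// `explainLookup` is a hypothetical helper; `db` is assumed to come from client.db().
function explainLookup(db) {
  const pipeline = [{
    $lookup: {
      from: 'data_source_two',
      localField: 'created_at',
      foreignField: 'created_at',
      as: 'combined_results'
    }
  }];

  // 'executionStats' runs the pipeline and reports how much work it did.
  // Returns a promise that resolves to the explain document.
  return db.collection('data_source_one')
    .aggregate(pipeline)
    .explain('executionStats');
}
```

Calling console.dir(await explainLookup(client.db()), { depth: null }) before and after adding an index lets you compare how many documents were examined.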

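In your case, you are joining two collections on a common field, so you can add an index on that field to get faster execution. For $lookup, the index that matters most is the one on the foreignField (created_at) in the from collection (data_source_two), because that collection is searched once for every input document. The MongoDB documentation on aggregation pipeline optimization has further advice. However, it probably still won’t be as fast as a JOIN statement in SQL.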
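As a minimal sketch of the MongoDB counterpart to ->index() in a Laravel migration, the driver's createIndex method can be called once on the joined collection. Here indexLookupField is a hypothetical helper name and db is assumed to be the handle returned by client.db() in the question. The single index each collection already reports is most likely the default one on _id, which does not help a join on created_at.

```javascript
// Sketch: index the field that $lookup matches on in the joined collection.
// `indexLookupField` is a hypothetical helper; `db` is assumed to come from client.db().
function indexLookupField(db) {
  // Ascending single-field index on data_source_two.created_at, the
  // foreignField of the $lookup. The returned promise resolves to the index name.
  return db.collection('data_source_two').createIndex({ created_at: 1 });
}
```

Run await indexLookupField(client.db()) once before the aggregation; the index covers every document in the collection and is maintained automatically as documents are added or changed.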

User contributions licensed under: CC BY-SA