How can I efficiently buffer events from a stream in Node.js so that records are bulk-inserted into MongoDB instead of being inserted one at a time as they arrive? Here is the pseudocode I have in mind:
// Open MongoDB connection

mystream.on('data', (record) => {
  // bufferize data into an array
  // if the buffer is full (1000 records)
  // bulk insert into MongoDB and empty buffer
})

mystream.on('end', () => {
  // close connection
})
Does this look realistic? Are there any possible optimizations? Are there existing libraries that facilitate this?
Answer
I ended up with a no-dependency solution.
const { MongoClient } = require("mongodb")

const url = process.env.MONGO_URI || "mongodb://localhost:27019";
const connection = MongoClient.connect(url, { useNewUrlParser: true, useUnifiedTopology: true })

Promise.resolve(connection)
  .then((db) => {
    const dbName = "databaseName";
    const collection = 'collection';
    const dbo = db.db(dbName);

    let buffer = []

    stream.on("data", (row) => {
      buffer.push(row)
      // flush the buffer as a single bulk insert once it is large enough
      if (buffer.length > 10000) {
        dbo.collection(collection).insertMany(buffer, { ordered: false });
        buffer = []
      }
    });

    stream.on("end", () => {
      // insert the last (possibly partial) chunk, then close the connection
      dbo.collection(collection).insertMany(buffer, { ordered: false })
        .then(() => {
          console.log("Done!");
          db.close();
        })
    });

    stream.on("error", (err) => console.log(err));
  })
  .catch((err) => { console.log(err) })
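One possible optimization over the code above: insertMany is fired from the "data" handler without being awaited, so a fast stream can queue many concurrent inserts, and the final insertMany will fail if the buffer happens to be empty when "end" fires. Below is a minimal sketch (not the answer itself) that pauses the stream while each batch is written and skips empty flushes. The names stream, "databaseName", "collection", BATCH_SIZE and the flush helper are placeholders/assumptions, mirroring the answer above.

// Sketch: same buffering idea, but with backpressure via pause()/resume()
const { MongoClient } = require("mongodb");

const url = process.env.MONGO_URI || "mongodb://localhost:27019";
const BATCH_SIZE = 10000; // placeholder batch size

MongoClient.connect(url, { useNewUrlParser: true, useUnifiedTopology: true })
  .then((client) => {
    const coll = client.db("databaseName").collection("collection");
    let buffer = [];

    // write the current buffer as one bulk insert; no-op if it is empty
    const flush = () => {
      if (buffer.length === 0) return Promise.resolve();
      const batch = buffer;
      buffer = [];
      return coll.insertMany(batch, { ordered: false });
    };

    stream.on("data", (row) => {
      buffer.push(row);
      if (buffer.length >= BATCH_SIZE) {
        stream.pause();                    // stop receiving while the batch is written
        flush()
          .then(() => stream.resume())     // resume once MongoDB acknowledges the batch
          .catch((err) => {
            console.log(err);
            stream.resume();
          });
      }
    });

    stream.on("end", () => {
      flush()                              // write the last, possibly partial, batch
        .then(() => console.log("Done!"))
        .catch((err) => console.log(err))
        .finally(() => client.close());
    });

    stream.on("error", (err) => console.log(err));
  })
  .catch((err) => console.log(err));

Pausing the stream keeps memory bounded to roughly one batch and avoids overlapping insertMany calls; whether that matters depends on how fast your source stream produces data compared to how fast MongoDB can ingest it.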