How to do JOIN on MongoDB collections?

Jerry Zhang
3 min read · Oct 12, 2021

One of the biggest differences between SQL and NoSQL databases is JOIN. MongoDB introduced $lookup to support relational-style data in a NoSQL database, but even a simple two-collection association can require a complex aggregation pipeline. Real applications are more complicated still: with deeply nested data and associations across several collections, $lookup becomes difficult to use. With Open-esProc, you first run simple queries against MongoDB and then process the results with SPL syntax (an SQL-like calculation language). This covers the functionality of SQL and is particularly good at processing multi-layer data. For example, suppose the data of two related collections is as follows.



The child_id field in the history collection is associated with the id of the childs entries in the childsgroup collection, and we hope to get the following result:

{
    "_id" : ObjectId("5bab2ae8ab2f1bdb4f434bc3"),
    "id" : "001",
    "history" : "today worked",
    "child_id" : "ch001",
    "childInfo" : {
        "name" : "a",
        "mobile" : 1111
    }
}

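db.history.aggregate([
    {$lookup: {
        from: "childsgroup",
        let: {child_id: "$child_id"},
        // The paths childs.id and childs.info follow the field names
        // used in the script description further down.
        pipeline: [
            // keep only the groups whose childs array contains the id
            {$match: { $expr: { $in: [ "$$child_id", "$childs.id"] } } },
            // split the childs array into one document per child
            {$unwind: "$childs"},
            // keep only the child whose id matches
            {$match: { $expr: { $eq: [ "$childs.id", "$$child_id"] } } },
            // promote the child's info sub-document to the root
            {$replaceRoot: { newRoot: "$childs.info"} }
        ],
        as: "childInfo"
    }},
    {"$unwind": "$childInfo"}
])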
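To see what each stage of the pipeline above contributes, here is a minimal sketch in plain JavaScript that mimics the stages on in-memory arrays. It does not talk to MongoDB. The sample documents, the `gid` field, and the helper names `lookupChildInfo` and `joinHistory` are assumptions made for illustration. The document shapes are guessed from the expected result and the script description, since the original data listing is not reproduced here.

```javascript
// Hypothetical sample documents; the shapes are assumed, not taken from the original data.
const history = [
  { id: "001", history: "today worked", child_id: "ch001" },
  { id: "002", history: "working", child_id: "ch004" },
  { id: "003", history: "no match", child_id: "ch999" },
];
const childsgroup = [
  { gid: "g001", childs: [
    { id: "ch001", info: { name: "a", mobile: 1111 } },
    { id: "ch002", info: { name: "b", mobile: 2222 } },
  ] },
  { gid: "g002", childs: [
    { id: "ch004", info: { name: "c", mobile: 3333 } },
  ] },
];

// Mimics the sub-pipeline that $lookup runs for one history document.
function lookupChildInfo(childId, groups) {
  return groups
    // first $match: keep groups whose childs array contains the id
    .filter((g) => g.childs.some((c) => c.id === childId))
    // $unwind: one record per element of the childs array
    .flatMap((g) => g.childs.map((c) => ({ ...g, childs: c })))
    // second $match: keep only the child with the matching id
    .filter((d) => d.childs.id === childId)
    // $replaceRoot: promote childs.info to the top level
    .map((d) => d.childs.info);
}

// $lookup stores the sub-pipeline result as an array under childInfo;
// the final $unwind emits one document per array element and drops
// documents whose array is empty.
function joinHistory(historyDocs, groups) {
  return historyDocs.flatMap((h) =>
    lookupChildInfo(h.child_id, groups).map((info) => ({ ...h, childInfo: info }))
  );
}

console.log(JSON.stringify(joinHistory(history, childsgroup), null, 2));
```

The nested filter inside a per-document loop is the part that makes the pipeline hard to read: the match on `child_id` has to be stated twice, once before and once after the array is unwound.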

This script combines $lookup with a sub-pipeline of $match, $unwind, and $replaceRoot stages. Writing such a complex script is not easy for many MongoDB users. The same association can be implemented with an SPL script:

Association query results:

Script description:
A1: Connect to the MongoDB database.
A2: Get the data in the history collection.
A3: Get the data in the childsgroup collection.
A4: Extract the childs data in the childsgroup and merge it into a table sequence.
A5: Associate child_id in the history table with id in the childs table, add the info field, and return the table sequence.
A6: Close the database connection.
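The data flow of steps A4 and A5 can be sketched in plain JavaScript. This is not SPL and does not use the esProc API. It only illustrates the idea of flattening the nested childs records first and then joining on the key. The sample rows and the helper names `flattenChilds` and `attachInfo` are hypothetical. Leaving `info` as null for unmatched rows is a choice made for this sketch, not a statement about how SPL behaves.

```javascript
// Hypothetical rows standing in for the results fetched in A2 and A3.
const historyRows = [
  { id: "001", history: "today worked", child_id: "ch001" },
  { id: "002", history: "working", child_id: "ch004" },
  { id: "003", history: "no match", child_id: "ch999" },
];
const groupRows = [
  { gid: "g001", childs: [
    { id: "ch001", info: { name: "a", mobile: 1111 } },
    { id: "ch002", info: { name: "b", mobile: 2222 } },
  ] },
  { gid: "g002", childs: [
    { id: "ch004", info: { name: "c", mobile: 3333 } },
  ] },
];

// A4: concatenate the childs arrays of all groups into one flat table.
function flattenChilds(groups) {
  return groups.flatMap((g) => g.childs);
}

// A5: associate child_id with the id of the flattened childs rows
// and attach the matching info record to each history row.
function attachInfo(rows, childs) {
  // Index the childs rows by id so each lookup is a single map access.
  const infoById = new Map(childs.map((c) => [c.id, c.info]));
  return rows.map((r) => ({
    ...r,
    // Assumption of this sketch: unmatched rows are kept with info = null.
    info: infoById.has(r.child_id) ? infoById.get(r.child_id) : null,
  }));
}

const childsTable = flattenChilds(groupRows);
console.log(JSON.stringify(attachInfo(historyRows, childsTable), null, 2));
```

Once the nested array is flattened, the join condition is written only once, which is why the step-by-step form is easier to follow than the nested pipeline.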

Compared with the MongoDB script, the SPL script is much easier to write and its logic is clearer. There is no need to learn how each MongoDB operator works or how to combine the operators to process the data, which saves a lot of time.

MongoDB provides $lookup as basic support for multi-collection association. For more complex associations, however, the query scripts often become too complicated. Open-esProc SPL scripts, with their powerful syntax and ease of use, make up for this shortcoming of MongoDB. For more examples of association calculations, refer to Simplifying MongoDB Data Association.

After the SPL script performs the association calculation on MongoDB data, the results can easily be used in Java applications. SPL has a dedicated JDBC driver, and SPL scripts are called through JDBC. For details, refer to How to perform SQL-like queries on MongoDB in Java?


