Synchronize Your Mobile Application With MongoDB Atlas and Google Cloud MySQL

Timothy Marland6 min read • Published Feb 08, 2024 • Updated Feb 08, 2024

Google Cloud Mobile Kafka Atlas

Rate this tutorial

Enterprises around the world are looking to modernize their existing applications. They need a streamlined way to synchronize data from devices at the Edge into their cloud data stores. Whether their goals are business growth or fending off the competition, application modernization is the primary vehicle that will help them get there.

Often the first step in this process is to move data from an existing relational database repository (like Oracle, SQL Server, DB2, or Postgres, for example) into a JSON-based flexible database in the cloud (like MongoDB, Aerospike, Couchbase, Cassandra, or DocumentDB). Sounds simple, right? I mean, really, if JSON (NoSQL) is so simple and flexible, why would data migration be hard? There must be a bunch of automated tools to facilitate this data migration, right?

Unfortunately, the answers are “Not really,” “Because data synchronization is rarely simple,” and “The available tools are often DIY-based and don’t provide nearly the level of automation required to facilitate an ongoing, large-scale, production-quality, conflict-resolved data synchronization.”

Why is this so complex?

Data modeling

One of the first challenges is data modeling. To effectively leverage the benefits inherent in a JSON-based schema, you need to include data modeling as part of your migration strategy. Simply flattening or de-normalizing a relational schema into nested JSON structures, or worse yet, simply moving from relational to JSON without any data modeling consideration, results in a JSON data repository that is slow, inefficient, and difficult to query. You need an intelligent data modeling platform that automatically creates the most effective JSON structures based on your application needs and the target JSON repository without requiring specialized resources like data scientists and data engineers.

Building and monitoring pipelines

Once you’ve mapped the data, you need tools that allow you to build reliable, scalable data pipelines to move the data from the source to the target repository. Sadly, most of the tools available today are primarily DIY scripting tools that require both custom (often complex) coding to transform the data to the new schema properly and custom (often complex) monitoring to ensure that the new data pipelines are working reliably. You need a data pipeline automation and monitoring platform to move the data and ensure its quality.

DIY is hard

This process of data synchronization, pipeline automation, and monitoring is where most application modernization projects get bogged down and/or ultimately fail. These failed projects often consume significant resources before they fail, as well as affect the overall business functionality and outcomes, and lead to missed objectives.

CDC: MongoDB Atlas, Atlas Device Sync, and Dataworkz

Synchronizing data between edge devices and various databases can be complex. Simplifying this is our goal, and we will demonstrate how to achieve bi-directional synchronization between mobile devices at MySQL in the cloud using MongoDB Atlas Device Sync and Dataworkz.

Let's dive in.

Prerequisites

Accounts with MongoDB Atlas (this can be tested on free tiers), Dataworkz, and Google Cloud
Kafka
Debezium

Step 1: prepare your mobile application with Atlas Device Sync

Set up a template app for this test by following the steps outlined in the docs. Once that step is complete, you will have a mobile application running locally, with automated synchronization back to MongoDB Atlas using the Atlas Device Sync SDK.

Step 2: set up a source database and target MongoDB Atlas Collection

We used GCP in us-west1-a and Cloud MySQL for this example. Be sure to include sample data.

Check if BinLog replication is already enabled

Access your Google Cloud MySQL instance.
In the left navigation pane, under Databases, select Backups.
Access your Google Cloud MySQL instance.
In Settings, check the status of Automated backups.
If the status is Disabled, follow the steps below.

Enable BinLog replication

These steps are necessary to set up some of the background processes which enable ongoing two-way synchronization.

In Backups, Settings, click + CREATE BACKUP.
In the Edit backup settings dialog:
- Select the Automate backups and Enable point-in-time recovery check boxes.
- Click SAVE to activate the changes.

Check the status of Automated backups again under Settings. It should appear as shown below. This step is important because the metadata created in the backup process is used by the tools to help enable bi-directional synchronization.

Step 3: configure connector for MySQL

The connector for MySQL needs to be defined in Dataworkz.

Log in to the Dataworkz application.
Go to Configuration -> Databases -> MySQL.
Click the + icon in the above image to add the configuration.
It will open the MySQL configuration screen.

Provide the following details in the configuration screen:
- MySQL connector name
- Host server and port
- User name
- Password
- Select Deployment Type
- Select TLS/SSL option as appropriate
Click Save to add the MySQL configuration.

Step 4: configure connector for MongoDB

The connector for MongoDB Atlas needs to be defined in Dataworkz.

Log in to the Dataworkz application.
Go to Configuration -> Databases -> MongoDB.
Click the + icon in the above image to add the configuration.
It will open the MongoDB configuration screen.

Provide the following details in the configuration screen (these can be accessed by clicking on the Connect button displayed below).
- MongoDB Connector Name
- Host server
- User name
- Password

Click Save to add the MongoDB configuration.

Step 5: set up Kafka

The connector for Kafka needs to be defined in Dataworkz.

Log in to the Dataworkz application.
Go to Configuration -> Streaming Systems -> Kafka.
Click the + icon in the above image to add the configuration.
It will open the Kafka configuration screen.

Provide the following details in the configuration screen:
- Kafka Connector Name
- Bootstrap server name and port (as per the format in the image below)
- Zookeeper (optional)
Click Save to add the Kafka configuration.

Step 6: configure CDC for MySQL

Configure CDC for MySQL in Dataworkz by following these steps:

Log in to the Dataworkz application.
Go to Configuration -> Change Data Capture -> MySQL.
Click the + icon in the above image to add the configuration.
It will open the CDC configuration screen.

Provide the details in the configuration detail screen:
- Choose the Workspace.
- Choose Kafka (which was configured in the previous step).
- Add the Debezium server name.
- Choose the MYSQL Storage that was previously configured.
Select the appropriate database from the list.
Choose the Database table for Sync.

You can see the configured MySQL CDC in the list of configurations.

Step 7: data preparation — JSON modeling

This step enables the modeling of the results of RDBMS in JSON structure. To do this, select Data Prep -> JSON Modeller inside the Dataworkz settings.

Click on Select root dataset for json modeling (above image).

Select the Workspace in which MySQL CDC was configured.
Select cdc_kafka as the Source Collection.
Choose root/base table/dataset.
Click on the Enable Bi-directional sync checkbox.
Select a sync field from the list of the columns that will be used to determine the last updated time for a row.
Select columns for the target dataset.
Click Update and you will see the below screen.

Choose the desired mapping with the base table by right-clicking.
- If the table has one-to-many mapping, choose Create array document.
- If the table has one-to-one mapping, choose Create field in child document
Selecting create array document would require selecting a child table relationship (for creation of an array document).
Choose Child Table in the screen that pops up.

Choose join key in the Relationship mapping section.
In the Unique Key section, choose the unique key for the child table (used for update or delete events).
Select a sync field from the list of the columns that will be used to determine the last updated time for a row.
Click update.
A one-to-many relationship shows up.

Click on Save and choose the target destination (MongoDB Collection, in this case).

Choose Transfer Mode as Upsert and select the unique key for the base table in the pop-up box.

Choose the desired job name and select the job frequency as Recurring. Set this job as Continuous from the Choose Frequency section.

Step 8: monitoring/maintaining the sync job

The sync job that was scheduled in the previous step can be viewed under Jobs -> Continuous Job” in the Dataworkz portal. It provides the following:

job status (running/deleted/paused)
ability to pause a running job
ability to resume a paused job
ability to delete a continuous job

Conclusion

MongoDB Atlas Device Sync streamlines and automates the challenges of mobile data synchronization with Atlas. By integrating with Dataworkz, that same data can now be automatically and bi-directionally synchronized with MySQL, Oracle and other databases.

Set up your account at cloud.mongodb.com and dataworkz.com to create accounts and begin your automated bi-directional data synchronization journey.

Rate this tutorial

Tutorial

How to Get MongoDB Data into Parquet in 10 Seconds or Less

Jun 28, 2023 | 5 min read

Tutorial

How to Build a RAG System Using Claude 3 Opus And MongoDB

Mar 07, 2024 | 16 min read

Article

Atlas Search Playground: Easy Experimentation

May 29, 2024 | 7 min read

News & Announcements

Unlock the Value of Data in MongoDB Atlas with the Intelligent Analytics of Microsoft Fabric

Nov 17, 2023 | 6 min read

Why is this so complex?
CDC: MongoDB Atlas, Atlas Device Sync, and Dataworkz
Prerequisites
Step 1: prepare your mobile application with Atlas Device Sync
Step 2: set up a source database and target MongoDB Atlas Collection
Conclusion

Atlas