BlogAnnounced at MongoDB.local NYC 2024: A recap of all announcements and updatesLearn more >>
MongoDB Developer
Atlas
plus
Sign in to follow topics
MongoDB Developer Centerchevron-right
Developer Topicschevron-right
Productschevron-right
Atlaschevron-right

Synchronize Your Mobile Application With MongoDB Atlas and Google Cloud MySQL

Timothy Marland6 min read • Published Feb 08, 2024 • Updated Feb 08, 2024
Google CloudMobileKafkaAtlas
Facebook Icontwitter iconlinkedin icon
Rate this tutorial
star-empty
star-empty
star-empty
star-empty
star-empty
Enterprises around the world are looking to modernize their existing applications. They need a streamlined way to synchronize data from devices at the Edge into their cloud data stores. Whether their goals are business growth or fending off the competition, application modernization is the primary vehicle that will help them get there.
Often the first step in this process is to move data from an existing relational database repository (like Oracle, SQL Server, DB2, or Postgres, for example) into a JSON-based flexible database in the cloud (like MongoDB, Aerospike, Couchbase, Cassandra, or DocumentDB). Sounds simple, right? I mean, really, if JSON (NoSQL) is so simple and flexible, why would data migration be hard? There must be a bunch of automated tools to facilitate this data migration, right?
Unfortunately, the answers are “Not really,” “Because data synchronization is rarely simple,” and “The available tools are often DIY-based and don’t provide nearly the level of automation required to facilitate an ongoing, large-scale, production-quality, conflict-resolved data synchronization.”

Why is this so complex?

Data modeling

One of the first challenges is data modeling. To effectively leverage the benefits inherent in a JSON-based schema, you need to include data modeling as part of your migration strategy. Simply flattening or de-normalizing a relational schema into nested JSON structures, or worse yet, simply moving from relational to JSON without any data modeling consideration, results in a JSON data repository that is slow, inefficient, and difficult to query. You need an intelligent data modeling platform that automatically creates the most effective JSON structures based on your application needs and the target JSON repository without requiring specialized resources like data scientists and data engineers.

Building and monitoring pipelines

Once you’ve mapped the data, you need tools that allow you to build reliable, scalable data pipelines to move the data from the source to the target repository. Sadly, most of the tools available today are primarily DIY scripting tools that require both custom (often complex) coding to transform the data to the new schema properly and custom (often complex) monitoring to ensure that the new data pipelines are working reliably. You need a data pipeline automation and monitoring platform to move the data and ensure its quality.

DIY is hard

This process of data synchronization, pipeline automation, and monitoring is where most application modernization projects get bogged down and/or ultimately fail. These failed projects often consume significant resources before they fail, as well as affect the overall business functionality and outcomes, and lead to missed objectives.

CDC: MongoDB Atlas, Atlas Device Sync, and Dataworkz

Synchronizing data between edge devices and various databases can be complex. Simplifying this is our goal, and we will demonstrate how to achieve bi-directional synchronization between mobile devices at MySQL in the cloud using MongoDB Atlas Device Sync and Dataworkz.
Let's dive in.

Prerequisites

  • Accounts with MongoDB Atlas (this can be tested on free tiers), Dataworkz, and Google Cloud
  • Kafka
  • Debezium

Step 1: prepare your mobile application with Atlas Device Sync

Set up a template app for this test by following the steps outlined in the docs. Once that step is complete, you will have a mobile application running locally, with automated synchronization back to MongoDB Atlas using the Atlas Device Sync SDK.

Step 2: set up a source database and target MongoDB Atlas Collection

We used GCP in us-west1-a and Cloud MySQL for this example. Be sure to include sample data.

Check if BinLog replication is already enabled

Check if BinLog Replication is Enabled
  1. Access your Google Cloud MySQL instance.
  2. In the left navigation pane, under Databases, select Backups.
  3. Access your Google Cloud MySQL instance.
  4. In Settings, check the status of Automated backups.
  5. If the status is Disabled, follow the steps below.

Enable BinLog replication

These steps are necessary to set up some of the background processes which enable ongoing two-way synchronization.
  1. In Backups, Settings, click + CREATE BACKUP.
  2. In the Edit backup settings dialog:
    • Select the Automate backups and Enable point-in-time recovery check boxes.
    • Click SAVE to activate the changes.
Enable Automated Backups
  1. Check the status of Automated backups again under Settings. It should appear as shown below. This step is important because the metadata created in the backup process is used by the tools to help enable bi-directional synchronization.
GCP Backup Settings

Step 3: configure connector for MySQL

The connector for MySQL needs to be defined in Dataworkz.
Configure Dataworkz for MySQL
  • Log in to the Dataworkz application.
  • Go to Configuration -> Databases -> MySQL.
  • Click the + icon in the above image to add the configuration.
  • It will open the MySQL configuration screen.
Connect to MySQL in Dataworkz
  • Provide the following details in the configuration screen:
    • MySQL connector name
    • Host server and port
    • User name
    • Password
    • Select Deployment Type
    • Select TLS/SSL option as appropriate
  • Click Save to add the MySQL configuration.

Step 4: configure connector for MongoDB

The connector for MongoDB Atlas needs to be defined in Dataworkz.
MongoDB Atlas definition in Dataworkz
  • Log in to the Dataworkz application.
  • Go to Configuration -> Databases -> MongoDB.
  • Click the + icon in the above image to add the configuration.
  • It will open the MongoDB configuration screen.
MongoDB configuration screen in Dataworkz
  • Provide the following details in the configuration screen (these can be accessed by clicking on the Connect button displayed below).
    • MongoDB Connector Name
    • Host server
    • User name
    • Password
MongoDB Atlas Connect button display on overview screen
  • Click Save to add the MongoDB configuration.

Step 5: set up Kafka

The connector for Kafka needs to be defined in Dataworkz.
Kafka connection configuration in Dataworkz
  • Log in to the Dataworkz application.
  • Go to Configuration -> Streaming Systems -> Kafka.
  • Click the + icon in the above image to add the configuration.
  • It will open the Kafka configuration screen.
Kafka connection configuration details in Dataworkz
  • Provide the following details in the configuration screen:
    • Kafka Connector Name
    • Bootstrap server name and port (as per the format in the image below)
    • Zookeeper (optional)
  • Click Save to add the Kafka configuration.

Step 6: configure CDC for MySQL

Configure CDC for MySQL in Dataworkz by following these steps:
MySQL CDC configuration in Dataworkz
  • Log in to the Dataworkz application.
  • Go to Configuration -> Change Data Capture -> MySQL.
  • Click the + icon in the above image to add the configuration.
  • It will open the CDC configuration screen.
Dataworkz CDC configuration screen
  • Provide the details in the configuration detail screen:
    • Choose the Workspace.
    • Choose Kafka (which was configured in the previous step).
    • Add the Debezium server name.
    • Choose the MYSQL Storage that was previously configured.
  • Select the appropriate database from the list.
  • Choose the Database table for Sync.
Dataworkz CDC configuration screen
You can see the configured MySQL CDC in the list of configurations.
Dataworkz MySQL CDC configuration showing up in list

Step 7: data preparation — JSON modeling

This step enables the modeling of the results of RDBMS in JSON structure. To do this, select Data Prep -> JSON Modeller inside the Dataworkz settings.
Dataworkz settings JSON modeling setup
Dataworkz settings JSON modeling detail screen
  • Click on Select root dataset for json modeling (above image).
Workspace selection screen for CDC
  • Select the Workspace in which MySQL CDC was configured.
  • Select cdc_kafka as the Source Collection.
  • Choose root/base table/dataset.
  • Click on the Enable Bi-directional sync checkbox.
  • Select a sync field from the list of the columns that will be used to determine the last updated time for a row.
  • Select columns for the target dataset.
  • Click Update and you will see the below screen.
JSON modeling screen in Dataworkz
  • Choose the desired mapping with the base table by right-clicking.
    • If the table has one-to-many mapping, choose Create array document.
    • If the table has one-to-one mapping, choose Create field in child document
  • Selecting create array document would require selecting a child table relationship (for creation of an array document).
  • Choose Child Table in the screen that pops up.
Dataworkz array document child table screen
  • Choose join key in the Relationship mapping section.
  • In the Unique Key section, choose the unique key for the child table (used for update or delete events).
  • Select a sync field from the list of the columns that will be used to determine the last updated time for a row.
  • Click update.
  • A one-to-many relationship shows up.
Dataworkz JSON model detail screen
  • Click on Save and choose the target destination (MongoDB Collection, in this case).
Dataworkz JSON Upsert and key settings
  • Choose Transfer Mode as Upsert and select the unique key for the base table in the pop-up box.
Dataworkz custom mapping settings screen
  • Choose the desired job name and select the job frequency as Recurring. Set this job as Continuous from the Choose Frequency section.
Dataworkz job frequency setitngs screen

Step 8: monitoring/maintaining the sync job

The sync job that was scheduled in the previous step can be viewed under Jobs -> Continuous Job” in the Dataworkz portal. It provides the following:
  • job status (running/deleted/paused)
  • ability to pause a running job
  • ability to resume a paused job
  • ability to delete a continuous job
Dataworkz sync job summary screen

Conclusion

MongoDB Atlas Device Sync streamlines and automates the challenges of mobile data synchronization with Atlas. By integrating with Dataworkz, that same data can now be automatically and bi-directionally synchronized with MySQL, Oracle and other databases.
Set up your account at cloud.mongodb.com and dataworkz.com to create accounts and begin your automated bi-directional data synchronization journey.

Facebook Icontwitter iconlinkedin icon
Rate this tutorial
star-empty
star-empty
star-empty
star-empty
star-empty
Related
Tutorial

How to Get MongoDB Data into Parquet in 10 Seconds or Less


Jun 28, 2023 | 5 min read
Tutorial

How to Build a RAG System Using Claude 3 Opus And MongoDB


Mar 07, 2024 | 16 min read
Article

Atlas Search Playground: Easy Experimentation


May 29, 2024 | 7 min read
News & Announcements

Unlock the Value of Data in MongoDB Atlas with the Intelligent Analytics of Microsoft Fabric


Nov 17, 2023 | 6 min read
Table of Contents