MarkLogic Connector for AWS Glue Now Available on AWS Marketplace

We are excited to announce the availability of the MarkLogic Connector for AWS Glue. AWS Glue is a serverless ETL tool provided as a managed service in the AWS cloud ecosystem. When connected to MarkLogic, AWS Glue provides a simple way to build data pipelines for moving data in and out of MarkLogic using visual and code-based interfaces. To get started, subscribe to the MarkLogic Connector for AWS Glue on the AWS Marketplace.

What Is AWS Glue?

AWS Glue provides a fully managed, serverless Apache Spark infrastructure for graphically creating, running, and monitoring ETL pipelines. Its graphical interface, Glue Studio, automatically generates code, sparing developers much of the time and effort of hand-coding and optimizing Spark jobs.

Using AWS Glue, developers can build ETL pipelines with readily available connectors for AWS services like Aurora, RDS, S3, Redshift, Kinesis, and DynamoDB, as well as third-party databases like Oracle or Snowflake. It provides a data catalog and a rich library of out-of-the-box data transformations (such as filters and joins) for modeling ETL pipelines in Glue Studio. Developers can also choose to code a data pipeline in either Scala or Python.

Note that for those who want to use Apache Spark with MarkLogic but are not using the AWS Glue service, we have also released a MarkLogic Connector for Apache Spark.

Using AWS Glue with MarkLogic

MarkLogic customers can now use AWS Glue to implement Spark ETL pipelines for fast data ingestion and data export.

High-performance Data Ingestion

The MarkLogic Connector for AWS Glue makes it simple to bulk load or stream relational and non-relational data as is into MarkLogic. Additionally, it provides the flexibility of using Glue’s data transformation capabilities to combine and transform tabular data from multiple sources into hierarchical data formats like JSON before loading into MarkLogic.
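To make the tabular-to-hierarchical step concrete, here is a minimal sketch in plain Python. It is not Glue or connector code: in a real pipeline this reshaping would be done by Glue's own transformations on Spark. The table names, column names, and the `nest_orders` helper are all hypothetical and serve only to show the shape of the result, one JSON document per customer with its orders nested inside.

```python
import json
from collections import defaultdict


def nest_orders(customers, orders):
    """Join two tabular sources on customer_id and nest orders under each customer.

    Both inputs are lists of flat dicts (rows). The column names are
    hypothetical and chosen only for this illustration.
    """
    # Group order rows by their foreign key, dropping the key from the nested record.
    orders_by_customer = defaultdict(list)
    for order in orders:
        nested = {k: v for k, v in order.items() if k != "customer_id"}
        orders_by_customer[order["customer_id"]].append(nested)

    # Build one hierarchical document per customer row.
    documents = []
    for customer in customers:
        doc = dict(customer)
        doc["orders"] = orders_by_customer.get(customer["customer_id"], [])
        documents.append(doc)
    return documents


# Two flat "tables", as they might arrive from separate relational sources.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 40.0},
]

documents = nest_orders(customers, orders)
for doc in documents:
    # Each line is one self-contained JSON document, ready to be loaded as is.
    print(json.dumps(doc))
```

Each printed line is a self-contained document, which is the kind of hierarchical record a document database can store without further flattening.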

As an example, users can easily use the new Glue connector to build a batch or a change data capture pipeline to load complex data (or source entities) into MarkLogic Data Hub Service. Once loaded, Data Hub Service has the necessary capabilities to integrate source data into durable data assets for later use in operational and analytical applications.

Secure Data Sharing

The MarkLogic Connector for AWS Glue also makes it easy to consume data from MarkLogic while preserving its security and governance controls. Users can build scalable data pipelines for complex analytical processing with Spark libraries (such as machine learning and SQL) on clean, curated, and governed data in MarkLogic. Users can also leverage MarkLogic's multi-model querying capabilities to securely share fit-for-purpose data with AWS services like SageMaker, Redshift, and S3, and with third-party data stores like Snowflake.

Getting Started

To use the MarkLogic Connector for AWS Glue, subscribe to the connector on the AWS Marketplace. Once you have subscribed, the MarkLogic connector appears in AWS Glue Studio, where you can graphically build data pipelines.

To get started, follow along with the hands-on, step-by-step tutorial. To learn more about configuring the MarkLogic Connector for AWS Glue, see the connector documentation. The AWS Glue documentation is also available.
