Architectures of a Modern Data Platform

13 min read

We live in an era of data. Data is in every organization’s strategy, every engineer’s job description, and every CIO’s dreams (or nightmares). Day in, day out, more data collectors and more data generators are being built. The collectors are observability tools and data platforms, and the generators are users of applications and websites, IoT devices and sensors, and various layers of infrastructure.

This presents a challenge. The data itself offers untold opportunities and insights, but the organizations consuming it struggle to ingest, process, and query it at such volume. Data velocity is a growing problem too, as organizations may not have the platforms or infrastructure in place to handle data at the speed it’s generated or streamed.

While many software solutions exist to address the challenges posed by such quantities of data, organizations adopting them tend to face one of two challenges. One challenge is that the existing data platform uses legacy architectures or technologies, and is unable to scale, patch, and accommodate new requirements effectively. The other chal...

SQL on Kafka with Presto (Video)

Presto is a state-of-the-art distributed SQL query engine for big data, enabling efficient querying on cold data and various data sources. With extended SQL language and features like geospatial queries, joins between different data sources (SQL to join data from HDFS, Elasticsearch, and Kafka anyone...
