The Data Stack Show

186: Data Fusion and The Future Of Specialized Databases with Andrew Lamb of InfluxData

This week on The Data Stack Show, Eric and Kostas chat with Andrew Lamb, a Staff Engineer at InfluxData. During the episode, Andrew takes a deep dive into the intricacies of time series databases and the evolution of data systems. He discusses the specialized challenges of managing high-cardinality data and the trade-offs in query performance. The conversation also touches on the development of DataFusion, its adaptation for time series data, and the potential for innovation in the query language space. The episode concludes with a look at the future of data tooling and the possibilities that arise from removing traditional constraints in database architecture. Everyone on the show expresses enthusiasm for the role that projects like DataFusion play in shaping the landscape of data systems. Don’t miss this episode!

Broadcast on:
24 Apr 2024

Highlights from this week’s conversation include:

  • The Evolution of Data Systems (0:47)
  • The Role of Open Source Software (2:39)
  • Challenges of Time Series Data (6:38)
  • Architecting InfluxDB (9:34)
  • High Cardinality Concepts (11:36)
  • Trade-Offs in Time Series Databases (15:35)
  • High Cardinality Data (18:24)
  • Evolution to InfluxDB 3.0 (21:06)
  • Modern Data Stack (23:04)
  • Evolution of Database Systems (29:48)
  • InfluxDB Re-Architecture (33:14)
  • Building an Analytic System with DataFusion (37:33)
  • Challenges of Mapping Time Series Data into Relational Model (44:55)
  • Adoption and Future of DataFusion (46:51)
  • Externalized Joins and Technical Challenges (51:11)
  • Exciting Opportunities in Data Tooling (55:20)
  • Emergence of New Architectures (56:35)
  • Final thoughts and takeaways (57:47)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week, we talk with data engineers, analysts, and data scientists about their experience building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.