From Boom to Bundle: The Great Consolidation of Data Tools
Hi, fellow future and current Data Leaders; Ben here.
Today, we will dig into where the data world is going in 2025. The consolidation that many of us have been discussing for the past year or so is finally kicking off!
Before we dive into the article, I wanted to share a bit about Estuary, an ETL/ELT platform I've used to help make clients' data workflows easier. Estuary helps teams easily move data in real-time or on a schedule, from databases and SaaS apps to data lakes and warehouses, empowering data leaders to focus on strategy and impact rather than getting bogged down by infrastructure challenges. They've been a great partner to work with, and I have also really enjoyed advising them. If you want to simplify your data workflows, check them out today.
With that out of the way, let's jump into it!
Intro
One of the defining challenges of working in the data industry is the sheer volume of tools and technologies available. Over the past decade, as interest in big data and data science surged, investment in data solutions followed. We went from venture capitalists (VCs) backing companies like Cloudera and Hortonworks to modern tools like dbt and Fivetran.
This rapid expansion has made the ecosystem more crowded every year. A great example is the MAD (Machine Learning, Artificial Intelligence, and Data) Landscape that Matt Turck and his team put out annually. Each iteration of this landscape is a testament to the growth and complexity of the space, where identifying a single logo among hundreds becomes difficult.
Lately, though, I've been hearing more and more people in the data space talk about consolidation. There's only so much room in the market: a limited number of customers, talent, and demand for these tools. And with the end of zero-interest rate policies (ZIRP), it's getting harder to secure funding for future rounds.
On top of that, the new U.S. administration seems more open to mergers and acquisitions (M&A), which means we could see a wave of consolidation in the next couple of years.
I wouldn't be surprised if companies like Meta and Amazon started buying up some GenAI companies. It's already happening in the startup world: just look at the recent acquisitions of SDF Labs, Upsolver, Rivery, and Quary. If this is any indication of what's to come, it looks like we're off to a strong start this year.
In this article, we'll discuss these acquisitions, what they mean for the industry, and what other experts are saying about this growing trend.
What's the Impact of Consolidation?
When companies merge, it's easy to see both the good and bad outcomes. After all, some acquisitions are major wins, like Instagram, while others fade into obscurity, much like Mint.
To dig into this, I wanted to share some insights from a data leader who preferred to stay anonymous. Once you hear their perspective, you'll understand why.
“I have mixed emotions on the latest acquisitions. Tool consolidation makes sense, and it's not unusual for a startup to get acquired by a large company. It's also much better than startups going out of business. So, I am happy for friends who work at these companies.
As a customer, I'm cautious. Looker is a great example of an acquisition that didn't work out so well for some customers. If you're not on BigQuery, you're stuck with no new features and very poor support. And when you ask for something, they aggressively push you to move to BigQuery.
For Upsolver <> Qlik, the first thought I had was "oh no." Qlik has never been on my list of tools to evaluate. And from what I've heard from others, it's expensive and not all that great. They also already have Talend, and I believe they bought an EL company a while back. Will Upsolver thrive there? I am not so sure.”
Looker's mention is an important one. It's a tool that used to feel like a real competitor to Tableau. But now? Anecdotally, it seems to have lost some of its shine.
With all that in mind, let's dive into these recent acquisitions and break down what they might mean for the data space and its tools.
Cementing dbt vs. SQLMesh
It's starting to feel like the data transform space is boiling down to a showdown between dbt and SQLMesh, at least for tools that focus on the “T” in ELT/ETL.
Sure, there are still plenty of ETL tools out there, like Informatica and Matillion, that developers can choose from. But when it comes to SQL-first options, you're likely picking between dbt and SQLMesh.
And congrats to SQLMesh for making waves: apparently, they've ruffled enough feathers to get banned from dbt's conference!
To get some perspective, I reached out to several people who spend a lot of time in the data space for their initial thoughts on dbt's purchase of SDF Labs.
“This is a good move by dbt. I don't know much about SDF, but it understands SQL and is much faster, which is something dbt lacks today. I suppose dbt wanted it to do more but didn't have the foundation to make it happen, hence the acquisition. I wonder why SDF folks didn't want to remain as competitors.
One interesting comment I saw yesterday was that some of the new, awesome features being integrated into dbt might end up behind a paywall. That's a valid concern. I like how SQLMesh offers most of its features as open source, and I think (or hope) that dbt will follow that approach moving forward.
In short, I'm excited to see what dbt will bring to the table in the coming years and how SQLMesh will compete with them.” - Yuki Kakegawa - Independent Data Consultant at Orem Data
What's missing from dbt today?
Catching syntax and data type errors before sending the query to the warehouse (we're spending compute dollars just to find out a query doesn't actually run).
The ability to run and test models locally.
Linting and formatting. Usually, SQLFluff is tacked on to dbt, but its performance is very slow, causing longer CI runs.
Column and Table name auto-complete in IDEs outside of dbt Cloud.
Faster compile times on large projects.
Where does dbt outshine SDF?
The ability to self-host or share dbt Docs: a web UI to navigate through lineage and documentation.
Out-of-the-box unit tests.
Type 2 Slowly Changing Dimensions (dbt Snapshots).
Community
Jeff Skoldberg - Principal Consultant, Data Architecture and Analytics Green Mountain Data Solutions, LLC
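To make the first item on Jeff's "missing" list concrete, here is a minimal sketch of what catching syntax and column errors locally, before spending warehouse compute, can look like. This is not how SDF or dbt work under the hood. It assumes Python's built-in SQLite as a stand-in local SQL engine, and the `check_sql` helper and the `orders` table are hypothetical names made up for illustration. SQLite's dialect also differs from Snowflake's or BigQuery's, so treat it as an illustration of the idea only.

```python
import sqlite3
from typing import Optional


def check_sql(query: str, schema_ddl: str = "") -> Optional[str]:
    """Compile a query locally; return None if it compiles, else the error text.

    Hypothetical helper: SQLite stands in for a real local SQL compiler.
    """
    conn = sqlite3.connect(":memory:")
    try:
        if schema_ddl:
            # Create empty tables so column and table references can be resolved.
            conn.executescript(schema_ddl)
        # EXPLAIN makes SQLite parse and plan the statement without running it.
        conn.execute(f"EXPLAIN {query}")
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


# Hypothetical schema, declared locally with no data in it.
SCHEMA = "CREATE TABLE orders (id INTEGER, amount REAL);"

for sql in (
    "SELECT id, amount FROM orders",   # compiles
    "SELEC id FROM orders",            # typo in the keyword
    "SELECT missing_col FROM orders",  # column that does not exist
):
    error = check_sql(sql, SCHEMA)
    print("OK  " if error is None else "FAIL", sql, "" if error is None else f"({error})")
```

The design choice worth noting is that the check needs only the schema, not the data, which is why this kind of validation can run on a laptop or in CI at no warehouse cost.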
“Data engineering teams are continuing to consolidate around the use of SQL as their lingua franca, and I think it will be tough for anyone to go against that trend. dbt's acquisition of the SDF team is further proof, as they build out their SQL-centric capabilities for dbt users annoyed with Jinja and Python for templating.” - Ryan Wexler, Principal, SignalFire
One point I'd like to make about dbt's acquisition of SDF Labs is that the choice of a SQL-first transform layer has now consolidated to one of two options: dbt or SQLMesh. Again, there are other options, but both tools focus on SQL and, more importantly, both make it easier to approach a multi-engine data stack. Combined with the recent focus on Iceberg, this is shaping the data world to be less restrictive.
This competition is healthy, and I'm excited to see how each solution evolves for the user in 2025.
Rivery And Upsolver - Everyone Wants Change Data Capture
Before the end of 2024, it was announced that Boomi had acquired Rivery. I genuinely thought, wow, I have only ever seen Boomi once on another consultant's list of tools they use. In all fairness, this is likely because Boomi is positioned more toward iPaaS than analytical data workflows. Also, I had no idea that Dell once owned Boomi, which was purchased by Francisco Partners and TPG Capital only recently, in 2021.
Now, Upsolver has been bought out by Qlik. While both tools share similarities, like having an ingestion component that includes CDC, there are some differences. Rivery has been heavily centered on data integration, transformation, and orchestration, whereas Upsolver had started branching into the managed Iceberg space.
Side note: If you're looking for a tool to ingest data into your data warehouse or lakehouse, let me quickly plug Estuary. It's a great option whether you're loading data into Iceberg, Snowflake, or another platform.
So, what do other data experts think of these acquisitions?
Well, Lindsay Murphy, Director of Data at Hiive, provided her take on Qlik's acquisition.
“The thing I'd say about Qlik is they're a bit like Google: they buy a bunch of companies, and then those products collect dust (e.g., Talend bought Stitch... Qlik bought Talend... Stitch hasn't had many new features in like 10 years).”
She also noted:
“Qlik has made 18 acquisitions so far, and it seems like they average about one a year. These acquisitions in close succession might signal the start of "the great consolidation" of MDS tools people have been talking about for a few years now.”
That said, if Qlik can successfully integrate Upsolver, especially its Iceberg features, it could help Qlik remain competitive. After all, Qlik's been around since 1993, navigating decades in a turbulent industry. Not every acquisition they've made has been a home run, but they've managed to stay in the game, and I know more than one Qlik consultant who would vouch for their staying power. So although I do agree there is a good chance Upsolver could simply become another forgotten acquisition, it makes sense why Qlik purchased them.
On the other hand, Boomi doesn't have the same history of acquisitions... yet!
Prior to Rivery, they had only acquired two other companies. Boomi has also nearly doubled its number of employees in the last five years, and it has been foreshadowing this M&A since last year, when its CEO stated, “We are in absolute predator mode,” referring to the fact that:
Falling valuations precipitated by the shift of venture capital funding to generative AI, he added, “creates incredibly compelling acquisition opportunities. We're going to be doing some M&A in the next couple of months.” - Boomi CEO: “We are in absolute predator mode”
In the end, many of us were discussing the very frothy valuations of most data companies back in the early 2020s, and now there will be a fair number of haircuts to valuations and buying opportunities for companies that have the capital.
Where is the Rest of the Year Going?
As mentioned earlier, I believe we'll finally start to see more startups being acquired over the next year or two. Many of these companies received funding between 2018 and 2022, but since then, it's been a matter of sink or swim. While some startups are doing fine, others would benefit from being acquired or merging to offer a fuller suite of features for their end-users.
Yuki shared another interesting thought on this:
“I think there may be some changes in the BI realm. There are newer tools like Sigma, Evidence, Hashboard, Omni, etc. I could see one acquiring another to supplement a lack of functionality or user base. Perhaps even something like Snowflake acquiring Sigma to compete with Databricks's new BI offerings.”
The BI space has always been challenging. Despite the emergence of new tools, Tableau and Power BI continue to dominate the market, making it tough for others to gain significant traction.
I could also see companies like Fivetran making moves to acquire other data ingestion tools, or even going the route of building out their own all-in-one solution.
Now that I think about it, perhaps Fivetran might look into buying out SQLMesh. After all, they are already using SQLMesh as their transform layer according to the customer list on SQLMesh's website.
This kind of acquisition would not only help Fivetran create a more end-to-end offering but also improve its stickiness in the market. As Snowflake's direct integrations start eating into Fivetran's ingestion market share, expanding their capabilities might be a smart play to stay competitive.
Closing Thoughts
As Lindsay pointed out, these recent acquisitions feel like the early signals of a larger consolidation trend. What I've learned, though, is that these cycles take years to fully play out, not just a single year. Many of us started discussing this back in late 2023 or early 2024, and it's clear we're only at the beginning.
Here's why I believe consolidation is ramping up:
Clients are increasingly asking for bundled solutions that simplify their tech stack.
VCs are no longer looking to fund every data solution under the sun. There are far fewer data infrastructure companies getting $50 million funding rounds without clear traction, though they still happen occasionally.
The political climate around mergers is shifting. Larger companies like Google, Facebook, and Amazon may now have more freedom to acquire companies without the fear of regulatory pushback from leaders like Lina Khan.
That said, consolidation isn't all sunshine and rainbows. While it's exciting to watch these acquisitions unfold, it's also tough to think about the companies that won't make it. Some will run out of funding, and there won't always be a bigger company willing to buy them out. That part of the story is hard not to feel a little down about.
But I don't want to end on too somber a note. I'd love to hear your thoughts: Which companies do you think will merge next? What acquisitions would you like to see?
As always, thanks for reading!
Video Of The Week - Don't Lead A Data Team Before Watching This - 5 Lessons You Need To Know As A Head of Data
Join My Data Engineering And Data Science Discord
If you're looking to talk more about data engineering, data science, breaking into your first job, and finding other like-minded data specialists, then you should join the Seattle Data Guy Discord!
We are now over 8000 members!
Join My Technical Consultants Community
If you're a data consultant or considering becoming one, then you should join the Technical Freelancer Community! We have over 1500 members!
You'll find plenty of free resources to expedite your journey as a technical consultant, and you'll be able to ask other consultants any questions you may have!
Articles Worth Reading
There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
Netflix Introducing Configurable Metaflow
A month ago at QConSF, we showcased how Netflix utilizes Metaflow to power a diverse set of ML and AI use cases, managing thousands of unique Metaflow flows. This followed a previous blog on the same topic. Many of these projects are under constant development by dedicated teams with their own business goals and development best practices, such as the system that supports our content decision makers, or the system that ranks which language subtitles are most valuable for a specific piece of content.
Holy Grails of Data: Self-Service, Single Truths, and the Role of AI
For decades, data teams have been chasing multiple ever-elusive Holy Grails. These end goals get talked about heavily, but it feels as if very few data teams have indeed seen them.
And if they have, it was always at some other job.
The data world has many of them. Some are real, some aren't, and many are used by marketing and sales teams to sell the dream. In fact, when I first came into the data world, Tableau was selling self-service analytics hard. Of course, being new to the world, it makes sense; if end-users can access their data, they'll ask you fewer questions, right?
Well... sort of.
End Of Day 159
Thanks for checking out our community. We put out 3-4 Newsletters a week discussing data, tech, and start-ups.






