One of the defining challenges of working in the data industry is the sheer volume of tools and technologies available. Over the past decade, as interest in big data and data science surged, investment in data solutions followed. We went from venture capitalists (VCs) backing companies like Cloudera and Hortonworks to VCs backing modern tools like dbt and Fivetran.
This rapid expansion has made the ecosystem more crowded every year. A great example is the MAD (Machine Learning, Artificial Intelligence, and Data) Landscape that Matt Turck and his team put out annually. Each iteration of this landscape is a testament to the growth and complexity of the space, where identifying a single logo among hundreds becomes difficult.
Lately, though, I’ve been hearing more and more people in the data space talk about consolidation. There’s only so much room in the market: a limited number of customers, a limited talent pool, and limited demand for these tools. And with the end of zero interest rate policy (ZIRP), it’s getting harder to secure funding for future rounds.
On top of that, the new U.S. administration seems more open to mergers and acquisitions (M&A), which means we could see a wave of consolidation in the next couple of years.
I wouldn’t be surprised if companies like Meta and Amazon started buying up some GenAI companies. It’s already happening in the startup world—just look at the recent acquisitions of SDF Labs, Upsolver, Rivery, and Quary. If this is any indication of what’s to come, it looks like we’re off to a strong start this year.
In this article, we’ll discuss these acquisitions, what they mean for the industry, and what other experts are saying about this growing trend.
What’s the Impact of Consolidation?
When companies merge, it’s easy to see both the good and bad outcomes. After all, some acquisitions are major wins—like Instagram—while others fade into obscurity, much like Mint.
To dig into this, I wanted to share some insights from a data leader who preferred to stay anonymous. Once you hear their perspective, you’ll understand why.
“I have mixed emotions on the latest acquisitions. Tool consolidation makes sense, and it’s not unusual for a startup to get acquired by a large company. It's also much better than startups going out of business. So, I am happy for friends who work at these companies.
As a customer, I’m cautious. Looker is a great example of an acquisition that didn't work out so well for some customers. If you’re not on BigQuery, you’re stuck with no new features and very poor support. And when you ask for something, they aggressively push you to move to BigQuery.
For Upsolver <> Qlik, the first thought I had was "oh no." Qlik has never been on my list of tools to evaluate. And from what I’ve heard from others, it's expensive and not all that great. They also already have Talend, and I believe they bought an EL company a while back. Will Upsolver thrive there? I am not so sure.”
Looker’s mention is an important one—it’s a tool that used to feel like a real competitor to Tableau. But now? Anecdotally, it seems to have lost some of its shine.
With all that in mind, let's dive into these recent acquisitions and break down what they might mean for the data space and its tools.
Cementing dbt vs. SQLMesh
It’s starting to feel like the data transformation space is boiling down to a showdown between dbt and SQLMesh, at least for tools that focus on the “T” in ELT/ETL.
Sure, there are still plenty of ETL tools out there, like Informatica and Matillion, that developers can choose from. But when it comes to SQL-first options, you’re likely picking between dbt and SQLMesh.
And congrats to SQLMesh for making waves—apparently, they’ve ruffled enough feathers to get banned from dbt’s conference!
To get some perspective, I reached out to several individuals who spend a lot of time in the data space for their initial thoughts on dbt’s purchase of SDF Labs.
“This is a good move by dbt. I don’t know much about SDF, but it understands SQL and is much faster, which is something dbt lacks today. I suppose dbt wanted it to do more but didn't have the foundation to make it happen, hence the acquisition. I wonder why SDF folks didn't want to remain as competitors.
One interesting comment I saw yesterday was that some of the new, awesome features being integrated into dbt might end up behind a paywall. That’s a valid concern. I like how SQLMesh offers most of its features as open source, and I think—or hope—that dbt will follow that approach moving forward.
In short, I'm excited to see what dbt will bring to the table in the coming years and how SQLMesh will compete with them.” - Yuki Kakegawa - Independent Data Consultant at Orem Data
What's missing from dbt today?
Catching syntax and data type errors before sending the query to the warehouse (we're spending compute dollars just to find out a query doesn't actually run).
The ability to run and test models locally.
Linting and formatting. Usually, SQLFluff is tacked on to dbt, but its performance is very slow, causing longer CI runs.
Column and Table name auto-complete in IDEs outside of dbt Cloud.
Faster compile times on large projects.
Where does dbt outshine SDF?
The ability to self-host or share dbt Docs: a web UI to navigate through lineage and documentation.
Out-of-the-box unit tests.
Type 2 Slowly Changing Dimensions (dbt Snapshots).
Community
- Jeff Skoldberg - Principal Consultant, Data Architecture and Analytics at Green Mountain Data Solutions, LLC
“Data engineering teams are continuing to consolidate around the use of SQL as their lingua franca, and I think it will be tough for anyone to go against that trend. dbt's acquisition of the SDF team is further proof, as they build out their SQL-centric capabilities for dbt users annoyed with Jinja and Python for templating.” - Ryan Wexler - Principal at SignalFire
One point I’d like to make about dbt’s acquisition of SDF Labs is that the choice of a SQL-first transform layer has now effectively narrowed to two options: dbt or SQLMesh. Again, there are other options, but both tools are SQL-focused and, more importantly, both make it easier to run a multi-engine data stack. Combined with the recent focus on Iceberg, that is making the data world less restrictive.
This competition is healthy, and I’m excited to see how each solution evolves for the user in 2025.
Rivery And Upsolver - Everyone Wants Change Data Capture
Before the end of 2024, it was announced that Boomi had acquired Rivery. My honest first reaction was surprise: I have only ever seen Boomi once on another consultant's list of tools they use. In all fairness, this is likely because Boomi is positioned more toward iPaaS than analytical data workflows. Also, I had no idea that Dell once owned Boomi, or that Francisco Partners and TPG Capital purchased it as recently as 2021.
Now, Upsolver has been bought out by Qlik. While both tools share similarities—like having an ingestion component, including CDC—there are some differences. Rivery has been heavily centered on data integration, transformation, and orchestration, whereas Upsolver had started branching into the managed Iceberg space.
Side note: If you’re looking for a tool to ingest data into your data warehouse or lakehouse, let me quickly plug Estuary—it’s a great option whether you’re loading data into Iceberg, Snowflake, or another platform.
So, what do other data experts think of these acquisitions?
Well, Lindsay Murphy, Director of Data at Hiive, provided her take on Qlik's acquisition.
“The thing I'd say about Qlik is they’re a bit like Google—they buy a bunch of companies, and then those products collect dust (e.g., Talend bought Stitch… Qlik bought Talend… Stitch hasn't had many new features in like 10 years).”
She also noted:
“Qlik has made 18 acquisitions so far, and it seems like they average about one a year. These acquisitions in close succession might signal the start of "the great consolidation" of MDS tools people have been talking about for a few years now.”
That said, if Qlik can successfully integrate Upsolver—especially with its Iceberg features—it could help Qlik remain competitive. After all, Qlik's been around since 1993, navigating decades in a turbulent industry. Not every acquisition they’ve made has been a home run, but they’ve managed to stay in the game, and I know more than one Qlik consultant who would vouch for their staying power. So although I do agree there is a good chance Upsolver could simply become another forgotten acquisition, it makes sense why Qlik purchased them.
On the other hand, Boomi doesn’t have the same history of acquisitions... yet!
Prior to Rivery, it had acquired only two other companies. It has also nearly doubled in size over the last five years in terms of number of employees, and it has been foreshadowing this M&A since last year, when its CEO stated, “We are in absolute predator mode.” He was referring to the fact that:
Falling valuations precipitated by the shift of venture capital funding to generative AI, he added, “creates incredibly compelling acquisition opportunities. We’re going to be doing some M&A in the next couple of months.” - Boomi CEO: ‘We are in absolute predator mode’
In the end, many of us were discussing the very frothy valuations of most data companies back in the early 2020s. Now there will be quite a few haircuts to those valuations, along with buying opportunities for companies that have the capital.
Where is the Rest of the Year Going?
As mentioned earlier, I believe we’ll finally start to see more startups being acquired over the next year or two. Many of these companies received funding between 2018 and 2022, but since then, it’s been a matter of sink or swim. While some startups are doing fine, others would benefit from being acquired or merging to offer a fuller suite of features for their end-users.
Yuki shared another interesting thought on this:
“I think there may be some changes in the BI realm. There are newer tools like Sigma, Evidence, Hashboard, Omni, etc. I could see one acquiring another to supplement a lack of functionality or user base. Perhaps even something like Snowflake acquiring Sigma to compete with Databricks's new BI offerings.”
The BI space has always been challenging. Despite the emergence of new tools, Tableau and Power BI continue to dominate the market, making it tough for others to gain significant traction.
I could also see companies like Fivetran making moves to acquire other data ingestion tools—or even going the route of building out their own all-in-one solution.
Now that I think about it, perhaps Fivetran might look into buying out SQLMesh. After all, they are already using SQLMesh as their transform layer according to the customer list on SQLMesh’s website.
This kind of acquisition would not only help Fivetran create a more end-to-end offering but also improve its stickiness in the market. As Snowflake's direct integrations start eating into Fivetran’s ingestion market share, expanding their capabilities might be a smart play to stay competitive.
Closing Thoughts
As Lindsay pointed out, these recent acquisitions feel like the early signals of a larger consolidation trend. What I’ve learned, though, is that these cycles take years to fully play out—not just a single year. Many of us started discussing this back in late 2023 or early 2024, and it’s clear we’re only at the beginning.
Here’s why I believe consolidation is ramping up:
Clients are increasingly asking for bundled solutions that simplify their tech stack.
VCs are no longer looking to fund every data solution under the sun. There are far fewer data infrastructure companies getting $50 million funding rounds without clear traction, though they still happen occasionally.
The political climate around mergers is shifting. Larger companies like Google, Facebook, and Amazon may now have more freedom to acquire companies without the fear of regulatory pushback from leaders like Lina Khan.
That said, consolidation isn’t all sunshine and rainbows. While it’s exciting to watch these acquisitions unfold, it’s also tough to think about the companies that won’t make it. Some will run out of funding, and there won’t always be a bigger company willing to buy them out. That part of the story is hard not to feel a little down about.
But I don’t want to end on too somber a note. I’d love to hear your thoughts: Which companies do you think will merge next? What acquisitions would you like to see?
As always, thanks for reading!
Video Of The Week - Don't Lead A Data Team Before Watching This - 5 Lessons You Need To Know As A Head of Data
Join My Data Engineering And Data Science Discord
If you’re looking to talk more about data engineering, data science, or breaking into your first job, and to find other like-minded data specialists, then you should join the Seattle Data Guy Discord!
We are now over 8000 members!
Join My Technical Consultants Community
If you’re a data consultant or considering becoming one, then you should join the Technical Freelancer Community! We have over 1500 members!
You’ll find plenty of free resources to expedite your journey as a technical consultant, and you’ll be able to talk to other consultants about any questions you may have!
Articles Worth Reading
There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
Netflix Introducing Configurable Metaflow
A month ago at QConSF, we showcased how Netflix utilizes Metaflow to power a diverse set of ML and AI use cases, managing thousands of unique Metaflow flows. This followed a previous blog on the same topic. Many of these projects are under constant development by dedicated teams with their own business goals and development best practices, such as the system that supports our content decision makers, or the system that ranks which language subtitles are most valuable for a specific piece of content.
Holy Grails of Data: Self-Service, Single Truths, and the Role of AI
For decades, data teams have been chasing multiple ever-elusive Holy Grails. These end goals get talked about heavily, but it feels as if very few data teams have indeed seen them.
And if they have, it was always at some other job.
The data world has many of them. Some are real, some aren't, and many are used by marketing and sales teams to sell the dream. In fact, when I first came into the data world, Tableau was selling self-service analytics hard. Of course, being new to the world, it makes sense; if end-users can access their data, they'll ask you fewer questions, right?
Well…sort of.
End Of Day 159
Thanks for checking out our community. We put out 3-4 Newsletters a week discussing data, tech, and start-ups.