Hi, fellow future and current Data Leaders; Ben here 👋
Before we dive in, a quick plug for Retool. Their new AI-assisted dev capability lets you build internal tools on top of your live data—without wrangling a prototype or settling for a static BI dashboard. Describe what you need, and Retool creates a secure app you can fully customize. Perfect for building interactive dashboards and admin tools connected to your warehouse, dbt, or orchestration metadata.
Now let’s jump into the article!
It’s the early 1950s, and an American journalist named Edward Hunter begins publishing a series of reports (and eventually books) about U.S. prisoners of war in Korea. He claimed they were being “brainwashed”.
The term caught fire. It appeared in headlines, films, and congressional hearings. It’s important to note that Hunter wasn’t merely a journalist; he was a seasoned propaganda expert who had served in the OSS, the predecessor to the CIA.
The U.S. military seized on the idea of “brainwashing” to discredit confessions made by American POWs, including statements admitting to biological warfare. Some sources go further and claim that the word “brainwashing” was injected into the English language solely to give the military a defense against those statements.
Now, why am I writing about this?
The Seattle Data Guy who now lives in Denver?
Because many of the words we use in our day-to-day as data engineers and analysts were only created in the past few decades, and yet many of them dictate our choices on tooling, on design, and the work we take on. So I wanted to dig into how the words that we use play that role.
Words Shape the Way We Work
I used brainwashing as an example because it’s a little dramatic, and it shows how language can shape how we see the world, how it can excuse behavior or choices because we now have a word to define it. But not every term is born from manipulation. Sometimes, we’re just trying to name something new we’ve noticed, like a pattern or behavior.
And when you do it well, when you manage to wrap something complex into a simple phrase, the word will spread. It creates a shared language. It makes it easier for others to build on your thinking and to bring structure to what was previously vague.
So I don’t view this as inherently bad or negative. But it can limit our perspective on the world. These words and terms are also often self-replicating, passed from one company or organization to another long after their meaning has drifted.
It All Started With The Data Warehouse
It’s hard not to start this dive into terms that have shaped the data world without mentioning data warehousing. It remains a central part of our work today.
It’s why you’re stuck on the migration project from SQL Server to Snowflake.
It’s why everything seems to revolve around lakes and lake houses. It’s all merely an evolution or variation of that term. There might be a better approach, but why question it?
Speaking of the Lakehouse.
I’d be remiss not to reference the Data Lakehouse. I think you could argue that this one concept alone (and Databricks putting everything they could behind it) willed Databricks into the spotlight as more than just a managed Spark service, even though that label has been difficult to shake to this day.
It’s funny because Databricks wasn’t even the first company to use the term. But Databricks did everything they could to legitimize it further. The first article on the topic appeared around January 2020, and then somewhere in early 2021, the idea took off (see Google Trends below).
Again, I do think this was capturing an idea that had already been floating around. It seemed like a natural combination of Data Warehouse and Data Lake, which is why I believe the idea was so potent.
We were tired of having to build both.
Snowflake was likely very aware that this term gave Databricks a new mental road in people’s minds. This was an opportunity for Databricks to start taking workloads from Snowflake. Thus, much of Snowflake’s marketing opted for the term “Data Cloud”.
In fact, I recall a video I recorded with a Snowflake employee that was never aired. I mentioned Snowflake as a data warehouse, but the employee insisted I re-record it and never let that video see the light of day.
Personally, I think Data Cloud lacks a certain level of graspability. It really is meaningless.
If you’re creating a term, it needs to capture your audience effectively. If it’s so broad that it could literally mean anything (as I believe data cloud does), it probably won’t catch on.
Death Of Terms
Some words and concepts last for decades.
Others barely make it a decade.
When I first started in the data world, various vendors were pitching schema-on-read, and not too long after that, the term Modern Data Stack came around.
Perhaps one of these terms was created with limited longevity in mind. I say that mostly in jest. I think most companies have no idea what terminology will stick. Maybe that’s why Databricks keeps trying to invent new terms.
But as with some viruses, a population might slowly become inoculated to an idea. Perhaps due to overexposure or due to the vaccine of reality, people start to question it, break it down, and push it aside. Once enough of a population has rejected an idea, it slowly ceases to exist.
That’s what seems to have happened to both schema-on-read and now Modern Data Stack. Well, for many, the idea of the MDS passed away a while back.
Final Thoughts
I write this article both for myself and for everyone reading. The terms we use in the data world are useful and can help us have a shared language.
They can also trap us in our current world, where we limit our ability to see beyond the walls of our own minds.
Those ideas can be hijacked by vendors to sell their products.
They can also play naturally in a space that was always looking to be better defined.
We’ve had so many terms floated in the past few years that it can feel overwhelming.
And I bet that we’ll have some new terms that redefine an all-in-one data stack coming out real soon.
If you’re just getting started in your data career, I owe you an article with a longer breakdown of all the terms and their timeline, almost like a vaccine to ideas.
As always, thanks for reading.
Articles Worth Reading
There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
Why is everything so scalable?
I’m entirely convinced that basically every developer alive today heard the adage “dress for the job you want, not the job you have” and figured that, since they always wear jeans and a t-shirt anyway, they might as well apply it to their systems’ architecture. This explains why the stack of every single company I’ve seen is invariably AWS/GCP with at least thirty microservices (how else will you keep the code tidy?), a distributed datastore that charges per query but whose reads depend on how long it’s been since the last write, a convoluted orchestrator to make sure that you never know which actual computer your code runs on, autoscaling so random midnight breakages ensure you don’t get too complacent with your sleep schedule, and exactly two customers (well, potential customers).
If You’re New to Data, Read This Before You Build Anything
When you’re just getting started in data, everything feels exciting, and everything sounds like a good idea.
“Oh, this process takes ten minutes? I’ll automate it with VBA!”
Fast-forward four weeks, and you’re trying to meet the finance team’s expectations, reconciling numbers, and banging your head against a table, wondering why you ever volunteered.
New paradigms show up with shiny names and polished diagrams. They sound smart. You’ve got nothing to compare them to, so you try them. After all, everyone else seems to be.
That’s what this article is about: the things I wish I understood earlier in my data career. It started because there were so many buzzwords and new products popping up in the past few weeks, it might not be clear what’s actually going on if you’re new to the data space.
Whether you’re early in your data career or just want a sanity check, here’s a breakdown of the stuff that actually matters (and the fluff that doesn’t).
Let’s dive in.
End Of Day 198
Thanks for checking out our community. We put out 4-5 Newsletters a month discussing data, tech, and start-ups.
If you enjoyed it, consider liking, sharing and helping this newsletter grow.



Great insight.
You’re searching for a unifying term for these evolving layers we’re building upon? I use the term "data platform". It doesn't even feel like we need to add "cloud" or not. It captures the full scope of infrastructure, orchestration, governance, quality, and engineering. The phrase also reflects how vendor offerings have expanded beyond their original focus into more integrated, end-to-end ecosystems.
Great article and wonderful step back on a concept - word meaning - that hovers in the background but has a huge impact.
Can you suggest any materials - articles, docs, books - on the GDI workflow/process? I’m searching for comprehensive evaluations of the GDI workflow.
Thx