Discover more from SeattleDataGuy’s Newsletter
Looking Past Data Infrastructure - How To Deliver Value With Data
A Conversation With Gordon Wong - Ex VP Of Business Intelligence At HubSpot
Today, I wanted to learn from people who have led and managed data teams, whether in BI or data engineering. Some data teams are likely starting to feel the pressure to deliver actual ROI.
I talked with Gordon Wong, who has worked as a VP and director of business intelligence and data engineering at companies like HubSpot and Fitbit. In both cases, he led successful initiatives to design and deploy data warehouses and to deliver actual ROI with a data team. Of course, he has done far more than just lead data teams.
With a background as a database developer and in GIS, he has led data teams and delivered technical solutions throughout his career.
I recently met Gordon while partnering with him on a project, and he clearly demonstrated his depth of knowledge on both the technical and the business side. In particular, what stood out was his focus on helping the business better understand its own pains and how it could drive ROI with data.
Some companies are looking at their data projects more carefully and asking the question, “what is the value?” I wanted to get Gordon’s perspective. After running several data engineering and BI organizations, how does he assess value in data projects?
In this issue, we cover:
Philosophy on Delivering Value
Delivering vs. Building
Driving ROI Despite Department Incentives
Helping Your Clients Be Good Customers
How To Set-up Teams for High ROI
I really enjoyed this discussion with Gordon, and I hope you will too. So let’s dive in!
Philosophy on Delivering Value
First, to understand some of the examples below, I asked Gordon to provide some insights into his philosophy of delivering value.
Here are the points he provided.
Know who your customers are, what they care about, and how well you are helping them now.
Start all work from the end. Work backward from the business problem.
Work in small, production-strength iterations. Everything you release should fill a need and be safe to use.
Delight your stakeholders in the definition of “Done Done.”
Build teams grounded in psychological safety but driven by empathy.
All this being said, here are a few examples of how Gordon has delivered value in his past experiences.
Delivering vs. Building
During the early 2010s, Hadoop was all the rage. I personally recall coming into the data world during this time, and it was all everyone was talking about. Every conference you went to was either sponsored or run by Hortonworks or Cloudera.
But Hadoop was far from a perfect solution.
So when Gordon referenced his experience at one company, it truly resonated. During our discussion, he outlined a project in which two teams picked different data infrastructures to build their data processes.
One team picked Hadoop, while another picked Snowflake.
One team of ten engineers spent several years setting up the hardware, software, MapReduce jobs, and other workflows to run Hadoop. Along the way, they had to retrofit security, data quality testing, PII management, and operational management.
The other team spent about a year, with half as many engineers, spinning up Snowflake and the required data processes. They worked in SQL, a well-understood language, and used out-of-the-box features.
In the end, the Hadoop team spent nearly $3 million before delivering a dollar of value, while the Snowflake team spent about $1 million. This was before salaries!
The build vs. buy discussion will always continue to rage on. But there is a far more important discussion about actually delivering. There are plenty of teams on both sides of the build vs. buy discussion that never actually deliver.
So regardless of which side of the aisle you’re on, data teams are expected to actually deliver value.
Since today is Valentine’s Day, I wanted to give my readers a Valentine’s Day discount!
Driving ROI Despite Department Incentives
Another example I feel would resonate with many readers is the fight we often have with the business. It’s strange that we are in 2023 and there are still so many companies with disjointed IT and business teams.
The marketing teams have their own incentives, as do the finance and operations teams, and this leads to them sometimes fighting the results that data teams bring forward.
In another example, Gordon described a company where the marketing team judged its performance by the number of letters and emails it sent out.
I assume most readers can see the issue here. This metric is based on input and not some form of business output. This also cost the company hundreds of thousands of dollars because they were sending letters to people on their “do not send list” (along with other implications).
As a quick callout, Gordon said that one of the lessons he learns over and over again is the importance of:
Finding the real metric - Gordon Wong
In this case, by shifting the discussion to what the real metric should be, he saved the company several hundred thousand dollars just by turning off things that weren’t working.
He also added to this point by saying he is pretty sure most companies could benefit from taking a moment and turning off campaigns that just aren’t working.
It’s easy for many individuals to focus on or be incentivized by the wrong metric. A quick personal example is content creation. It can be tempting to look at your follower and subscriber counts and assume that’s what matters. But that is often the wrong metric. Instead, it’s often some ratio of engagement to followers that actually matters.
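To make the content creation example concrete, here is a minimal sketch in Python of how ranking by raw follower count and ranking by an engagement-to-follower ratio can disagree. The account names, the numbers, and the definition of "engagements" are all hypothetical and only there for illustration. They are not figures from Gordon or from my own channels.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    followers: int
    # Hypothetical definition: likes + comments + shares over some fixed period.
    engagements: int


def engagement_rate(account: Account) -> float:
    """Engagements per follower; 0.0 when the account has no followers."""
    if account.followers == 0:
        return 0.0
    return account.engagements / account.followers


# Made-up numbers purely for illustration.
accounts = [
    Account("big_but_quiet", followers=50_000, engagements=500),
    Account("small_but_active", followers=2_000, engagements=300),
]

# The "vanity" view: whoever has the most followers wins.
by_followers = max(accounts, key=lambda a: a.followers)

# The ratio view: whoever gets the most engagement per follower wins.
by_engagement = max(accounts, key=engagement_rate)

print("Top by followers:      ", by_followers.name)
print("Top by engagement rate:", by_engagement.name)
```

With these made-up numbers, the larger account wins on followers while the smaller one wins on engagement rate, which is the kind of disagreement that should prompt a conversation about which metric really reflects the outcome you care about.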
The wrong incentives and focusing on the wrong metrics will cost companies money. They’ll invest in the wrong campaigns and generally make poor decisions.
In the end, part of our job as data folks isn’t just to do the technical work, but to help guide our stakeholders.
Helping Your Clients Be Good Customers
Being empathetic and understanding where clients and partners are coming from is a necessary standard to set. It’s easy as an engineer or a technical individual to believe that everyone wants to be data-driven at all times.
Or that everyone views improving infrastructure as a goal.
But that’s not the case. True empathy requires really understanding what a person’s goals are, what their concerns are, and where they don’t feel safe. If you go to another business leader and outline a plan to manage their team activities, measure their performance, and help them improve their department, it may come off as more of an attack on their leadership style than a helping hand. Make sure your customers know you are there to help before you start tearing things up.
I want to thank Gordon so much for talking to me and sharing his experiences. He has delivered solutions both personally and as a leader on multiple teams.
He also has a very clear mission when it comes to working with clients. In particular, his focus is, “what is your pain,” not just from a technical side, which involves caring about performance and optimization, but also from a business side.
He clearly wants to ensure that business executives and stakeholders dig deeper into their why.
Why, as a business, do you even want to be data-driven?
What pains are they actually trying to solve, and is some significant data infrastructure investment even required?
Why even invest in data to begin with?
These are all valid questions that I know not every business team answers. Sometimes they just happened to have read an article, perhaps like mine, that over-excited them about what they could do with data instead of aligning their data and business strategy.
I hope you found these insights from an experienced data practitioner helpful. Feel free to comment or ask questions below!
Join My Data Engineering And Data Science Discord
Recently my YouTube channel went from 1.8k to 55k and my email newsletter has grown from 2k to well over 36k.
Hopefully we can see even more growth this year. But, until then, I have finally put together a Discord server.
I want to see what people end up using this server for. How it is used will in turn play a role in what channels, categories, and support are created in the future.
Articles Worth Reading
There are 20,000 new articles posted on Medium daily, and that’s just Medium! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
Scaling Media Machine Learning at Netflix
In 2007, Netflix started offering streaming alongside its DVD shipping services. As the catalog grew and users adopted streaming, so did the opportunities for creating and improving our recommendations. With a catalog spanning thousands of shows and a diverse member base spanning millions of accounts, recommending the right show to our members is crucial.
Why should members care about any particular show that we recommend? Trailers and artworks provide a glimpse of what to expect in that show. We have been leveraging machine learning (ML) models to personalize artwork and to help our creatives create promotional content efficiently.
Data Types on Delta Lake.
What I want to find out is very straightforward, how much do data types really matter? I mean if it doesn’t make a big impact on performance or data size usage … it’s just good to know. If we are working on a multitude of Delta Lakes, it seems like an understanding of what sort of impact having the wrong data type makes could lead to some different things. Most all Delta Lakes are used with Apache Spark, usually on Databricks.
How close attention should we pay to
Does it make a Delta Table much larger in size (increase storage costs)?
When using Delta Lake + Spark, is
End Of Day 71
Thanks for checking out our community. We put out 3-4 Newsletters a week discussing data, tech, and start-ups.