Where Is Data Going In 2023 - 3 Trends I Am Seeing In The Data World
2022 is coming to an end and the data world continues to turn. Although the foundations don't change, unstable markets, rising interest rates, and a looming recession are causing constant shifts in what businesses care about.
Some businesses are cutting their data teams in half to reduce costs as a recession nears. Others are looking to increase their reliance on data by improving data quality.
As I am talking with various teams and reviewing the results of the State Of Data Survey, several trends are sticking out.
In this article, I will be discussing those trends as well as outlining some of the initiatives our team will be taking on in 2023.
Big Talk About Data Quality
Data quality has always been important. When I first started writing, it was one of the topics I covered on several occasions. It's also been covered by Zach Wilson, Sarah Krasnik, and Matt Weingarten, to name a few others.
Of course, with the recent focus on data quality and observability, both have become big businesses. Companies like AccelData and LightUp have landed contracts with enterprise companies, and that's not even touching the hundreds of millions of dollars raised for the category in 2021.
Data quality was also the 3rd most referenced problem in the State Of Data Survey that my team is running. At the end of the day, it's the classic data adage: garbage in, garbage out. Companies like Airbnb and Lyft are figuring out that if they want to have complex machine learning models, or even basic analytics, data quality is a must.
The focus on data quality is far from new.
DAMA and other organizations were touting the importance of data quality back when the rest of us were rushing to build the cool new ML models. So now we're playing catch-up, figuring out how our data scientists and analysts can better trust their own data and build production-ready systems and models. In the end, that might be a little harder given the current budget discussions many companies are having.
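For teams just getting started, that trust usually begins with a handful of simple checks on completeness, uniqueness, validity, and freshness. Below is a minimal sketch of what that might look like with pandas; the column names (order_id, customer_id, order_total, created_at) are made up for illustration, and in practice you would likely lean on a framework like Great Expectations or dbt tests instead.

import pandas as pd

def run_basic_quality_checks(df: pd.DataFrame) -> list:
    """Return a list of human-readable data quality failures.
    Assumes hypothetical columns: order_id, customer_id, order_total, created_at."""
    failures = []

    # Completeness: key fields should never be null
    for col in ["order_id", "customer_id", "order_total"]:
        nulls = int(df[col].isna().sum())
        if nulls:
            failures.append(f"{col} has {nulls} null values")

    # Uniqueness: the primary key should not repeat
    dupes = int(df["order_id"].duplicated().sum())
    if dupes:
        failures.append(f"order_id has {dupes} duplicate values")

    # Validity: order totals should never be negative
    negatives = int((df["order_total"] < 0).sum())
    if negatives:
        failures.append(f"order_total has {negatives} negative values")

    # Freshness: at least one row should have landed in the last 24 hours
    latest = pd.to_datetime(df["created_at"]).max()
    if pd.Timestamp.now(tz=latest.tz) - latest > pd.Timedelta(days=1):
        failures.append("no rows created in the last 24 hours")

    return failures

Wiring something like this into the end of each pipeline run, and failing loudly when the list comes back non-empty, goes a long way before you ever buy an observability tool.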
Doing More With Less
Over the past 3-4 weeks, I have talked to several companies and data teams that have let go of more than 50% of their data staff. Anecdotally, this has mostly been at small and medium businesses, in the mid-market, and in the tech space.
If you have heard of or seen similar occurrences on the enterprise level, please feel free to contact me.
Another, perhaps related, trend I'm hearing about from both vendors and Heads of Data is the phrase "doing more with less." In particular, I know of several all-in-one solutions that are going to lean hard on this trend to make the argument for a simpler solution that is fully configured on day one.
However, there is a difference between what is said and what is actually happening.
I believe it's less about doing more with less and more about actually connecting data work with business goals. For the past near-decade, the charge has been to be "data-driven" at any cost. Now companies need to rationalize that spend, and those dollars need to be connected to ROI.
Overall, some of the companies that have let go of their entire data teams will still need to answer data questions.
Further Specialization Of Roles
In terms of data engineering, I expect there to continue to be further specializations of roles. I pointed this out in a previous article where I talked about different types of data engineering, especially at companies that have mature data infrastructure.
This will likely be amplified as new solutions remove the need for repetitive, low value-add tasks that can be automated away, like building the same data connector over and over again. That will force the value individuals drive to become further specialized, whether that means data engineers leaning heavier on the coding side or the analytical side.
If you sit on the programming side, you will likely need to solve specialized issues around streaming and data management. On the flip side, if you are more analytical, you will likely need to be even more aligned with the business so you can create value by driving initiatives that integrate data directly into business strategy.
Seattle Data Guy In 2023
There is a lot going on in the data world, and our team is working to create content, connections, and community to help you keep up to date.
Community - In 2023, I plan to further invest in our efforts to grow the data community. This means more Data Happy Hours, where speakers and practitioners can network and share their experiences.
If you haven't signed up for these events, feel free to sign up here.
Collaborations With Other Authors - I really enjoyed my experience collaborating with other authors, and I would like to do more of that in the future. I am talking with experts in fields such as machine learning and data platforms to have them share their knowledge here. The data world is vast, and there are so many different types of data engineers, analysts, and scientists that I believe it would really benefit the community.
Live All-Day Conference - For those who haven't been able to make it to the live events our team has put together, you can attend our all-day conference. We have lined up several great speakers, including Chad Sanderson, Mei Tao, Sireesha Pulipati, and more. If you're interested in attending, then please sign up here.
State Of Data 2022 - Our team has been surveying data leaders and practitioners from companies of all sizes, and we are quickly coming up against our deadline. We need your help to reach our goal so we can share the results with y'all! Please fill out our data survey so we can better understand where data infrastructure is going.
Where Is 2023 Going For Data Teams?
The past few months have shaken up the data world for many people. Teams are re-prioritizing initiatives and, in some cases, being let go; new start-ups are continually being funded; and more people are trying to connect despite working remotely.
And that's without even bringing up the increased focus on security and governance…
One thing I believe is that the need for data will continue to grow. Yes, we will need to ensure that the usage of data is aligned with the business. Spending time on projects like infrastructure for infrastructure's sake, or deploying an MLOps stack when you're still struggling with basic data management, will likely take a backseat. At least at smaller organizations.
At least, that’s my hope.
What other trends are you noticing? Where are companies investing their time and business strategies?
Take Control of Your Customer Data With RudderStack
Legacy CDPs charge you a premium to keep your data in a black box. RudderStack builds your CDP on top of your data warehouse, giving you a more secure and cost-effective solution. Plus, it gives you more technical controls, so you can fully unlock the power of your customer data.
Jobs
Developer Advocate, Staff or Principal - Confluent
(Sr.) Analytics Engineer - SeatGeek
Staff Software Engineer - Preemo
Join My Data Engineering And Data Science Discord
Recently, my YouTube channel went from 1.8k to 48k, and my email newsletter has grown from 2k to well over 30k.
Hopefully we can see even more growth this year. But, until then, I have finally put together a Discord server. Currently, this is mostly a soft opening.
I want to see what people end up using this server for. How it gets used will, in turn, play a role in what channels, categories, and support are created in the future.
Articles Worth Reading
There are 20,000 new articles posted on Medium daily, and that's just Medium! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
The cloudy layers of modern-day programming
Recently, I've come to the realization that much of what we do in modern software development is not true software engineering. We spend the majority of our days trying to configure OpenSprocket 2.3.1 to work with NeoGidgetPro5, both of which were developed by two different third-party vendors and are available only as proprietary services in FoogleServiceCloud.
Vulnerability Management at Lyft: Enforcing the Cascade - Part 1
Over the past 2 years, we’ve built a comprehensive vulnerability management program at Lyft. This blog post will focus on the systems we’ve built to address OS and OS-package level vulnerabilities in a timely manner across hundreds of services run on Kubernetes. Along the way, we’ll highlight the technical challenges we encountered and how we eliminated most of the work required from other engineers. In this first of two posts, we describe our graph approach to finding where a given vulnerability was introduced — a key building block that enables automation of most of the patch process.
End Of Day 63
Thanks for checking out our community. We put out 3-4 Newsletters a week discussing data, tech, and start-ups.