Many of the questions I get about becoming a data engineer or working in data come down to which technologies someone should learn.
And I’ll tell you what one of my professors told me.
The truth is I could tell you what is popular now, and in 5 years, we could have all moved on (or at the very least abstracted some parts of it away).
Now it goes without saying that if you want to work in data or tech, there is a baseline level of technical skills you’ll need to build (learn a programming language, learn SQL, data modeling, etc.).
But there are also timeless skills, some technical and some not, that have less to do with specific technologies and more to do with effectively solving business problems with the tools you have.
So as you build your solid base of technical skills, consider picking up some of the following timeless skills that will help improve your data career.
Thinking In Systems
“We who cut mere stones must always be envisioning cathedrals.”
― Andrew Hunt, The Pragmatic Programmer
Whether you’re a software engineer or a data engineer, you don’t write code in a vacuum.
After all, it plays a role in a larger system. There is a bigger end goal you’re trying to serve.
It doesn’t matter if you're using drag and drop, frameworks, Python, or SQL.
When you design a system or even just add to an existing one, you have to be able to take a step back and think several orders of impact beyond solving the current problem.
We could take one more step back and discuss the business and its processes, but in this section I am referring purely to the technical system: your data infrastructure.
How do various pipelines interact with each other? If you change some schema somewhere, how does it impact dashboards or ML pipelines?
Did you build the system in such a way that it’s robust, easy to maintain, and easy to change? These questions all play a role in thinking in systems.
Otherwise, if you just approach every individual problem without thinking about the bigger picture, you end up with designs like query-driven modeling.
Note: If you’re looking for a good book on thinking in systems, you should check out Thinking in Systems: A Primer.
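To make the schema-change question above concrete, here is a minimal Python sketch of tracing downstream impact. It assumes you can describe your pipelines as a simple lineage map; the table, dashboard, and model names are hypothetical and only there for illustration.

```python
from collections import deque

# Hypothetical lineage map: each asset points to the assets that read from it.
LINEAGE = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["marts.daily_revenue", "ml.churn_features"],
    "marts.daily_revenue": ["dashboard.revenue"],
    "ml.churn_features": ["ml.churn_model"],
}


def downstream(asset, lineage):
    """Return every asset reachable from `asset`, in breadth-first order."""
    seen = []
    queue = deque(lineage.get(asset, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited through another path
        seen.append(node)
        queue.extend(lineage.get(node, []))
    return seen


# Which assets should we check before changing the schema of staging.orders?
affected = downstream("staging.orders", LINEAGE)
print(affected)
# ['marts.daily_revenue', 'ml.churn_features', 'dashboard.revenue', 'ml.churn_model']
```

In practice this map would probably come from your orchestrator or a lineage tool rather than being written by hand, but the habit is the same: before changing one piece, list everything that depends on it.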
Data Intuition
This perhaps goes without saying, but you do need to build an intuition for data, in particular for how data sets fit together rather than just for individual data points.
I will say I think SQL and relational modeling are the best way to do so (but I know there are plenty of people who disagree). It’s what makes the most sense to me when you’re trying to figure out how data might fit together and where possible issues can arise.
From there, I believe you naturally start to think about the business workflows and how they translate into data. This helps you understand what can move the metrics you create, as well as which business levers you could pull to change them.
In addition, it gives you a sense of where data can go wrong. You might even be able to see, figuratively speaking, where divide-by-zero errors or similar issues could occur, simply by understanding the core data models and business entities that the data you're collecting represents.
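As a small illustration of the divide-by-zero point, here is a hedged Python sketch. The metric (a conversion rate), the field names, and the sample rows are all assumptions made up for the example; the point is only that knowing the business entity (a day can have zero sessions) tells you where the metric breaks.

```python
def conversion_rate(orders, sessions):
    """Return orders / sessions, or None when there were no sessions."""
    if sessions == 0:
        # No traffic that day, so the rate is undefined rather than zero.
        return None
    return orders / sessions


# Hypothetical daily rollup; the second day has no traffic (e.g. an outage).
daily = [
    {"day": "2024-01-01", "orders": 12, "sessions": 300},
    {"day": "2024-01-02", "orders": 0, "sessions": 0},
]

rates = {
    row["day"]: conversion_rate(row["orders"], row["sessions"])
    for row in daily
}
print(rates)
```

Returning None instead of 0 is a design choice: it keeps "no data" distinguishable from "no conversions", which matters when someone later averages the metric.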
Grow Not Only Yourself But The Team
“Great teams build great software.” -
In case it isn’t clear, this article is based on a similar one written by Gregor. One point he covered that I liked was the importance of teamwork and helping others grow.
It’s one thing to be an amazing A-player, but if you can’t help raise the level of the rest of your team, you won’t be as impactful.
Whether it comes from communicating your ideas well or influencing people to make the right decisions, creating an environment where everyone wants to work together for the common good is a valuable skill.
Moreover, helping up-level your team by taking the time to work with them or set standards ensures not only that the business improves its software but also that your co-workers feel challenged and grow.
Constant Curiosity And A Desire To Learn
As I referenced at the beginning, technology is always changing. Now I would avoid getting too distracted by vendors and hype marketing. But I do believe it’s a good idea to spend some of your time learning what is happening in the data and business worlds.
Are there new technologies that could help simplify an implementation?
Is there an older technology you haven’t had time to dig into that you want to learn?
How does your business actually make money and what are the current headwinds?
Etc.
Take some time to make a small side project or dive deep into a solid research paper to understand a topic more granularly.
Get some time with leadership to better understand the business and what makes it function (this might depend on how small your company is or what level of leadership you are at).
Even as you yourself grow and perhaps go down the managerial route, don’t forget to keep your finger on the pulse of technology. That ensures you can talk to both the business and other ICs at the right level, so they know you understand what is going on.
It’s funny because I talked with Tom Rampley on a YouTube live recently, and he mentioned that, as the Head of Data at LastPass, most of what he reads is non-technical. Yet on screen he showed off several well-read, heavily annotated technical books.
Staying curious and constantly learning what is going on in both the technical and business worlds ensures you can make better decisions.
These are just a few skills that you’ll want to get good at, but there are so many more that go beyond learning how to program in another language.
If you’d like to read more, here are a few good articles.
3 Critical Skills You Need to Grow Beyond Senior Levels in Engineering by
How to lead projects from start to finish as a software engineer by
Final Thoughts
Technology is always changing. But being able to implement systems that are robust and maintainable will always be valuable (regardless of the tools) as will bridging the gap between data and business.
Working in data is a lot more than just being able to write code. The sooner you can think about next-level problems, the sooner you’ll grow and the better prepared you’ll be for changes in the tech world.
As always, thanks for reading.
Articles Worth Reading
There are 20,000 new articles posted on Medium daily, and that’s just Medium! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
Memory Efficient Data Streaming To Parquet Files
Apache Parquet, a columnar storage file format, has become a standard for data storage due to its efficient data compression and encoding schemes.
However, streaming data into Parquet files in a memory-efficient manner presents significant challenges, especially in memory-constrained environments, since streaming is usually a record-based operation while Parquet is a columnar format.