Are companies letting down data engineers, data scientists, and software engineers?
It feels like some may be.
Some companies aren’t creating environments where newly graduated technical practitioners can grow. Instead, new hires are often left to their own devices, with poor onboarding and limited documentation on standards and best practices.
This was my experience when I first started working in data. The senior engineer I was working with quit two weeks in, and I was left to figure out how to manage everything they left behind: data warehouses they had half finished, automated reporting they had set up in VBA, and a website they had started to develop to host all the team's dashboards.
Why is this even a problem? Can’t you just watch some YouTube videos or take some Udemy courses to upskill?
These are great ways to get up to speed on technical skills quickly, but there is always a difference between theory and practice. Not having a more experienced engineer or data scientist who can guide you at important crossroads can limit your personal growth.
A lack of mentorship is also a great way to lose new talent, who will seek it out elsewhere, whether on another team or at a different company.
Or they might stick around and, to the best of their abilities, start to put together unmaintainable systems.
As one author put it:
They’d rather let a 23-year-old who knows how to pip install jupyterlab run loose and self-manage, or manage alongside other similarly situated 23-year-olds. Where is the adult in charge? - Goodbye, Data Science
So what should companies do to avoid this issue?
Here are a few ways I believe companies could change their processes to improve both the employee experience and the output of their teams.
Have Clear Documentation And Procedures
When you first come out of college you have a general idea of the technical skills you need. But knowing how to apply them best takes time. If left to your own devices, then you’ll likely make the same mistakes many have made before you.
You’ll create an overly complicated process to deploy a machine learning model or design a set of tables and data pipelines that are prone to failure. It’s not because you mean to, but because there might be small details you may be overthinking or never realized were important.
For example, when I was first exposed to Tableau I pushed out a few dozen dashboards without fully thinking about what they were really for. Maybe one got used, while the rest were unnecessary and a waste of my time. This probably could have been avoided by my asking why, or by having a process that encouraged teams interested in said dashboards to connect them to a business case.
Hire A Senior Engineer
It seems to go without saying that a company should hire a senior technical employee who can help set the tone for a team before hiring several more junior team members. (Of course, there is nuance here: you can’t expect a senior engineer to spend all day mentoring.) Yet time after time I have seen either a senior engineer quit and never be replaced, or a company try to get away with paying less by hiring only junior employees.
This works until it doesn’t. In particular, once said junior employee leaves, it becomes clear how much they were manually making sure the entire system kept functioning, and that they were the only one who knew how it all worked.
More experienced engineers have likely already made many of the mistakes you’re going to want to avoid when developing your future projects. They also likely have a good understanding of how much effort it will take to complete a piece of work and whether or not it's worth it for the team to invest time into said work.
Hopefully, they're also helping establish standards and mentoring their team on how to develop maintainable systems.
Create A Clear Review Process
One of the biggest benefits I have had in my career is going through code and system design reviews, both having my own systems reviewed and reviewing others' designs. These are an opportunity for everyone to gain new perspectives and insights into how their colleagues solve problems.
In addition, when you’re just getting started in your career, you’ll make plenty of judgment calls based on a textbook or course you took, and code reviews are a great place to learn more about the trade-offs of that decision in production.
Remove Ambiguity In Expectations
Too much ambiguity causes a lot of issues in the technical world. Projects fail and stakeholders can end up unhappy if there is too much ambiguity in what is being built. That’s why part of many project management processes involves reducing it.
In the same way, when it comes to expectations, whether around promotions, levels, or how projects are delivered, people need clarity.
Yes, part of working in tech is dealing with ambiguity. But when it comes to expectations and leveling, I do wish more companies were clear on what is expected. I have heard too many stories of individuals whose managers hinted that they would be getting a promotion, only for it never to arrive. Of course, there is no recourse when it was never really clear what behaviors would lead to a promotion.
This can lead to further confusion and frustration. It leads not only to employees quitting but also to some becoming disenchanted with the whole idea of advancing their careers. Many CEOs and investors talk negatively about quiet quitting, even though one of the points being made is the lack of a clear process for getting raises and promotions.
Can We Be Better?
Creating environments where employees of all experience levels can flourish is challenging and time-consuming but I do believe it would benefit most companies. Perhaps that’s because 60% of the projects I have taken on this year have been key person dependency projects where I had to reverse engineer systems with little documentation.
It could also be the 15-20% of projects where I was focused on upskilling more junior employees so they could manage entire data stacks and cloud infrastructure.
Truthfully, it probably all started when I first started working in the data world and was left to attempt to develop systems based on my limited experience in the space.
So perhaps I am biased. But I’d love to hear your thoughts. Can companies do more to ensure employees of all experience levels feel like they can grow?
Jobs
Technical Lead / Architect - Data Sentinel
Data Engineer - Mutt Data (LATAM)
AI/ML - Senior Data Engineer, Data Governance - Apple
Staff Software Engineer - Preemo
Staff Software Engineer - Java - Walmart
Join My Data Engineering And Data Science Discord
Recently my YouTube channel went from 1.8k to 48k and my email newsletter has grown from 2k to well over 28k.
Hopefully we can see even more growth this year. But, until then, I have finally put together a Discord server. Currently, this is mostly a soft opening.
I want to see what people end up using this server for. How it is used will in turn play a role in which channels, categories, and support are created in the future.
Articles Worth Reading
There are 20,000 new articles posted on Medium daily and that’s just Medium! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
Goodbye, Data Science
This is more of a personal post than something intended to be profound. If you are looking for a point, you will not find one here. Frankly I am not even sure who the target audience is for this (probably “data scientists who hate themselves”?).
I had been a data scientist for the past few years, but in 2022, I got a new job as a data engineer, and it’s been pretty good to me so far.
I’m still working alongside “data scientists,” and do a little bit of that myself still, but most of my “data science” work is directing and consulting on others’ work. I’ve been focusing more on implementation of data science (“MLops”) and data engineering.
Beyond prompt engineering
Generative models are transforming how we create. The output of these models is now, in many cases, indistinguishable from human-generated content. Text prompts have emerged as the primary interface to these models. Yet, text prompts are a lossy interface for harnessing the power of generative models, and non-obvious prompt engineering is required to generate high-quality aesthetics. Next-generation models and the applications built on top will have to improve usability by expanding beyond text interfaces.
End Of Day 61
Thanks for checking out our community. We put out 3-4 Newsletters a week discussing data, tech, and start-ups.
Firstly, good work on this article; your points flow in a well-structured way and your words are well written. Based on what you have described, you have clearly been fed to the wolves and put through the gauntlet. Do not let this discourage you or make you think your company is doing this on purpose to torment you as you do reverse engineering (again) on your next project and are left to fend for yourself (again).
There is a data engineering joke of a kind of email you can get in your nightmares, “good day, you are being pulled in today to work on a critical data project, which is in firefighting mode because it was written all so poorly and nothing is loading, and while you’re at it, please document the system as well?”
Jokes aside, what you have gone through and continue to go through are birthing pains (another overused idiom in this industry) that will, in the course of a few more short years, forge in you the strong foundations of a senior data engineer …the one you never had.
In our line of work, no one is owed complete documentation, squeaky clean data, pipelines that don’t have issues, and processes that do what they are supposed to do. Some projects exist to deliver exactly that (there are projects and teams out there in the multi-million-dollar industry of documenting, reverse engineering, untangling, and optimizing spaghetti data platforms; one man’s trash is another man’s treasure indeed!).
But that doesn’t mean it has to be that way, at least not all the time.
That being said, if you are a junior data engineer and are reading this, the best advice I can give is to know what you do not know as much as you know what you do know.
The best junior data engineers I have worked with were not always the smartest ones. They were the ones who tried to set boundaries and expectations, raised their hands to ask for help, and said no. Don’t be afraid to say you can’t, but also be specific about what help you need. Then be eager to step up where you can.
There is zero talent needed for being dependable and showing up.
All the best!! And keep up the great work here. —Alfonso R.
https://www.linkedin.com/in/jose-alfonso-ramirez-a9686aa7
A lot to unpack here. In general, too many companies tend to focus on outcomes and not process, which can contribute to the single-person dependency issue that is so common for businesses of all sizes. As a non-technical founder, I'm 100% reliant on TRUST and HOPE that developers are following best practices. What could go wrong? After learning firsthand what can go wrong, the short answer is: everything. I now contract with a third party to audit developer processes for documenting all the work being done, with the goal that, should I hire a new developer or a new development firm, they should be able to pick up where the previous team left off. It's not cheap, but it does help me sleep better at night. I may be ready to hire my first developer this year, and I'll spend the money to hire a senior one. Thanks Ben!