One of the tips you’ll hear from more experienced engineers, including myself, is that you should build a side project.
Side projects are an excellent way to learn and show off your skills.
And I don’t believe this is bad advice (or I wouldn’t give it).
However, if you’re just following a tutorial and copying the code, you’re likely not learning as much as you would by trying to build your own side project.
Let me explain.
The Problem
Now perhaps I am not the best person to give this message as I am notorious for starting project series and never finishing them.
That’s why I am going to have ThePrimeagen explain it in the short below.
Actually, this is the short that inspired this article.
The learning in projects doesn’t happen in the building itself. Instead, it happens in learning how to:
Troubleshoot
Find the right information
Fail and keep moving forward
Additionally, it provides you with a deeper understanding of the tools you’re using. Oftentimes when you’re troubleshooting, it’s impossible not to gain some greater depth of understanding of the libraries or tools you’re using.
On the other hand, when projects are well-tailored and you just copy the code you see, it’s like flipping to the back of a math textbook and being like… “Oh, that totally makes sense.”
Sure, your brain is now seeing how the problem was solved. But it never really went through the effort of solving it.
Now maybe that works for you, but it 100% does not work for me. I can look at a similar problem and kind of recall the steps but it’s never the same as struggling to go through the steps myself and getting it wrong a few times until I finally get it.
Now, let’s make this worse. In the programming world, someone can build an entire project walk-through, but due to changing libraries or versions, it’ll eventually stop functioning.
In fact, I just had this happen.
An Example With AWS Lambdas
Recently, I put out a basic tutorial on how data engineers could use AWS Lambdas. But what you didn’t see if you were watching the video is that between my creating the initial script and then going through it later, I ran into several issues with the layer.
Side note: I forgot to put the link on how to create the layer in my video, so here it is if you do decide to follow the video.
Now when I attempted to rebuild the Lambda later on, I suddenly ran into an error. The error involved the requests library, and even though I had already created the layer, I realized there was now an issue likely with the version of the requests library.
So although I had gone through the trouble of making the original version work, I still had to re-test, troubleshoot and rebuild that layer.
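One way to start troubleshooting this kind of layer problem is to check which package versions the function can actually see at runtime. Below is a minimal sketch of that idea, not the code from the video. The package list and helper name are hypothetical, and it assumes the layer was built in a way that kept each package's metadata alongside the code.

```python
from importlib import metadata

# Hypothetical list of packages the layer is expected to provide.
LAYER_PACKAGES = ["requests", "urllib3"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The runtime cannot find this package (or its metadata) at all.
            versions[name] = None
    return versions


def lambda_handler(event, context):
    """Report what the runtime can see before doing any real work."""
    versions = installed_versions(LAYER_PACKAGES)
    print(versions)  # print output ends up in the function's CloudWatch logs
    return {"statusCode": 200, "versions": versions}
```

Comparing that output against the versions you built the layer with is often a quick first check before digging into the error message itself.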
But if you were following along, you wouldn’t realize how to troubleshoot the issue; you’d just see it functioning as expected. So what happens if you end up with a different issue?
Would you give up or would you try to figure out how to fix the issue?
A Quick Caveat Before We Talk Solutions
Just in case someone out there thinks I am saying that watching data engineering project videos or tutorials is pointless:
I am not.
As Andrew points out below, often the first phase of learning is just watching.
The risk, I believe, is that you might find yourself in an endless loop of projects and tutorials and never expand beyond that. I’ve talked about a similar issue in the tutorial hell article I wrote a while back.
The key is you need to quickly go from just watching to actually doing and trying to take your skills to the next level.
How To Learn Beyond Project Videos And Articles
Again, videos are a great place to start.
But here is how I think about them.
When I was going through my first programming class and was first learning about arrays, our professor taught us:
What arrays are
How to loop through them
Lots of i’s and j’s
etc
However, it really was just the basics; he never taught us how to apply it much beyond a few homework problems and readings.
Then on the test, I still recall how he asked us to merge and sort two arrays. Now I had never thought about how to even merge two arrays, let alone sort them (I think I only ended up getting like 60-70% of the answer correct), but it forced me to think about something that I had already learned in a new way.
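For anyone curious what that kind of question looks like in code, here is one possible answer, written as a rough sketch. I am assuming plain lists of numbers and a hand-written insertion sort instead of a built-in sort; the function name is made up for illustration, and this is not the exact problem from the test.

```python
def merge_and_sort(a, b):
    """Combine two lists, then sort the result with a simple insertion sort."""
    merged = list(a) + list(b)  # merge: copy so the inputs stay untouched
    for i in range(1, len(merged)):
        current = merged[i]
        j = i - 1
        # Shift larger values one slot to the right until current fits.
        while j >= 0 and merged[j] > current:
            merged[j + 1] = merged[j]
            j -= 1
        merged[j + 1] = current
    return merged


print(merge_and_sort([3, 1], [2, 0]))
```

Nothing here goes beyond looping through arrays with an i and a j, which is exactly why the question was a good one: it reuses the basics in a way the homework never asked for.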
So when you watch a project or a tutorial, I love the idea of trying to force yourself to make it your own.
Take one tool out, and switch it for a similar solution.
Find a different data set and try to create a different end result.
Add in a new set of business logic or integrate a different data set.
But don’t just follow a project example line by line and then end up like the image below.
What’s Next?
Overall, I think project videos are a great place to start your journey. They can help inspire you and give you ideas. However, I wanted to share a few thoughts and tips as you go on your journey and take on your own projects!
Review Projects But Make Them Your Own!
Check out projects like 🚖 Uber Data Analytics | End-To-End Data Engineering Project, which uses Mage, BigQuery, and Looker Data Studio
Some ideas for how to make it more your own:
Instead of using BigQuery, try Snowflake or Databricks
Focus heavily on one aspect of the project like digging into data warehousing or data quality
etc
Come up with your own project ideas
You can read my past article - How To Start Your Next Data Engineering Project
You can use some of these - APIs to build a project around
Then just pick a tool to ingest, store and serve the data and get started (always make a project simpler than you think it should be)
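To show how small "simpler than you think" can be, here is a rough sketch of an ingest, store and serve loop using only the Python standard library. Everything in it is an assumption for illustration: the sample payload stands in for a real API response, SQLite stands in for a warehouse, and the table and function names are made up.

```python
import json
import sqlite3

# Hypothetical sample payload standing in for a real API response.
SAMPLE_RESPONSE = (
    '[{"city": "Seattle", "temp_f": 54},'
    ' {"city": "Denver", "temp_f": 61},'
    ' {"city": "Seattle", "temp_f": 58}]'
)


def ingest(raw):
    """Parse the raw JSON text into a list of dicts."""
    return json.loads(raw)


def store(conn, records):
    """Write the records into a small SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS readings (city TEXT, temp_f REAL)")
    conn.executemany(
        "INSERT INTO readings (city, temp_f) VALUES (:city, :temp_f)", records
    )
    conn.commit()


def serve(conn):
    """Return the average temperature per city, ordered by city name."""
    rows = conn.execute(
        "SELECT city, AVG(temp_f) FROM readings GROUP BY city ORDER BY city"
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
store(conn, ingest(SAMPLE_RESPONSE))
print(serve(conn))
```

Once something this small works end to end, swapping the sample payload for a real API call or SQLite for a warehouse becomes a contained change, and each swap is where the troubleshooting (and the learning) tends to happen.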
The point is, you can’t just limit yourself to following videos. I am sure there are a few people out there who can learn just by watching videos, with their brains retaining every second of said video. But not mine.
Also, it lets you put a more unique project on your resume and who knows, maybe it’ll become more than just a side project!
Thanks for reading!
Join My Data Engineering And Data Science Discord
If you’re looking to talk more about data engineering, data science, breaking into your first job, and finding other like-minded data specialists, then you should join the Seattle Data Guy discord! We are close to passing 6000 members!
Join My Data Consultants Community
If you’re a data consultant or considering becoming one, then you should join the Technical Freelancer Community! I recently opened up a few sections to non-paying members so you can learn more about how to land clients, different types of projects you can run, and more!
Articles Worth Reading
There are 20,000 new articles posted on Medium daily and that’s just Medium! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
When to use GraphQL, gRPC, and REST?
Building APIs is one of the most important tasks for developers in modern engineering. These APIs allow different systems to communicate and exchange data. While REST has been the de facto standard for implementing APIs for many years, new emerging standards, such as gRPC and GraphQL, are available today.
This issue will discuss gRPC, GraphQL, and REST API architectures. We will determine each one's advantages and disadvantages and which tooling can be used.
Reverse Searching Netflix’s Federated Graph
Since our previous posts regarding Content Engineering’s role in enabling search functionality within Netflix’s federated graph (the first post, where we identify the issue and elaborate on the indexing architecture, and the second post, where we detail how we facilitate querying) there have been significant developments. We’ve opened up Studio Search beyond Content Engineering to the entirety of the Engineering organization at Netflix and renamed it Graph Search. There are over 100 applications integrated with Graph Search and nearly 50 indices we support. We continue to add functionality to the service. As promised in the previous post, we’ll share how we partnered with one of our Studio Engineering teams to build reverse search. Reverse search inverts the standard querying pattern: rather than finding documents that match a query, it finds queries that match a document.
End Of Day 121
Thanks for checking out our community. We put out 3-4 newsletters a week discussing data, tech, and start-ups.