This Time It's Different - ChatGPT
From Theano To GPT4
When I first broke into the data world, tools like TensorFlow, Caffe, and Theano were growing in popularity.
In addition, I had taken several courses in neuroscience, bioinformatics, and computational neuroscience, culminating in an immediate interest in the subject of neural networks and machine learning.
Now a lot…and I mean a lot…has changed since I first heard of these topics.
Each passing year brings further advancements and changes in what we think and know is possible using AI and ML.
In this article, I wanted to review one of my older articles where I discussed some of the foundational research that built up our understanding around some of these concepts in computer vision as well as discuss ChatGPT.
It’s hard not to jump on the hype train…
From Neuroscience To Computer Vision
Like many students in college, I spent a good amount of time taking electives that had nothing to do with my major. In particular, I decided to take a few random neuroscience courses where we read through papers and discussed their impact on our understanding of how the brain translates and interprets signals.
Eventually, when I did start writing on data, one of the first articles I put out reviewed these papers and how they built upon each other and led to computer vision as we know it now (now being 2016).
The first paper I went through was from Hubel and Wiesel (which is where the image above comes from), who discovered that specific neurons in the visual cortex respond to lines of light at specific angles.
To steal from a much younger Seattle Data Guy, here is a quick summary of what this paper discussed.
They brought us the original snap, crackle, and pop — and no, not the cereal. By connecting an electrode to a neuron, they were able to listen to the neuron respond to the stimulus of a bar of light.
They opened up a new understanding of how the V1 cortical neurons operated, and it was mind-blowing. Their research helped lay out the mapping and function of neurons in the V1.
These cell reactions, when combined, were theorized to somehow be able to create a bottom-up image of the natural world. That is to say, by taking the response of many neurons responding to various bars of light, the human brain begins to draw a picture of the world around it.
Fast forward a few decades and you’d find an early implementation of a neural network. This was put together by Olshausen and Field.
In their research, instead of just focusing on single bars of light, their team took pictures and started to look at how algorithms could recognize and code features inside of said images.
One of their papers is called “Natural Image Statistics and Efficient Coding.”
Another pull from a younger me.
The purpose of this paper was to discuss the failures of Hebbian learning models in image recognition, specifically, the Hebbian learning algorithms that utilized principal component analysis. The issue is that the models could not at the time learn the localized, oriented, bandpass structures that make up natural images.
This is theoretically a few layers up from what Hubel and Wiesel started to demonstrate in their research with real neurons. Except now, they were modeling the output of 192 units (or nodes) in a more modern neural network. Their research showed how developing models that focused more on sparseness when it came to coding the regularities that exist in natural images was far more effective.
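To make the sparseness idea a little more concrete, here is a minimal sketch of sparse coding in plain Python. It is not the authors' algorithm: I am assuming a fixed, hand-made dictionary of three basis vectors (the paper learned its bases from image patches), swapping in an L1 penalty, and solving it with iterative soft-thresholding (ISTA). The names `sparse_code`, `soft_threshold`, `lam`, and `step` are all mine.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; anything within [-t, t] becomes exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sparse_code(signal, basis, lam=0.1, step=0.2, iters=500):
    """Approximately minimize 0.5*||signal - sum_k a[k]*basis[k]||^2 + lam*sum_k |a[k]|.

    Assumption: `step` is small enough for this toy dictionary
    (it must stay below 1 / largest eigenvalue of the basis Gram matrix).
    """
    n_basis = len(basis)
    n_pixels = len(signal)
    a = [0.0] * n_basis
    for _ in range(iters):
        # Reconstruct the signal from the current coefficients.
        recon = [sum(a[k] * basis[k][i] for k in range(n_basis)) for i in range(n_pixels)]
        resid = [s - r for s, r in zip(signal, recon)]
        # Gradient step on the reconstruction error, then shrink toward zero.
        new_a = []
        for k in range(n_basis):
            grad_k = -sum(resid[i] * basis[k][i] for i in range(n_pixels))
            new_a.append(soft_threshold(a[k] - step * grad_k, step * lam))
        a = new_a
    return a


# Toy dictionary (hypothetical, not from the paper): two single-pixel bases and one flat patch.
basis = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
signal = [1.0, 1.0, 1.0, 1.0]  # exactly 2x the flat patch
coeffs = sparse_code(signal, basis)
print([round(c, 3) for c in coeffs])
```

In this toy case the code settles on using only the flat-patch basis (with a coefficient a bit under 2 because of the penalty) and leaves the other two at roughly zero. That is the "few active units per image" behavior the sparseness argument is about.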
At this point, we had gone from showing how V1 cortical neurons could detect lines of light at specific angles to electronic neurons being able to detect patterns.
Of course, my favorite part of this paper was the final line that stated “An important and exciting future challenge will be to extrapolate these principles into higher cortical visual areas to provide predictions.”
They knew they had built a foundation that the rest of computer vision could rely on, but they also realized they were just getting started (I am sure there are dozens of similar papers).
Fast Forward To Today
Today…Theano is rarely discussed and it feels as if there are far fewer “build a neural network over your lunch break” posts on Medium.
Instead, the discussion is around LLMs and ChatGPT.
It’ll write your essays.
It’ll write your code.
It’ll make you obsolete.
It’ll also make far more spam on the internet than ever before. For example, the TikTok below is meant to provide a way to write more meaningful messages to people.
But I don’t see the example above as a benefit of LLMs.
The use cases that have made more sense to me in the LLM world are about finding the signal in the noise.
That is to say, if what we use LLMs for is to automate sending out pestering emails and DMs to individuals who already have hundreds if not thousands of emails coming at them, well that doesn’t seem helpful.
All that will lead to is another model on the other side that filters out said messages to only allow those of interest to make it to the person on the other side.
An arms race of LLMs if you will.
Overall there are plenty of benefits ChatGPT and other models can provide, but I’d like to talk about some limitations I have either experienced or foresee.
Who Will Feed The Models - Human Incentive
In our current model of content creation, many people are paid to write content through Google AdSense as well as other forms of sponsorship. Thus, there is an incentive to do work.
If we remove payment for this work, no one will write content, and thus there will be no new information for ChatGPT to take in. There is already a limitation in that ChatGPT only uses data up to a certain date. What if no one creates new content?
Where will new data come from?
News companies already have a hard enough time making a profit. Why would they put up content if products like ChatGPT will just swallow it up and give no credit (maybe that will change with the new GPT plugins)?
After all, we are human and there needs to be a human incentive to do work.
Missing That Empathetic Touch
Another issue I noticed came out recently when I was working on a project with another person who decided to use an LLM to help write a section of a report.
Reading through said section, I felt like something was off. It came out so dry and was missing that connection piece that you’d expect from a deep dive or analysis.
It made me think about when I worked with someone on a piece a while back. He kept asking me to add more personal references and examples and not just bluntly say “this is what a data engineer is”.
We are human after all. We connect via stories and examples we can relate to. Anyone can give advice, but I believe there is a need for us, as humans, to have some emotional connection for said advice to truly resonate and drive action.
I am not saying this can’t be added to ChatGPT. I imagine in the future, if individuals took great notes and then fed them into a model, then they’d likely be able to put together something more personal.
In the end, I do believe LLMs being used in more curated methods could provide a lot of benefits.
Distilling Value From Company Data
One of the areas where I do see LLMs making a lot of sense is inside of companies.
Companies have so much information.
They have data and information on best practices, code bases, analytics processes, research done in the past, etc.
This provides an opportunity to create tools and solutions that can help distill and elicit value as well as improve efficiency.
For an example we data engineers and analysts can relate to, imagine if your company allowed a model such as the one being used by GitHub Copilot to process all past queries that relate to the current data model.
This might allow analysts and engineers to write far less code. If you’re writing a query that already exists, it could either complete it or recommend you use a specific view that is similar to what you need. To an extent, it could be a massive extension of a data catalog, providing context inline.
In addition, it might reduce ad-hoc queries by letting directors and other individuals ask for a CSV based on a common query to be sent to them exactly when they need it.
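Here is a rough sketch of the "recommend a view that already exists" idea. To be clear, this uses plain string similarity from Python's `difflib` as a stand-in for whatever a real model would do, and the query catalog, the view names, and the `suggest_view` helper are all made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical catalog: past queries mapped to the views that already cover them.
PAST_QUERIES = {
    "select customer_id, sum(amount) from orders group by customer_id": "vw_customer_revenue",
    "select order_date, count(*) from orders group by order_date": "vw_daily_order_counts",
    "select product_id, avg(price) from products group by product_id": "vw_product_avg_price",
}


def normalize(sql):
    """Lowercase and collapse whitespace so formatting differences don't matter."""
    return " ".join(sql.lower().split())


def suggest_view(draft_sql, catalog=PAST_QUERIES, threshold=0.6):
    """Return (view_name, score) for the closest past query, or None if nothing is close."""
    draft = normalize(draft_sql)
    best_view, best_score = None, 0.0
    for past_sql, view in catalog.items():
        score = SequenceMatcher(None, draft, normalize(past_sql)).ratio()
        if score > best_score:
            best_view, best_score = view, score
    if best_score >= threshold:
        return best_view, best_score
    return None


draft = "SELECT customer_id, SUM(amount)\nFROM orders GROUP BY customer_id"
print(suggest_view(draft))
print(suggest_view("drop table users"))
```

A real version would presumably compare meaning rather than characters (embeddings, or the model itself), but the shape is the same: look at what has already been written before anyone writes it again.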
Of course, this is just one example and many more have been shown off by Microsoft over the past few weeks.
But let’s take it from B2B to B2C.
One thing people have been discussing a lot recently is that GPT-4 has passed several standardized exams, including the LSAT, GRE, and SAT.
At least in the short term, I imagine that provides an interesting opportunity to help students outsmart these exams.
Perhaps someone can create a solution where you provide your past test results and answers, and the model coaches you on how to outsmart these exams. And when I say outsmart…I mean that pretty literally.
I imagine some of these exams are pattern heavy (sorry, I got away with only having to take the GRE, so I don't have a lot of personal experience).
The point is these exams have a standard set of questions, so perhaps with the right data a model could be fine-tuned to understand the limitations of the students it is tutoring and improve their output.
Likely even better than a tutor could since it could have access to so much more information.
But then again, maybe it’ll just miss that human touch.
Truthfully, I need to dig into what tools like ChatGPT can do more thoroughly (which I plan to do) to fully understand their impact.
One point of interest is that it looks like ChatGPT can pass the SAT Evidence-Based Reading and Writing section, but it flunked the AP English Language and Composition exam.
So I’d love to see if there is a reason why and genuinely ask, is it different this time?
If you haven’t subscribed, then consider doing so, as I will be doing some YouTube lives in the next few weeks on this subject. I plan to talk to people such as Ajay Halthor, who has been digging into the workings of neural networks and LLMs far more than me, so don’t miss out!
Data Events You’re Not Going To Want To Miss!
In-person labs - Build Intelligent Data Pipelines with Ascend.io in Columbus, NYC, Seattle, & London
Virtual - How Does ChatGPT Work With Ajay Halthor
Virtual - Data Leaders’ Top Security Trends for 2023
Articles Worth Reading
There are 20,000 new articles posted on Medium daily and that’s just Medium! I have spent a lot of time sifting through some of these articles as well as TechCrunch and company tech blogs and wanted to share some of my favorites!
Building a Media Understanding Platform for ML Innovations
Netflix leverages machine learning to create the best media for our members. Earlier we shared the details of one of these algorithms, introduced how our platform team is evolving the media-specific machine learning ecosystem, and discussed how data from these algorithms gets stored in our annotation service.
Much of the ML literature focuses on model training, evaluation, and scoring. In this post, we will explore an understudied aspect of the ML lifecycle: integration of model outputs into applications.
AI versus Common Sense
I’ve heard ‘move fast and break things’ for over a decade. This phrase is a pseudo-rallying call in the tech community. Creating digital processes, systems, products and platforms quickly and figuring out that they don’t work has become a badge of honor. The prevailing standard operating theory is that if we know what doesn’t work, then we’re closer to determining what does work and all the breaking we’ve done is somehow worth it.
End Of Day 76
Thanks for checking out our community. We put out 3-4 Newsletters a week discussing data, tech, and start-ups.