Neuroevolution is about breaking down all neural network algorithms to their most basic constituent matrix multiplication operations, as well as the architecture and relationships defining the flow of loss derivatives.
Then you define all of those as codons, such that CNNs, ResNet, ViT, self-attention blocks, LSTMs, Swin, transformers, etc. are all just specific gene sequences.
You build all the neural networks back up, defining each one as a sequence of codons representing the matrix math, hyperparameters, neural architecture, etc.
Then, you can “mate” and “mutate” the neural networks with each other, letting new hybrid networks and mutant networks be defined.
You set up competitions on datasets, and let the neural networks compete on cross-validation accuracy.
Survival of the fittest death matches.
The neural networks that don’t do well die out and can’t pass on their genes to the next generation of hybrids and mutants.
You repeat this over many generations until new neural network architectures emerge which blow the competition out of the water, putting even the fabled transformers to shame.
The key part which requires human intelligence is how to break down the neural networks into a sufficiently large pool of codons such that truly novel things can emerge. The human must deeply understand every addition, every matrix multiplication, every new idea that’s come out in neural network architecture and make sure the codons could theoretically be used to define that neural network.
If the search space is large enough, then it’s like real life evolution, wherein the initial seed of the universe contained within it every possible combination of DNA, including future species.
But I digress. Basically genetic algorithms in neural architecture search are about letting semi-random architectures compete and evolve over many generations of hybrids and mutants, interbreeding and competing, until novel highly accurate neural networks are discovered.
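Here's a rough sketch of that mate / mutate / compete loop, just to make it concrete. Everything in it is an assumption for illustration: the codon names, the genome length, and especially `toy_fitness`, which is a cheap stand-in for actually training each network and scoring it on cross-validation accuracy.

```python
import random

# Hypothetical codon pool. A real one would encode individual matrix ops,
# hyperparameters, and wiring, not just coarse block names.
CODONS = ["conv", "attention", "lstm", "residual", "mlp", "norm"]
GENOME_LENGTH = 8


def random_genome(rng):
    """A semi-random architecture: a fixed-length sequence of codons."""
    return [rng.choice(CODONS) for _ in range(GENOME_LENGTH)]


def crossover(parent_a, parent_b, rng):
    """Mate two genomes: prefix from one parent, suffix from the other."""
    cut = rng.randrange(1, GENOME_LENGTH)
    return parent_a[:cut] + parent_b[cut:]


def mutate(genome, rng, rate=0.1):
    """Swap each codon for a random one with probability `rate`."""
    return [rng.choice(CODONS) if rng.random() < rate else codon
            for codon in genome]


def toy_fitness(genome):
    """Stand-in for cross-validation accuracy (assumption): fraction of
    positions matching an arbitrary made-up target sequence."""
    target = ["conv", "norm", "residual", "attention",
              "norm", "mlp", "residual", "mlp"]
    return sum(a == b for a, b in zip(genome, target)) / GENOME_LENGTH


def evolve(fitness, generations=30, population_size=40, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(population_size)]
    best_per_generation = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        survivors = ranked[: population_size // 2]  # the bottom half dies out
        children = []
        while len(survivors) + len(children) < population_size:
            mom, dad = rng.sample(survivors, 2)
            children.append(mutate(crossover(mom, dad, rng), rng))
        population = survivors + children
    return max(population, key=fitness), best_per_generation
```

Calling `evolve(toy_fitness)` returns the fittest genome plus the best score seen at the start of each generation. Survivors are carried over unchanged, so that best score never goes down; in a real search, the expensive part is that every fitness call means training a network.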
Society is quite far from a meritocracy, but the problem is we seem to be moving away from it not towards it.
The momentum from past generations means nepotism and “who you know” become, to some degree, more important than your competence.
If you’re at the top of your field then you’ll ride to the top, but for 99% of people, competence and talent and skill are not valued. Knowing someone who can give you a warm intro, being at events around those whose associations would boost your status in society, and a thousand other subconscious factors mean that life is not as meritocratic as it could be.
Yet ideologies like DEI, affirmative action, and gender quotas, along with talk of privilege or glass ceilings, correcting for systemic bias, or reducing standardized testing, actually move us further away from a meritocracy.
We should be valuing the competent, whether through talent or hard work, and fighting against any idea of forced quotas or giving people a leg up.
Society should be focusing on equality of opportunity, but instead stupidly seems to be focusing on equality of outcome.
I feel like a lot of big AI companies’ secret sauce is how they handle the infrastructure of distributed models and distributed data such that you can horizontally scale to any number of GPUs. Neural networks are open source and data is somewhat of a competitive advantage, but the real edge is just having the raw engineering skills to actually scale to neural networks with over a billion parameters and datasets with a billion samples. At that scale you start dealing with annoying issues like obscure CUDA or hardware bugs, segfaults because of literally excess heat, hardware dying during a training run, and a bunch of other considerations that never come up with smaller AI. But that has nothing to do with your ability to come up with clever neural architectures and actually push the boundaries on which network connections, transformations, and QKV attention maps learn best; it’s purely raw engineering skill: can you make this machine function or not? That’s a huge factor in training larger AI models.
This is how 1982 envisioned 2019 looking (in LA). Somewhere along the way, we had a weakness in the species that prevented us from hitting our sci-fi potential.
I’m guessing it’s the limbic system response to politics. We were promised hoverboards and glorious monstrosities… But also it’s fiction and real engineering is hard. Sci-fi is always ahead of reality because anything can happen in our imaginations.
The interplay between sci-fi, culture, and engineering will likely be studied in the future when we try to optimize the rate of innovation and realize that most engineers take their inspiration from sci-fi, either directly or via cultural osmosis.
Sophisticated LLMs could theoretically be simulated by a sufficient number of gears and cogs.
“Tell me you don’t understand computers without telling me you don’t understand computers” but I’m right.
Consciousness in such a mechanical machine could maybe emerge from a network effect, if it turns out that consciousness emerges from the network effect of a sufficient number of neurons transferring information on patterns.
However, consciousness may not be emergent but rather innate to organic chemistry, and not innate to cogs and rocks with electricity pumped through them in a certain way.
I personally know that my body and brain are divine and more than mechanical gears, but perhaps a sufficient number of networked gears could say the same thing?
Practice using LLMs like GPT-4 as the initial attempt to write anything, from code to emails to reports to copywriting.
The idea is not to presume it will be better or faster than you doing it yourself; it’s to practice.
You need many sessions for your neurons to snap into place with how to use them, to get a fingertip feel regarding their capabilities and limitations, and, most importantly, improve your prompting skills.
Different situations require different ways to prompt the LLM. To make an email professional yet unique, to write modular code, to be persuasive in copy, all require you to have a different type of conversation with the LLM.
And if you’ve studied flow state (the original book Flow from the early 90s is great, as is the book Peak), then you know you want to be slightly outside your comfort zone until you master a skill, and then systematically expand your comfort zone. There is a scientifically “better” way to acquire new skills.
Systematically improving at the craft of LLM prompting, by forcing yourself to use it for everything, will put you leaps and bounds ahead of the competition.
The world of tomorrow will very clearly be highly LLM-driven, for its tentacles will seep into the deepest crevices of society.
If you intentionally practice now and make yourself a wizard at LLM prompting, employers will demand you, founders will need you, and you’ll get more sex. Okay maybe not that last one. But maybe 🤔
Either way, get good.
🤖
Relax anon, we’re gonna be fine. Seriously.
It’s gonna be a bizarre and disturbing new world for sure in some dimensions.
People having emotional relationships with chatbots, easy access by bad actors to knowledge on how to build weaponry, and the chance for AI that passes all the IQ tests we can throw at it not behaving as we expect (will it treat us like the way we treat cattle because we’re an order of magnitude smarter than a bovine!?!)
It’s sci-fi, and already beaten to death in a thousand sci-fi books and movies and shows and comics and anime.
But sure, it could happen. Maybe one in a quadrillion chance, and I literally just made that number up. 10^-1 or 10^-100 chance? We don’t know, it’s all just a guess. (Don’t even get me started on people stupidly making up numbers out of thin air like “20% chance of extinction” cuz vibes).
We’re already imagining the worst, as our anxiety-prone survival instinct deals with the unknown, and maps out various scenarios of doom via our overactive imaginations.
And oh boy are power-hungry regulators and control-loving bureaucrats frothing at the mouth, salivating at all this fear and anxiety swirling around the AI discussion - plenty of them just want to have a seat at the table for their own careers, they don’t actually care deep down, it’s just posturing and trying to think which “experts” will be best to work with politically.
And sure, cataclysms have happened before to our species. Icarus knows what’s up. Not dismissing bad shit happening.
But then you look up, it’s 2050, and a child is learning advanced physics because an AI systematically optimized the way the information was taught, presented, and evaluated in real time.
You see not just advanced weaponry but also advanced medicine: people living longer and better as cancer is diagnosed earlier and treatments are personalized based on your individual DNA, health records, blood work and imaging; essentially, custom designer drugs built especially for you.
Sure you have AI controlled drone swarms bombing poor villages for oil, but you have glorious beautiful green buildings being built and then maintained by those same AI controlled drone swarms. It depends on the operator.
It’s possible you might have an all-powerful Zordon that decides, against your personal wishes, that you should not reproduce; but you also might have an entirely automated space ringworld farm that provides crops, meat, dairy, fruit, and vegetables to a growing population, optimally distributed to feed those who need it the most, first.
But that’s all sci-fi too.
Sure you can imagine an AI that makes humans go extinct because we recklessly built something that went off the rails and, accidentally or intentionally, destroyed Earth. But you can also imagine an AI that gives you an extra few years or even decades with an aging parent. It’s imagination.
What’s the takeaway from my perspective?
The issues it causes will likely be black swans, maybe new mental diseases of AI networked hive minds, idk. I can imagine lots of things, both good and bad, but let’s look at what it’s doing today, what’s really going to happen in actual banal physical reality, and let’s use it as a decentralized open source tool. Let’s optimize health and fitness and glorious smart cities.
Focus your passions on building positively with it, and avoid getting trapped in its digital snares and losing your soul.
It’s not gonna extinct us, relax - humans are wildly anti-fragile.
The future of humans with our tech tools has always been symbiotic, and with symbiosis each party needs the other party.
Harmony, folks.
Back to building medical AI for me personally.
😎🫡🤖🦾
Did you know computers in the 1900s evolved from census tabulating machines in the 1800s, which evolved from the punch cards used to encode beautiful weaving patterns on looms in the 1700s?
I’m a father as of this week!
I have a wonderful son, and will do all in my power to build a better world for him tomorrow.
The next generation needs us.
Uproot the rot of corruption, and flood the future with beauty and love and magnificence.
Doomerism is prevalent because humans have a negativity bias, meant to ensure our survival.
The fear and anxiety we feel physiologically during a stressful Zoom meeting evolved on the African savanna to keep us wary of lions, where the danger was real and acute and we needed to get our bodies into fight-or-flight mode.
Now it’s disproportionate to the actual risk of danger.
Our lower brain functions are hardwired to be a bit more scared than hopeful, a bit more pessimistic than optimistic.
This keeps us alive and is evolutionarily advantageous.
But really, the point is that the anxiety we feel is often disproportionate to the risk at hand.
At a mass scale, this means we will find it easier and, to our primitive brainstem and limbic system, more rational, to see all the ways things can go wrong instead of how they’ll go right.
In actuality, reality is often much more banal than either extreme.
Doomers gonna doom.
It’s rational to avoid risk, but doomerism is often an emotional reaction masquerading as a rational one; how many people are actually in touch with their limbic systems?
Decentralized machine intelligence is one of the most important issues of our time.
Throughout history, those in power have always feared the plebeians getting access to knowledge.
From the Middle Ages Catholic papacy fearing what would happen when the plebs got access to knowledge via the printing press…
To African American slaves being denied books and forced to stay illiterate…
To communist regimes preventing access to ideas that go against official doctrine…
The idea of getting extremely advanced and knowledgeable AI in the hands of the commoners is something that irks a certain type of person in power who fears their power being eroded.
This usually comes wrapped in the lazy thinking of “bUT wHaT if CRimiNAls gET iT!?”
It doesn’t matter, keeping AI centralized “for the greater good” is an excuse for tyranny and control.
Let it fly.
Make AI decentralized.
Give the most advanced algorithms to everyone.
Let’s stumble and then learn to crawl and then learn to walk and then learn to run and then learn to fly with the most advanced AI in the hands of everyone.
It will be a net positive, mark my words.
I fear centralized tyranny way more than the dangers of decentralized AI.
Give us plebs the tools to elevate ourselves, and see people who want to centralize the power of machine intelligence as the enemy, whether misguided or evil.
Decentralized AI ftw.
The argument against closed source AI architectures at private corps is that the researchers who build them are just people who will eventually switch to other jobs, and an NDA can’t really stop the insights and methodologies from spreading between companies, making some IP moot.
One risk of using LLMs is lazy thinking.
I almost typed in ChatGPT4, “how would I have a webapp call an api to start an ecr docker job asynchronously and notify the user when the job is finished?”
Then I realized I already know the answer to that in my head, I already know the code to do that, the infrastructure, and asking GPT to just give me the answer can promote lazy thinking and atrophy skills, even if it’s faster and more productive.
However, maybe “coding in English” is the way of the future and we should let some skills go rusty and embrace the change. And if it means a more productive work session to just ask some LLM to write the code for me while I just check its work, then who am I to argue with results?
To that point, I’d say it’s like practicing mental math in your head (e.g. multiplying two two-digit numbers) instead of relying on a calculator; little things like that may be what it takes to stay the smartest we can be despite the ease of use of the tools around us.
Every tech advancement has only displaced, not replaced, jobs en masse.
Did digital computers replace the jobs of the “computer” women of the early 1900s who did rote calculations by hand? Yes.
Did advanced phone tech replace the job of phone switch operators? Yes.
Did elevator tech replace the lift operators who used to operate elevators all day? Yes.
Did assembly line workers get replaced by automated robots? Yes.
Did bank tellers get replaced by online banking? Yes.
Did online shopping replace many retail cashier workers? Yes.
No industry is safe from tech advancements. However, tech has not been responsible for mass unemployment. Those individual workers may be screwed, but the labor force has moved to new jobs that get created from the tech advancements.
Displacement, not replacement, for the collective; replacement, sadly, for the individual.
Now I know the argument is that this time it’s different, that a sufficiently advanced AI can learn how to do new jobs faster than new jobs can be created. To that I just say maybe, but let’s just agree to disagree since it’s a guess anyway.
When I started doing machine learning work around 2007-ish, I learned it was often about handcrafted features, followed by dimensionality reduction, followed by a classifier.
Then deep learning changed the game in 2012.
Allow me to explain in the context of radiology 3D computer vision:
To have the computer predict if there was cancer in a section of an MRI scan, you’d extract a ~4 cm^3 patch, which in numpy terms might be a 64x64x7 array.
Then, you’d come up with some features from that patch based on edges and shapes and textures and Gabor wavelets (resulting in something like a 512x1 numpy array).
From here, you’d use a dimensionality reduction technique to reduce it to a ~5x1 numpy array. PCA is the easiest, but nonlinear ones like locally linear embedding often work better.
Then, you’d repeat this for hundreds of patients’ cubic patches, with the label being cancer or not, or perhaps the stage or severity of cancer instead of a binary label. So for 300 patches, X would be an array of shape 300x5, and Y would be 300x1.
You’d finally train a classifier like a support vector machine or random forest to predict which patches had cancer.
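For the curious, here's a minimal numpy-only sketch of that old-school pipeline: features, then dimensionality reduction, then a classifier. It's built on assumptions, not on anything from a real project: the patches are synthetic noise with a bright blob standing in for "cancer", the features are a handful of intensity and gradient statistics instead of a 512-long Gabor/texture vector, PCA is done via SVD, and a nearest-centroid classifier is swapped in for the SVM or random forest to keep dependencies down.

```python
import numpy as np


def make_synthetic_patches(n, rng):
    """Fake data, not real MRI: label-1 patches get a brighter blob in the middle."""
    labels = rng.integers(0, 2, size=n)
    patches = rng.normal(0.0, 1.0, size=(n, 64, 64, 7))
    patches[labels == 1, 24:40, 24:40, 2:5] += 2.0
    return patches, labels


def handcrafted_features(patch):
    """Toy stand-in for edge/texture/Gabor features: intensity and gradient stats."""
    grad_mag = np.sqrt(sum(g ** 2 for g in np.gradient(patch)))
    hist, _ = np.histogram(patch, bins=10, range=(-4.0, 6.0))
    stats = np.array([patch.mean(), patch.std(), patch.max(),
                      grad_mag.mean(), grad_mag.std()])
    return np.concatenate([stats, hist / patch.size])  # 15 numbers per patch


def pca_fit(X, n_components):
    """PCA via SVD of the centered data; returns the mean and top components."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    return (X - mean) @ components.T


def nearest_centroid_fit(Z, y):
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def nearest_centroid_predict(Z, classes, centroids):
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


def run_classic_pipeline(n_patches=300, seed=0):
    rng = np.random.default_rng(seed)
    patches, y = make_synthetic_patches(n_patches, rng)
    n_train = (2 * n_patches) // 3

    # 1) Handcrafted features, standardized with training-set statistics.
    X_raw = np.stack([handcrafted_features(p) for p in patches])
    mu = X_raw[:n_train].mean(axis=0)
    sigma = X_raw[:n_train].std(axis=0)
    sigma[sigma == 0] = 1.0
    X_std = (X_raw - mu) / sigma

    # 2) Dimensionality reduction down to 5 numbers per patch.
    pca_mean, components = pca_fit(X_std[:n_train], n_components=5)
    X = pca_transform(X_std, pca_mean, components)

    # 3) A separate classifier trained on the reduced features.
    classes, centroids = nearest_centroid_fit(X[:n_train], y[:n_train])
    predictions = nearest_centroid_predict(X[n_train:], classes, centroids)
    accuracy = float((predictions == y[n_train:]).mean())
    return X_raw.shape, X.shape, accuracy
```

`run_classic_pipeline()` returns the raw feature shape, the reduced shape (300x5 with the defaults), and held-out accuracy on the fake data. Each of the three stages is designed and fit on its own, which is what changes in the next part.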
Then, in 2012, when AlexNet won the ImageNet challenge by an enormous margin (by using GPUs to train a neural network) and deep learning took off, everything changed.
Instead of extracting texture features, you had learnable convolution filters automatically hooked into a dimensionality reduction pipeline of layers attached to a classifier head all in one neural network package.
Instead of having each of those machine learning processes separate, all were parameterized and learned simultaneously via backpropagation.
So when those patches come in, a neural network like Swin or a 3D ResNet extracts features and collapses each patch down to a 5x1 embedding vector in latent space, giving that same 300x5 array across 300 patches. It learns which parameters, for the various transformations the data goes through, result in a 300x5 array that best separates cancer from not.
It learns this by tweaking the parameters of the inbuilt feature extractor and dimensionality reduction operations (well, it’s sort of moved beyond that with modern architectures but regardless).
It basically looks at a bunch of 3D ~4 cm^3 patches and tweaks the parameters of the internal operations of that architecture such that a 64x64x7 array fed in compresses down to a 5x1 vector (“embedding”) in a way that best separates cancer from not (hot dog?).
A lot of engineers in industry are still using the older techniques, and I think there’s a lot of room for them to compete on par with deep learning, especially with algorithms like XGBoost and potentially even neural network feature extractors.
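To make the "everything learned at once" idea concrete, here's a tiny numpy sketch, under heavy assumptions. The encoder is a single linear layer plus tanh (no convolutions, nothing like Swin or a 3D ResNet), backprop is written by hand, and the function names are made up for illustration. What it does show is a 64x64x7 patch being squeezed to a 5-number embedding, with the encoder and the classifier head both updated from the same loss.

```python
import numpy as np


def flatten(patches):
    """Reshape (n, 64, 64, 7) patches to (n, d) rows, scaled to roughly unit norm."""
    return patches.reshape(len(patches), -1) / np.sqrt(patches[0].size)


def embed(patches, W_enc):
    """The learned 'feature extraction + reduction': patch -> 5-number embedding."""
    return np.tanh(flatten(patches) @ W_enc)


def train_end_to_end(patches, labels, embed_dim=5, steps=300, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    X = flatten(patches)
    y = labels.astype(float)
    n, d = X.shape
    W_enc = rng.normal(0.0, 1.0, size=(d, embed_dim))                   # encoder
    w_head = rng.normal(0.0, 1.0 / np.sqrt(embed_dim), size=embed_dim)  # classifier head
    b_head = 0.0
    losses = []
    for _ in range(steps):
        # Forward pass: patch -> embedding -> probability of the positive label.
        Z = np.tanh(X @ W_enc)
        p = 1.0 / (1.0 + np.exp(-(Z @ w_head + b_head)))
        p_safe = np.clip(p, 1e-7, 1.0 - 1e-7)
        losses.append(float(-np.mean(y * np.log(p_safe)
                                     + (1.0 - y) * np.log(1.0 - p_safe))))

        # Backward pass: one loss, gradients flow through head AND encoder.
        d_logit = (p - y) / n
        grad_w_head = Z.T @ d_logit
        grad_b_head = d_logit.sum()
        d_pre = np.outer(d_logit, w_head) * (1.0 - Z ** 2)  # through tanh
        grad_W_enc = X.T @ d_pre

        # Update every parameter together.
        W_enc -= lr * grad_W_enc
        w_head -= lr * grad_w_head
        b_head -= lr * grad_b_head
    return W_enc, w_head, b_head, losses
```

Feed it an array of patches with shape (n, 64, 64, 7) and 0/1 labels; `embed(patches, W_enc)` then gives the n x 5 embedding array. The learning rate and step count are guesses that behave on small synthetic data, not tuned values.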
In medical AI, there’s a lot of knowledge in doctors’ heads about what is cancer and what isn’t, and how the treatment is affecting the cancer, that can potentially be quantified and used in conjunction with the fully automatic feature extraction component of neural networks.
Regardless, when I think about how machine learning has progressed, I also recognize that a lot of people in the tech industry still use the older techniques in practice. Not every new algorithm being developed to solve a problem is neural network based, and perhaps it shouldn’t be if that’s not the best tool for the job.
Anyway, hope that was informative.
With any incremental LLM improvement, like Claude 3, I’m reminded that people really, really want to believe machines are self-aware and conscious, and thus project that belief onto mechanical systems like LLMs.
It’s fun to watch as these models become more clever and create more accurate internal relationships between language data and the world being described.
Good to remember it’s still a bunch of matrix multiplications… (but perhaps that’s all organic intelligence is too?). LLMs could be entirely simulated with enough mechanical gears and gates.
My personal opinion is that consciousness requires the organic chemistry of biological neurons. The internal world models that emerge in LLMs in order to better predict the next token are something entirely different from sentience, and we have a category issue when we try to fit the way machines process information to the way we do.
I’m a very visual thinker.
I tend to think in terms of diagrams that I then use words to explicate.
I’ll imagine various future scenarios with some level of fidelity, and then perhaps visualize a timeline from today to then, and try to fill in the gaps in that timeline in my imagination.
I’ll envision something abstract like “society” as a net over the globe with various nodes and flows of people and cultures and institutions.
I’ll picture rows of data, or geometric relationships between concepts in my head.
I’ll see interconnected software modules as Lego blocks to mix and match in my head.
Language is part of this, but Broca’s area is only a small part of the human brain. And yet so is the visual cortex.
The human mind is layered and complex, and the study of how we “think” is a field in its infancy, even though philosophers have pondered it for millennia.
There is an old hacker ethos that information should not be owned by any one person or organization, that information and intelligence and knowledge belong to the world. That capitalistic pigs are greedy and communistic pigs are power hungry, and hackers are technologists beyond both.
This speaks a lot to open source, open weights, open data.
Modern AI systems have finally reached critical mass where we are seeing branches in ideology regarding openness and licensing.
On one extreme, you have entirely closed-source AI companies like OpenAI and Google and Anthropic. Their models dominate leaderboards, and it’s interesting that open source still has not fully caught up.
Then you have Mistral, which keeps its flagship models closed source (presumably to make money) and open sources its older models. And let’s be honest - would they even have open sourced Mixtral if it hadn’t leaked?
Then you have Meta, which after the success of open-sourcing PyTorch and building a community of feedback and free labor (despite giving up IP secrets), decided to open source their Llama models… well, after the weights were leaked too. But they have the internal resources to train good weights on lots of data, and so the open source community is like a babe at the teat, waiting for Llama 3 weights to drop. They also sometimes open their datasets.
Then you have Stability, which opens everything except their training data.
And then you have attitudes like below, in which there is a real deep belief that open-source AI will spark a Cambrian explosion of progress, an enthusiastic embrace of open source code, open weights, open data, open everything for the benefit of us all, good and bad alike.
In the modern world of AI, with everyone on the internet from profiteers to zealous hackers to giant conglomerates to the actual open source coders, the decisions that each person or company or organization makes about AI source code licensing are quite revealing as to their beliefs and help shape the AI ecosystem. There’s a place for all.
We should consider what we believe about the various ways of licensing closed and open source code, weights, and data, and what each party’s incentives are in their AI licensing decisions.
If you think about what simulating reality would consist of, think of the fact that by the time signals from your nervous system hit your brain, the brain trusts the signals coming in.
Can sensations like falling or pressure or heat or hunger be replicated and simulated so that by the time the nerve signals reach your brain, the brain just blindly accepts whatever signal was fed in, not realizing it can be manipulated?
In a future galactic society, assuming Einstein’s theories and quantum theory hold, the speed of light is going to dictate the information flow. Information will be flowing to and from Earth, to and from relay stations years away, to and from solar systems centuries and millennia away.
The goings-on, and the social theories, from Earth will take many years to propagate across the galaxy. And the information from the galaxy will take many years to propagate to Earth.
If you think about it like an octopus, the central head has the most neurons, but the tentacles all have their own set of neurons to deal with things locally.
Yet that’s a terra-centric view, assuming Earth is a centralized hub.
In actuality, it will be distributed and each node full of life will be as important as Earth.
So instead of an octopus model, we must consider the whole galaxy as a web where information is flowing across it at close to the speed of light.
This raises interesting questions regarding synchronization of code via git, or money via cryptocurrency. When merge conflicts occur, what happens, and how long does that merge resolution take to propagate to the rest of the nodes?
In addition, information will likely be diluted and corrupted as it travels over the years. A relay station four light years away will be actively receiving info from Earth lagged by four years, and information from the other direction also lagged by four years.
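To put rough numbers on the merge question, here is a small sketch under simplifying assumptions: nodes are connected by straight links whose lengths are in light years, every message travels at exactly light speed (so one light year costs one year), and relays forward messages with no processing delay. The node names and distances are made up for illustration. It computes when a message from one node reaches every other node, and then estimates when two conflicting commits are first visible together at some node and when that node's resolution has reached everyone.

```python
import heapq

# Hypothetical network: (node_a, node_b, distance in light years).
LINKS = [
    ("Earth", "RelayA", 4.0),
    ("RelayA", "ColonyB", 6.5),
    ("Earth", "ColonyC", 12.0),
    ("ColonyB", "ColonyC", 3.0),
]

def propagation_years(links, source):
    """Earliest arrival time, in years, of a message sent from `source`
    at year 0 (Dijkstra's shortest-path algorithm over the relay links)."""
    graph = {}
    for a, b, light_years in links:
        graph.setdefault(a, []).append((b, light_years))
        graph.setdefault(b, []).append((a, light_years))
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, light_years in graph.get(node, []):
            candidate = t + light_years
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return best

def merge_resolution_timeline(links, node_a, node_b):
    """Two nodes commit conflicting changes at year 0. Returns the node that
    first holds both commits, the year it can detect the conflict, and the
    year by which its resolution has reached every node."""
    from_a = propagation_years(links, node_a)
    from_b = propagation_years(links, node_b)
    # A node can only see the conflict once BOTH commits have arrived.
    detect_at = {n: max(from_a[n], from_b[n]) for n in from_a}
    resolver = min(detect_at, key=detect_at.get)
    detected = detect_at[resolver]
    everyone_synced = detected + max(propagation_years(links, resolver).values())
    return resolver, detected, everyone_synced
```

With these made-up distances, a conflict between commits made at the same moment on Earth and on ColonyC is first visible at RelayA after 9.5 years, and the resolution has reached every node after 19 years. A git merge that takes seconds today becomes a multi-decade process.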
This will result in novel dynamics to explore, and entirely new ways to consider information which currently is trivial to synchronize.
It’s akin to centuries ago, when spreading information required homing pigeons or riders carrying written letters.