Deeply Thrilling Telegrams

One risk of using LLMs is lazy thinking.

I almost typed in ChatGPT4, “how would I have a webapp call an api to start an ecr docker job asynchronously and notify the user when the job is finished?”

Then I realized I already know the answer: I know the code to do that, and the infrastructure. Asking GPT to just give me the answer can promote lazy thinking and atrophy skills, even if it’s faster and more productive.

However, maybe “coding in English” is the way of the future and we should let some skills go rusty and embrace the change. And if it means a more productive work session to just ask some LLM to write the code for me while I just check its work, then who am I to argue with results?

To that point, I’d say it’s like practicing mental math (e.g. multiplying two two-digit numbers) instead of relying on a calculator. Maybe little things like that are needed to stay as sharp as we can be despite the ease of use of the tools around us.

1.64K views00:38

Every tech advancement has only displaced, not replaced, jobs en masse.

Did digital computers replace the jobs of the “computer” women of the early 1900s who did rote calculations by hand? Yes.

Did advanced phone tech replace the job of phone switch operators? Yes.

Did elevator tech replace the lift operators who used to operate elevators all day? Yes.

Did assembly line workers get replaced by automated robots? Yes.

Did bankers get replaced by online banking? Yes.

Did online shopping replace many retail cashier workers? Yes.

No industry is safe from tech advancements. However, tech has not been responsible for mass unemployment. Those individual workers may be screwed, but the labor force has moved to new jobs that get created from the tech advancements.

Displacement, not replacement, for the collective; replacement, sadly, for the individual.

Now I know the argument is that this time it’s different, that a sufficiently advanced AI can learn how to do new jobs faster than new jobs can be created. To that I just say maybe, but let’s just agree to disagree since it’s a guess anyway.

1.81K views00:43

When I started doing machine learning work around 2007-ish, I learned it was often about handcrafted features, followed by dimensionality reduction, followed by a classifier.

Then deep learning changed the game in 2012.

Allow me to explain in the context of radiology 3D computer vision:

To have the computer predict if there was cancer in a section of an MRI scan, you’d extract a ~4 cm^3 patch, which in numpy terms might be a 64x64x7 array.

Then, you’d come up with some features from that patch based on edges and shapes and textures and Gabor wavelets (resulting in something like a 512x1 numpy array).

From here, you’d use a dimensionality reduction technique to reduce it to a ~5x1 numpy array. PCA is the easiest, but nonlinear ones like locally linear embedding are better.

Finally, you’d do this for hundreds of patients’ cubic patches, with the label being cancer or not, or perhaps the stage or severity of cancer instead of a binary classifier. So X would be an array of shape 300x5, and Y would be 300x1.

You’d finally train a classifier like a support vector machine or random forest to predict which patches had cancer.
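To make the shapes concrete, here’s a rough sketch of that old-school pipeline in plain numpy. Everything in it is an assumption for illustration: the patches are random noise rather than MRI data, the “handcrafted features” are just two histograms (not Gabor wavelets), PCA is done via SVD, and a nearest-centroid rule stands in for the SVM or random forest so the example has no extra dependencies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for 300 patients' 64x64x7 patches and binary labels.
patches = rng.normal(size=(300, 64, 64, 7))
labels = rng.integers(0, 2, size=300)

def handcrafted_features(patch):
    """Toy descriptor: intensity histogram plus gradient-magnitude histogram."""
    gx, gy, gz = np.gradient(patch)
    grad_mag = np.sqrt(gx**2 + gy**2 + gz**2)
    intensity_hist, _ = np.histogram(patch, bins=256, range=(-4, 4))
    gradient_hist, _ = np.histogram(grad_mag, bins=256, range=(0, 4))
    return np.concatenate([intensity_hist, gradient_hist]).astype(float)  # length 512

features = np.stack([handcrafted_features(p) for p in patches])  # (300, 512)

# PCA via SVD: center the features, then project onto the top 5 directions.
centered = features - features.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
X = centered @ vt[:5].T        # (300, 5)
Y = labels.reshape(-1, 1)      # (300, 1)

# Nearest-centroid classifier: a simple stand-in for an SVM or random forest.
centroids = np.stack([X[labels == c].mean(axis=0) for c in (0, 1)])         # (2, 5)
distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)  # (300, 2)
predictions = distances.argmin(axis=1)

print(X.shape, Y.shape, predictions.shape)
```

On real data you’d swap in proper texture features and a real classifier, and evaluate on held-out patients rather than the training set.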

Then, in 2012, when AlexNet won the ImageNet challenge by an enormous margin (from using GPUs to train a neural network) and deep learning took off, everything changed.

Instead of extracting texture features, you had learnable convolution filters automatically hooked into a dimensionality reduction pipeline of layers attached to a classifier head all in one neural network package.

Instead of having each of those machine learning processes separate, all were parameterized and learned simultaneously via backpropagation.

So when those patches come in, a neural network like Swin or a 3D ResNet extracts features and collapses them down to that 300x5 embedding vector latent space. It learns what parameters for the various transformations the data goes through will result in a 300x5 array that best separates cancer from not.

It learns this by tweaking the parameters of the inbuilt feature extractor and dimensionality reduction operations (well, it’s sort of moved beyond that with modern architectures but regardless).

It basically looks at a bunch of 3D 4 cm^3 patches and tweaks the parameters of that architecture’s internal operations such that a 64x64x7 array fed in compresses down to a 5x1 vector (“embedding”) in a way that best separates cancer from not (hot dog?).

A lot of engineers in industry are still using the older techniques, and I think there’s a lot of room for them to compete on par with deep learning, especially with algorithms like XGBoost and potentially even neural network feature extractors.

In medical AI, there’s a lot of knowledge in doctors’ heads about what is cancer and what isn’t, and how the treatment is affecting the cancer, that can potentially be quantified and used in conjunction with the fully automatic feature extraction component of neural networks.

Regardless, I think about the way machine learning has progressed and also recognize that a lot of people in the tech industry are still using the older techniques in practice when developing new algorithms - not every new algorithm to solve a problem being developed is neural network based, and perhaps shouldn’t be if it’s not the best tool for the job.

Anyway, hope that was informative.

1.8K views10:15

With any incremental LLM improvement, like Claude 3, I’m reminded that people really, really want to believe machines are self-aware and conscious, and thus project that belief onto mechanical systems like LLMs.

It’s fun to watch as these models become more clever and create more accurate internal relationships between language data and the world being described.

Good to remember it’s still a bunch of matrix multiplications… (but perhaps that’s all organic intelligence is too?). LLMs could be entirely simulated with enough mechanical gears and gates.

My personal opinion is that consciousness requires the organic chemistry of biological neurons, and that the internal world models emerging from LLMs in order to better predict the next token are something entirely different from sentience. We have a category issue when we try to fit the way machines process information to the way we do.

1.73K viewsedited 22:49

I’m a very visual thinker.

I tend to think in terms of diagrams that I then use words to explicate.

I’ll imagine various future scenarios with some level of fidelity, and then perhaps visualize a timeline from today to then, and try to fill in the gaps in that timeline in my imagination.

I’ll envision something abstract like “society” as a net over the globe with various nodes and flows of people and cultures and institutions.

I’ll picture rows of data, or geometric relationships between concepts in my head.

I’ll see interconnected software modules as Lego blocks to mix and match in my head.

Language is part of this, but Broca’s Area is only a small part of the human brain. And yet so is the visual cortex.

The human is layered and complex, and the study of how we “think” is a field in its infancy, despite having been pondered by philosophers for millennia.

1.73K views23:08

There is an old hacker ethos that information should not be owned by any one person or organization, that information and intelligence and knowledge belong to the world. That capitalistic pigs are greedy and communistic pigs are power hungry, and hackers are technologists beyond both.

This speaks a lot to open source, open weights, open data.

Modern AI systems have finally reached critical mass where we are seeing branches in ideology regarding openness and licensing.

On one extreme, you have entirely closed-source AI companies like OpenAI and Google and Anthropic. These models dominate leaderboards and it’s interesting that open source still has not fully caught up.

Then you have Mistral, which keeps its flagship models closed source (presumably to make money) and open sources its older models. And let’s be honest - would they even have open sourced Mixtral if it hadn’t been leaked?

Then you have Meta, which, after the success of open-sourcing PyTorch and building a community of feedback and free labor (despite giving up IP secrets), decided to open source their Llama models… well, after the weights were leaked too. But they have the internal resources to train good weights on lots of data, and so the open source community is like a babe at the teat, waiting for Llama 3 weights to drop. They also sometimes open their datasets.

Then you have Stability which opens everything except their training data.

And then you have attitudes like below, in which there is a real deep belief that open-source AI will spark a Cambrian explosion of progress, an enthusiastic embrace of open source code, open weights, open data, open everything for the benefit of us all, good and bad alike.

In the modern world of AI, with everyone on the internet from profiteers to zealous hackers to giant conglomerates to the actual open source coders, the decisions that each person or company or organization makes about AI source code licensing are quite revealing as to their beliefs and help shape the AI ecosystem. There’s a place for all.

We should consider what we believe about the various licensing of closed and open source code, weights, and data, and what each party’s incentives are with their AI licensing decisions.

2K views02:02

If you think about what simulating reality would consist of, think of the fact that by the time the central nervous system signals hit your brain, the brain trusts the signals coming in.

Can sensations like falling or pressure or heat or hunger be replicated and simulated so that by the time the nerve signals reach your brain, the brain just blindly accepts whatever signal was fed in, not realizing it can be manipulated?

1.48K views05:28

In a future galactic society, assuming Einstein’s and quantum theories hold, the speed of light will dictate the flow of information. Information will be flowing to and from Earth, to and from relay stations years away, to and from solar systems centuries and millennia away.

The goings on, and the social theories, from Earth will take many years to propagate across the galaxy. And the information from the galaxy will take many years to propagate to Earth.

If you think about it like an octopus, the central head has the most neurons, but the tentacles all have their own set of neurons to deal with things locally.

Yet that’s a terra-centric view, assuming Earth is a centralized hub.

In actuality, it will be distributed and each node full of life will be as important as Earth.

So instead of an octopus model, we must consider the whole galaxy as a web where information is flowing across it at close to the speed of light.

This provides interesting questions regarding synchronization of code via git, or money via cryptocurrency. When merge conflicts occur, what happens and how long does that merge resolution take to propagate to the rest of the nodes?

In addition, information will likely be diluted and corrupted as it traverses over the years. A relay station four light years away will be actively receiving info from Earth lagged by four years, and information from the other direction also lagged by four years.
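As a back-of-the-envelope sketch of that lag: a signal at light speed covers one light year per year, so the delay in years is just the distance in light years. The node names and coordinates below are made up for illustration, and the code assumes each node broadcasts directly to every other node in a straight line rather than hopping through relays.

```python
import math

# Hypothetical node positions in light years, with Earth at the origin.
nodes = {
    "Earth": (0.0, 0.0, 0.0),
    "RelayA": (4.0, 0.0, 0.0),
    "ColonyB": (4.0, 3.0, 0.0),
    "ColonyC": (0.0, 0.0, 120.0),
}

def one_way_delay_years(a, b):
    """At light speed, delay in years equals straight-line distance in light years."""
    return math.dist(nodes[a], nodes[b])

def propagation_time_years(origin):
    """Years until a change made at `origin` has reached every other node."""
    return max(one_way_delay_years(origin, other) for other in nodes if other != origin)

print(one_way_delay_years("Earth", "RelayA"))      # one-way lag to the relay
print(2 * one_way_delay_years("Earth", "RelayA"))  # round trip, e.g. ask and answer
print(propagation_time_years("Earth"))             # slowest node to hear from Earth
```

A merge resolution decided at one node can’t be known everywhere until at least that last number of years has passed, and any node that kept committing in the meantime may already have diverged again.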

This will result in novel dynamics to explore, and entirely new ways to consider information which currently is trivial to synchronize.

It’s akin to centuries ago, when spreading information required homing pigeons or riders carrying written letters.

1.35K views01:24

Free speech protects all the loudmouthed protests and general public manipulation you want…

As long as you don’t cross over into violence.

There, your free speech rights end.

But the most interesting revolutionaries toed that line; they incited revolution, if short of violence.

But with the goalposts of what constitutes violence moving from physical bodily harm to psychological terror to hate speech to merely mean words… we must understand where the line between free speech and inciting violence actually sits.

Interesting to see people try to find that line in this news cycle’s college protests, knowing that power always tries to push the limits.

What is allowed and what is forbidden by modern society?

Is it merely level of bodily harm and removal of consent?

1.24K views04:24

Turning 37 in a few days.

Pleased with how my body’s aging.

Lifts going well.

Medical AI business going well.

Brain stimulated with books.

Husband to a beautiful caring wife, father to a growing son.

Missing dearly those family members like my dad who have passed away.

Staying in my lane.

Maybe this year I’ll finally get some grey hairs.

I sincerely hope you’re also flourishing amidst the chaos of the world.

I appreciate all my followers.

1.32K views18:45

Spreading the good word of Perplexity since December 2023 like a good little missionary.

In December 2022, I spent all month telling everyone I knew about this new ChatGPT app, touting it as the latest and greatest tool for the regular person.

Almost nobody listened.

But they eventually realized that I was right and they should have listened to me earlier.

Now, since December 2023, I’ve been doing it again.

Few listened, as per usual, but those who did listen and are really familiar with how to best use Perplexity for knowledge querying (5 months of practice adds up) are gaining edges everywhere in their lives.

Use Perplexity - make it your first pass search for any question.

The key is to give it just a single sentence question ending in a question mark - don’t get fancy with prompts or context, get clear.

1.21K views06:09

I’m getting jaded by LLM code generating ability.

It’s become so normalized for me in my daily coding workflow and flowstate that it’s going to take a lot to impress me.

It’s a pretty obvious line to me in my head what I can ask it to do and what I can’t ask it to do.

GPT-4o supposedly being 100 Elo points above any other model in coding ability means that it is better than GPT-4 on coding tasks 64% of the time, and worse than GPT-4 36% of the time.

I get excellent code generated for specific types of tasks, and it still struggles in the same types of requests for more complex tasks.

I wonder when there will be an LLM (maybe GPT-5?) where I will feel like a whole new slew of things I can comfortably ask it to do has opened up a new vista of coding, like when I switched from coding raw to coding with GPT-3.5 and Cursor.

That probably looks like an Elo difference over GPT-4 of at least 500 (Elo > 1800), which corresponds to a model beating GPT-4 about 95% of the time, before I’ll feel like an order-of-magnitude difference in code generation has occurred.
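For reference, here’s a quick sketch of the standard Elo expected-score formula behind those percentages. The function name is my own, and it assumes the leaderboard uses the usual 400-point logistic scale.

```python
def elo_expected_score(rating_gap):
    """Expected win rate of the higher-rated model under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))

for gap in (100, 500):
    print(f"{gap} Elo points -> {elo_expected_score(gap):.1%} expected win rate")
# 100 Elo points -> 64.0% expected win rate
# 500 Elo points -> 94.7% expected win rate
```

So a 100-point lead is a modest edge, while a 500-point lead means winning roughly 19 head-to-head comparisons out of 20.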

Or maybe instead of such a discrete jump, the improvement will be gradual as the competition catches up to GPT-4o in code generation, the whole ecosystem of LLMs edging each other out by a few Elo points every month?

Time will tell, but I can’t wait until the next jump in code-generation LLM occurs, whether gradual or all at once, for my own craft.

1.25K views04:01

Keeping the most advanced AI centralized for the greater good is an exercise in tyranny & totalitarianism.

I fear the centralization of tyranny, the tyranny of centralization, way more than I fear potential hate speech or knowledge on advanced weaponry being put in front of plebs.

981 viewsedited 03:16

While quantum entanglement can synchronize the spin of two particles light years apart, the reason you can’t send information faster than the speed of light is that while you know the spins are synced, you don’t know (and can’t choose) what they will be.

My grad school quantum physics professor explained it like this:

Imagine you shuffled two decks of cards separately, and keep them hidden, face down.

Then, you rub them against each other, and magically (“quantum entanglement” according to physicists), the randomly shuffled orders are the same.

Now, you put each synchronized deck of cards in a separate spaceship and send each one to the opposite side of the galaxy.

When you turn over the Queen of Hearts followed by the Seven of Spades on one side of the galaxy, the same sequence shows up in the deck of cards on the other side (the spin of the entangled particle, to escape the analogy), because they were entangled.

But that’s also why you can’t send information faster than the speed of light via quantum entanglement as we understand it today - because while it’s miraculous that the entangled particles across the galaxy will collapse their quantum wave functions to the same specific quantum states, and those states show that knowledge stretches across time, no signal of information can pass across since the collapsed value is still randomly decided.

As in, you know the order of the cards is the same across the galaxy, but you can’t predict ahead of time what those cards will be, so you can’t send information faster than light via quantum entanglement.
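Here’s a toy simulation of the card analogy, as a sketch only: it models the decks classically with a shared random shuffle, which captures the “synced but random” point and nothing else about real quantum mechanics. The names are made up. Alice “intends” to send a bit each round, but since she can’t choose the order of the cards, Bob’s top card is unrelated to it.

```python
import random
from collections import Counter

rng = random.Random(42)  # stand-in for nature's randomness; neither side controls it

def entangle_decks():
    """Return two decks that share one random order nobody got to choose."""
    order = list(range(52))
    rng.shuffle(order)
    return order[:], order[:]

counts = Counter()
for intended_bit in (0, 1) * 5000:
    alice_deck, bob_deck = entangle_decks()
    assert alice_deck == bob_deck  # the decks always match...
    # ...but flipping cards does not change their order, so the parity of
    # Bob's top card carries no trace of the bit Alice wanted to send.
    counts[(intended_bit, bob_deck[0] % 2)] += 1

print(counts)
```

All four (intended bit, observed parity) combinations come out at roughly a quarter of the rounds each, which is what “correlated but useless for signalling” looks like in numbers.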

1.25K views05:17

When will we actually empirically look at government policies and determine if they were effective?

Did the Dept of Ed quantitatively improve education?

Did the Dept of Agriculture quantitatively improve our food supply?

Did the income tax improve our economy?

Did the Environmental Protection Agency protect the environment?

I’m betting the answer isn’t all yes nor all no, but without sufficient quantitative metrics that decide if a policy continues, all policies seem to just settle into the sediment of bureaucracy, only to be spoken of as “this is just the way it’s done” with too much bureaucratic momentum to ever consider reversing, downscaling, or otherwise acting on ineffectual results.

Sad state of affairs tbh.

1.3K views04:23

We’re spending hundreds of billions of dollars to feed more sentences into LLMs so they better predict the next language token, but comparatively spending mere pennies on analyzing the thought processes of typical humans during a typical day’s decision making. Retarded AI dev.

1.27K views04:23

Do you realize how much easier space travel would be if we had FTL communication (not necessarily FTL travel)?

Quantum entanglement can’t help, since you can’t control the resulting spin of the entangled particles once they’re observed. So while you know they will end up with the same spin, even across the galaxy, you can’t control that spin and thus cannot send information between them.

Temporary micro-wormholes might work, but also would likely require a machine to travel to the other side before establishing a tunnel.

Hopefully AI helps us devise new physics (that’s falsifiable by experiment, the bedrock of true science) that can enable it.

Otherwise it’s gonna be interesting as different cultures, different markets, different currency rates, are all out of sync across the galaxy.

Can you imagine the git merge conflicts!?

995 views02:01

This is the future we fight against: powerful AI under lock and key at a military base.

The best AI should be open for all.

“Open” as in open source, open weights, and maybe even open data if we’re lucky.

Not closed source but “free” (which is what OpenAI thinks the word “open” means).

And definitely not under lock and key to only be used by a few “trusted” persons.

The military controlling it because it’s too powerful for the regular pleb is just too dystopian and totalitarian.

They’ll use it against us all, such is the nature of power.

But the drooling power-hungry bureaucrats will couch it under the guise of “we can’t let those evil Chinese and Russian bad guys get it! They’ll steal our best ideas, build on top of it, and then never open up their super DUPER powerful AI!”

I believe there’s a greater risk of corruption from centralized powerful AI from within the gates than from losing intellectual property to villains but that’s just me.

I say give it to everyone, all at once, and let the community build upon it. The old hacker ethos of “information belongs to the world, information should be free to all!” will prevail in the long run.

You just need one good magnet torrent link and it’s over.

1.17K views03:00

Open-source, open-weights, open-data, decentralized AGI can help return the web to its roots.

1.07K views05:57

So I use https://perplexity.ai before Google for any knowledge based question. It works better if you ask it questions in complete sentences.

Perplexity is an entirely better way to gain knowledge from the web than search engines.

It’s the killer app because it’s an abstraction atop the LLMs - as in you can choose to use GPT-3.5 or GPT-4o or Claude Opus or Llama 3 as the backend summarizer.

But all it’s doing is searching the web, summarizing the results, and spitting the LLM-generated summary back out to you in paragraphs and bullet points and the like.

All this the free version of Perplexity provides.

Use it every day.

Pay the $20/month for the Pro version of Perplexity?

Regarding the underlying LLM being used to summarize your search results, how much would you want GPT-4 vs GPT-3.5 doing the summarization? Meh?

The second benefit to Pro is it does a step where it uses an LLM to “understand your question” first and then generate a few web search terms based on its preprocessing. Is this beneficial? Maybe?

The third benefit is early access to features like their new Pages, where you can do a bunch of your own queries and then turn the results into a Wikipedia-looking page on a topic that you can immediately share with other people to educate them.

So I pay for it, because I like advanced features, but imho you’re getting most of the real benefits (an order of magnitude better knowledge gathering experience) from the free version.

Use it.

1.22K views05:12
