[sdiy] Chat GPT Image analysis.
brianw
brianw at audiobanshee.com
Wed Oct 18 21:46:35 CEST 2023
I don't want to get off the subject of synths, but since my impression of AI is closely related to my understanding of the mathematics behind convolution reverb, I'll share my view. It's at least vaguely related to sound.
Impulse-based reverb plugins are probably familiar to folks here. The mathematics behind them is convolution. Any LTI (linear time-invariant) system can be perfectly modeled by its impulse response. In other words, you can "train" your reverb plugin on Carnegie Hall, and then you can run any piece of audio through the plugin and it will sound exactly as if it were performed in Carnegie Hall. This "learning" extends to every possible space - caves, the Volkswagen New Beetle, a forest, anywhere - and it's even possible to synthesize a space with characteristics you desire, even though that space exists nowhere in the real world. You can even model the body of a Stradivarius violin without being able to recreate the physical form. The point is that the impulse response is not the same as the physical space, but for the purposes of audio convolution it gives the same effective result. To summarize, this class of reverb plugins is the combination of a class of mathematical functions (convolution) and various sets of data (impulse responses). The result sounds great, but it's not equivalent to building a physical space or instrument.
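The math here is small enough to sketch directly. A minimal direct-form convolution in Python - the "impulse response" below is made-up toy data, not a real room measurement:

```python
# Convolution reverb in miniature: the wet output is the discrete
# convolution of the dry signal with the room's impulse response:
#   y[n] = sum over k of x[k] * h[n - k]

def convolve(dry, ir):
    out = [0.0] * (len(dry) + len(ir) - 1)
    for n, x in enumerate(dry):
        for k, h in enumerate(ir):
            out[n + k] += x * h   # each input sample excites the whole IR
    return out

dry = [1.0, 0.0, 0.5]     # short dry signal
ir = [1.0, 0.6, 0.36]     # toy "room": a spike followed by decaying echoes
print(convolve(dry, ir))  # every input sample smeared by the room's decay
```

Note that convolving with a single unit impulse [1.0] returns the input unchanged - which is exactly why playing an impulse in a room and recording what comes back captures everything the plugin needs.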
AI is just another class of mathematical functions, and training creates the data for those functions. There are certainly differences, because the data sets are far too large for anyone to synthesize an artificial one by hand. But in my mind it's the same as convolution with various impulse responses - only instead of audio in and audio out, it's maybe text in and image out, image in and image out, or image in and text out. I suppose any kind of data can be put into the system.
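As a crude illustration of "training creates data for the functions": in the sketch below the function class is fixed - a hypothetical one-parameter model y = w * x - and training only fills in its data (the parameter w), much as measuring a room fills in the impulse response for the fixed convolution formula:

```python
# Hypothetical one-parameter "model": y = w * x. The function class is
# fixed; training merely picks the parameter w from example data by
# gradient descent on squared error. Real networks have billions of
# parameters, but the principle is the same.

def train(examples, steps=200, lr=0.05):
    w = 0.0
    for _ in range(steps):
        for x, y in examples:
            err = w * x - y
            w -= lr * err * x   # gradient of 0.5 * err**2 with respect to w
    return w

examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2 * x
print(train(examples))  # converges to roughly 2.0
```

Swap in different examples and the very same function class "learns" a different w - the function never changes, only its data.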
If you're not convinced yet of the limits of AI, then I have a few books to recommend that will put AI in the proper perspective:
Computer Power and Human Reason: From Judgement to Calculation (1976)
- by Joseph Weizenbaum
This one covers the ELIZA effect - we humans are hard-wired to anthropomorphize whenever possible
The Large, the Small, and the Human Mind (1997)
- by Roger Penrose
This one summarizes his two previous books, but with far fewer mathematical details
The Emperor's New Mind: Concerning Computers, Minds, and the Laws of Physics (1989)
and
Shadows of the Mind: A Search for the Missing Science of Consciousness (1994)
- both by Roger Penrose
These books discuss the differences between algorithmic and non-algorithmic calculation; give the definition of a Turing machine; explain how an algorithm that "learns" can never become more than an algorithm; and discuss Gödel's incompleteness theorem and how there are mathematical truths that humans can see but that no algorithmic computer analysis can ever prove. These books are largely over my head, but the one thing I learned is that it's far too easy for the press to simplify AI using terms like "learning" to make it sound like more than just math is going on - and yet mathematicians understand that there are different kinds of rules and proofs, just as there is more than one kind of "infinity."
Although the publication dates on these books go back decades, the math and the psychology have not actually changed.
Brian
On Oct 18, 2023, at 11:23 AM, Quincas Moreira via Synth-diy <synth-diy at synth-diy.org> wrote:
> Interesting, but my experience is not that it continues my sentences; rather, it replies to my queries with very useful information and ideas. I know nothing is original and it's derived from farming existing information, but the result is far more interesting, engaging, and useful than simple predictive text. And it has already learned languages it was not trained on, etc. I'm not scared of it, but I'm intrigued and interested; it's a neural network and seems to be evolving beyond what even its programmers expected.
>
> On Wed 18 Oct 2023 at 11:43 cheater cheater via Synth-diy <synth-diy at synth-diy.org> wrote:
>> > I have to say I am very excited to see where this GPT thing goes but also a little frightened by it.
>>
>> Most people who say they are either excited or frightened by GPT
>> say that because they are mystified by the software - in turn
>> because they don't know what it does. So let me give you a short
>> description.
>>
>> My background: I worked as a software engineer at some of the most
>> famous recent AI startups, which built less public competitors to
>> GPT and ChatGPT.
>>
>> The short of it is: remember on your smart phone, when you're typing
>> out a message, and it shows you the next word you might type above the
>> keyboard? And you can tap it? Sometimes you can keep tapping and a
>> sentence will come out? That's the core idea behind GPT.
>>
>> Basically, what GPT does - the name stands for "Generative
>> Pre-trained Transformer", which sounds far more mystifying than
>> what the software actually does - is that, given the start of a
>> sentence, it finishes that sentence in the most expected way.
>>
>> So let's say you start with:
>>
>> Trees are
>>
>> GPT has read a huge share of the text on the planet. In effect, it
>> has a frequency table of every word that comes after "Trees are".
>> Example continuations are:
>>
>> Trees are green ... (rank 72)
>> Trees are large ... (rank 1)
>> Trees are wooden ... (rank 15)
>>
>> It finds out the most popular word after "Trees are" and tacks it on.
>>
>> Then it repeats with the next one. The most popular word here was
>> "large" (rank 1), so the new prompt is:
>>
>> Trees are large
>>
>> continuations for this might be:
>>
>> Trees are large plants ... (rank 7)
>> Trees are large, green ... (rank 52999)
>> Trees are large and ... (rank 122)
>>
>> and so on.
>>
>> Now OpenAI's GPT actually takes more context than two words. It'll
>> look at the whole paragraph you put in, and figure out the next most
>> probable word to tack on to the end. But it only ever does that: it
>> goes one, word, by, one, word.
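>> The loop described above can be sketched in a few lines. The
>> frequency table here is a made-up toy, not GPT's actual model,
>> which scores subword tokens with a neural network over the whole
>> context:

```python
# Toy greedy "predictive text": repeatedly append the single most
# frequent next word, given the last two words of context.
# The counts below are invented for illustration.

table = {
    "trees are": {"large": 90, "green": 40, "wooden": 15},
    "are large": {"plants": 50, "and": 30},
    "large plants": {"with": 20},
}

def generate(prompt, steps):
    words = prompt.lower().split()
    for _ in range(steps):
        context = " ".join(words[-2:])       # last two words only
        continuations = table.get(context)
        if not continuations:                # no data for this context: stop
            break
        # pick the highest-count continuation and tack it on
        words.append(max(continuations, key=continuations.get))
    return " ".join(words)

print(generate("Trees are", 3))  # trees are large plants with
```

>> Run it on "Trees are" and out comes "trees are large plants with",
>> one most-popular word at a time - and that's the whole trick.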
>>
>> GPT isn't smart. It doesn't know what trees are. When you ask it what
>> trees are, it doesn't think to itself "hmm, what is my definition of
>> a tree, an object I know of?". For GPT, trees don't exist. It has no
>> object permanence - like a toddler. If we started a campaign where,
>> on every forum, mailing list, news website, and encyclopedia, we said
>> that trees are made out of metal, GPT-5 would soon enough start
>> telling people that:
>>
>> Trees are made out of _____ (inserted most popular word: "metal").
>>
>> It's like the kid called to the blackboard who doesn't know how to
>> answer the teacher's question: "Johnny, what is the capital of
>> Colombia?" "It's... uh... er... uh..." (2 minutes pass) "OK, Johnny,
>> B...." "Berlin?" "Bo...." "Bo...dapest?" "Bog..." "Bog roll!"
>>
>> There's no reason to be scared of a precocious phone keyboard.
>>
>> And it isn't going anywhere, because interesting output requires
>> operating on concepts - not just doing guess-the-next-word.
>>
>> GPT has one great application: it's great when you want to be lied
>> to. It's great on assignments like "tell me a sci-fi story" or "tell
>> me about faeries". But otherwise it has the IQ of an absolute idiot.
>>
>> If you want to understand how GPT "thinks", play an online game called
>> Semantle. (google it, I don't want to put in links and end up in spam
>> folders). Once you've won a few games, you know a little bit about how
>> predictive text sees the world of words.
>>
>> On Thu, Oct 12, 2023 at 10:27 PM Kevin Walsh wrote:
>> >
>> > A quiet week so...
>> >
>> > OpenAI.com GPT4.0 has just released image analysis.
>> >
>> > I tried it for a VCO circuit (Schmitt/Inverter) and it gave a decent explanation of the circuit.
>> >
>> > I got it to write Arduino code for a MIDI controlled baby8 sequencer with nothing but prompts.
>> >
>> > I have to say I am very excited to see where this GPT thing goes but also a little frightened by it.
>> >
>> > Thoughts?
>