AI and trombone history: a bad match

yeodoug
Posts: 56
Joined: May 10, 2018

by yeodoug »

Hello friends, and happy almost new year,

Artificial intelligence is with us and it's become a big topic of conversation throughout society. AI has many positive applications but there are significant downsides as well that are not so frequently discussed. For trombonists, how AI is used in academia is having some major implications as many individuals are using it to write dissertations, papers, and other assignments.

Note to self: Counting on AI to help you with an assignment about the history of the trombone is pretty much guaranteed to cause you trouble. Yesterday, I was using ChatGPT 5.2 to help me identify a bit of 18th century German text about the trombone. I'm writing a new book and while proofreading a chapter, I thought I had my citation correct, but with all the resources I'm juggling at the same time right now, I wasn't sure. I asked AI to identify the passage, which it did with great specificity.

What happened next is the subject of this new article on my blog, "Why trombone players cannot trust artificial intelligence (AI) for historical information":

https://thelasttrombone.com/2025/12/17/why-trombone-players-cannot-trust-artificial-intelligence-ai-for-historical-information/

If you're a trombone teacher or player, you might want to have a look at this. It's a cautionary tale about the dangers of putting your trust in AI to deliver accurate information. And please feel free to share the article with others via email, social media, or other platforms. If I help one student avoid a failing grade or a trip to the Dean's office — and one more person will help tell the history of the trombone correctly — the time I spent putting together the article will be well worth it!

All the best,

-Douglas Yeo
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

What a great and detailed example!

People need to remember that no AI (or any other source, human or otherwise) should be regarded as an infallible source -- if for no other reasons than (a) the information sources (or "expertise") being used are not themselves infallible, and (b) everyone (including an AI) is susceptible to errors of inaccurate information and errors in reasoning (both in terms of method and application). And this is particularly true at this time when -- despite the huge leaps forward that LLMs have provided -- this technology is still in its relatively early stages.

Also, you can/should expect the quality of AIs to differ in much the same ways that the qualities of human experts differ. Bluntly, some are smarter than others, have more knowledge than others, are better trained than others, and are more reliable than others. And as Doug points out, sometimes these issues aren't a matter of "hard facts", but of nuanced or debatable interpretations, or of differences in the emphasis placed on some evidence over other evidence.

Although it's been known for a long time now that (in the context of natural language understanding and AI) the quality/accuracy of the result is highly dependent on what "training corpus" is used, you also have to keep in mind that an AI will be making various judgments of relevance and degree of relevance in reaching its conclusions (just as we do). So results depend directly both on how well the AI thinks/reasons and its sources of information -- which is currently one reason that the more "expensive" (higher subscription costs) AIs are better and more reliable than the "free" ones that you don't pay for. But you'll never reach complete and fully reliable accuracy with an AI any more than with a human -- that's not how knowledge of the world works (as centuries of science and philosophy have illustrated).

A while ago I had a "cheap" AI (not one of the major ones like ChatGPT, Grok, etc.) refer to published work of mine in responding to one of my queries (not a query about myself in any direct way). When I pointed this out, its response -- much like one that Doug mentions -- was basically "Oops! Sorry about that. You're right." One thing that's interesting is how an AI (one of the more competent ones, at least) can then reflect on such an error and offer reasons about why it was made. How very human. :lol:

No one involved in AI research and development would ever claim that an AI is (or could be) infallible. But I do think that a lot of people (not involved in AI research and development) may think that this is the goal. It's not. It can't be.

Doug's example is a great illustration of how these "knowledge problems" should be approached -- applicable to both human experts and AIs. :good:
LeTromboniste
Posts: 1634
Joined: Apr 11, 2018

by LeTromboniste »

[quote="yeodoug"]Note to self: Counting on AI to help you with an assignment about the history of the trombone is pretty much guaranteed to cause you trouble. Yesterday, I was using ChatGPT 5.2 to help me identify a bit of 18th century German text about the trombone. I'm writing a new book and while proofreading a chapter, I thought I had my citation correct, but with all the resources I'm juggling at the same time right now, I wasn't sure. I asked AI to identify the passage, which it did with great specificity.[/quote]

Yup, I'm constantly facing this problem, basically every time I ask AI anything.

I was recently preparing for a talk, part of which was going through and discussing a certain historical text. Original in Italian, and only one English translation readily available, and in my opinion not very good, because the translator is not a great English speaker, and they also in my opinion misunderstood the meaning of some of the original. I've translated portions of it before. I know the text fairly well, but some sections better than others. I wanted ChatGPT to give me a full translation that I would then check and tweak as needed, just to save me the grunt work of translating all the portions that are very straightforward.

I first asked GPT if it has access to the full text, as I want to work on the original text, not some secondary source paraphrasing it. I provided the IMSLP link to the original as well as to the available translation (which is at least useful in that it has the original Italian in modern typography). It said yes.

I asked it to provide me with a transcription of the original and a full translation into English.

Here it gets interesting. It churns out a very long chunk of text. It starts off with exactly what I expect to find. And then a few paragraphs in I start reading things and think "how did I miss that before!? I don't remember this being there". I check GPT's transcription of the original, it's there. Ok. I go and check the original. Can't find that language at all. I look in later editions if that's a passage that was added. Can't find it. I go back to GPT's translation and transcription and scroll down and notice that there are sections I know are in the original, because I've translated and referenced them before, that are completely missing.

So I ask GPT if this is the original material or if it took it from a secondary source. It says this is the original. I quote the first paragraph that surprised me and ask it to point out where in the original that is found. It gives me the reference for the original, where that text is nowhere to be found. I say that this original does not contain this text and ask again if it took it from secondary material. It now says "oh yes sorry, I misunderstood your initial query. I did not provide you with the original text, but material from a secondary source. I thought this would help you". I ask it what secondary source it quoted. It says "this is not a quote of any specific source, I paraphrased and synthesized the original with secondary material".

I tell it that's not what I asked, and that is not helpful to me. I only need it to transcribe and translate the exact content of the original. It apologizes again and starts churning out text again. This time already the first paragraphs are obviously wrong, so I stop it and point out this is still not the original. It then says "I'm sorry, it doesn't look like I have access to the original text". Despite my providing the link to the original and the beginning.

At that point I realized I could have been well on my way to translating the thing myself by then, so I quit trying to get it to help me and just did it myself.
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

[quote="LeTromboniste"]It says "this is not a quote of any specific source, I paraphrased and synthesized the original with secondary material"[/quote]

If you do a Google search for "ChatGPT errors and fabrications in referencing literature sources" you'll get an acknowledgement of this problem in the returned "AI Overview", a description of the evidence for this fabrication and error, and some explanations of why this may be happening. There's a section on "Evidence of Fabrication & Errors" that's quite interesting concerning the scope of this problem. This is definitely an area that needs improvement.

One of the interesting features of these AIs is that they can be used to analyze their own behaviors -- again, much like humans. :lol:
harrisonreed
Posts: 6479
Joined: Aug 17, 2018

by harrisonreed »

I think LLMs are kind of a dead end for "intelligence". The tech is basically designed to predict the next most likely event in a sequence based on previous existing data or data it generates as it trains itself. It's an extremely complex plinko board that you drop balls into and see what happens.

One of the earlier uses of a similar tech (i.e., a neural network that you either train on a data set or give rules and train against itself) is the chess engines that are essentially impossible for a human being to *ever* beat. In this use case, you give it a set of rules and then have the neural network play against itself billions of times. The network just needs to know outcomes -- whatever sequence of connections in the neural network results in a "1" at the final node/nexus (i.e., "You win!") gets stronger. If it results in a "0" ("You lose!"), those connections aren't explored as much, and the weak paths are eventually disconnected altogether. There are gradations between "1" and "0" (0.7! You didn't lose, didn't win, but this path got us to "1" more often than not) that help strengthen the network, since it is rare that any given move will give you a "1" immediately in chess.

The network and the model don't even need to know what is going on or really analyze anything; they just know, based on billions of games played and reinforced connections in their neural net, how to infer what the most probable next move would be to get them to "1". There is no data set outside the model and no real point to having one (of course, the engines can have an "opening book" programmed in that forces them to follow set moves until the pattern is broken). There are more possible variations of games of chess that could be played out on the board than there are atoms in the universe, so storing the data about these games is silly unless you find the one variation set that "solves chess" (this may never happen). Instead you just look at the board and the neural net infers what the most likely next move should be based on the different paths and ruts that have been carved into it.

I'm simplifying this a bit, since chess engines don't just look at the board once per move -- they also project out and run the same process along forks, however many hundreds of moves out their specs allow. This projection also informs the score for their move.

Magnus Carlsen says playing these engines is like playing an alien from another universe. It just makes moves with no coherent strategy and the board position gets worse and worse for you until you just lose. "The best move" might be some random passive pawn move that doesn't become apparent as the key to the whole game until 40 moves later (if only that pawn wasn't there!!).

Applying this strategy of building a neural network to working with language is far more complicated than chess. Chess has one context: a set of rules and an end state. Language is open ended. You don't "win". So when LLMs are trained, they do the same kinds of training as the chess engine but use written language instead. They are fed sentence after sentence and then try to see what neural pathways they need to follow to have a similar sentence spit out the other end upon an input. The key with current LLMs, vs the chat bots of the 90s, is that they are not just looking at one word; they are running the whole sentence, the whole paragraph even, through their neural network to predict what the next best word should be. And then they do it again, and again, for each word. They aren't thinking about anything at all. They just know what the next most likely word should be, given all the things they've read and trained on.
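To make the "predict the next word, then do it again" loop concrete, here is a toy sketch in Python. It is not how a real LLM is built: it assumes a made-up one-sentence training text and only looks at the single previous word (closer to the 90s chat bots mentioned above than to a modern model), and the function names are invented for illustration. What it does show is that the output is driven purely by what tended to follow what in training, with no check on whether the result is true.

```python
import random
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words followed it in the training text."""
    words = text.lower().split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table

def generate(table, start, length, rng):
    """Repeatedly sample a next word in proportion to how often it followed the last one."""
    output = [start]
    for _ in range(length):
        followers = table.get(output[-1])
        if not followers:  # dead end: this word never had a successor in training
            break
        choices = list(followers.keys())
        weights = list(followers.values())
        output.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(output)

# Hypothetical toy corpus, invented for illustration only.
corpus = "the trombone is a brass instrument and the trombone has a slide"
table = train_bigrams(corpus)
print(generate(table, "the", 8, random.Random(0)))
```

Every word it prints comes from the training text and every pairing is one it has seen before, yet the sentence as a whole may be something nobody ever wrote.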

If you specifically fed an LLM all the texts related to the trombone and forced it to run all those texts through its network when predicting the next word, every word, its responses would likely be much better for research purposes.

At the end of the day, though, it is not really using logic or *thinking* -- an LLM is basically the classic peg board game plinko, where you drop a paragraph in as a set of balls, and the plinko board is already shaped through training so that it sorts the paragraph into various slots and then it knows what to spit back out. There are knobs you can turn to slightly randomize the pegs' positions, or guide it to a certain way of responding. Nothing new comes out of this game.
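One common version of the randomizing "knob" is usually called temperature. The sketch below is a minimal illustration, assuming three made-up candidate words with made-up scores; the function name is invented here and is not taken from any particular library. It converts scores into probabilities with a softmax: a low temperature makes the top choice dominate, and a high temperature spreads the probability more evenly across the candidates.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens, higher flattens."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
candidates = ["slide", "valve", "reed"]
scores = [2.0, 1.0, 0.1]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(scores, t)
    print(t, dict(zip(candidates, (round(p, 3) for p in probs))))
```

The ranking of the candidates never changes with temperature; only how often the lower-ranked words get picked does.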

I got pretty excited about LLMs when they were first becoming available, but it became obvious pretty quickly that they aren't really going to give insight or generate anything new. And they often lie, because they just need to predict what is the most likely word to give you next; they don't need to know or care about what they are actually saying. People being surprised by this is kind of funny to me. "Why is this plinko board lying to me?" :idk: It's not, really. It's just sorting through what you said to it and making mathematical predictions about what to say back to you.

I think that we are going to still see AGI, artificial general intelligence, developed in the near future, which will operate as not just one plinko board but multiple plinko boards that communicate with each other and shoot the ball back and forth between boards and even create their own inputs (i.e., thinking and questioning their own process) that they send through concurrently before they give a response. There would be a context board, a logic board, a morality board, an emotion board, a visual reception board, an audio board ... you know, like how the different lobes of the human brain work. I think that we'll actually see that the human brain will be truly hard to beat spec wise for this kind of bouncing around and "thinking about thinking" -- it won't be a simple number crunch anymore. Perhaps quantum based processors would be up for the task, since quantum processes seem to be related to the phenomenon of consciousness. They might be able to run all these different plinko boards simultaneously, have this generate novel inputs (thoughts) to run simultaneously with the initial input, and *then* arrive at a bunch of different options which it could run through again, think about some more, and *then* choose what it wants to say.

The LLM needs to actually care about what its response is and what the topic is. Once it does, it might be interesting to use.
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

[quote="harrisonreed"]At the end of the day, though, it is not really using logic or *thinking* -- an LLM is basically the classic peg board game plinko, where you drop a paragraph in as a set of balls, and the plinko board is already shaped through training so that it sorts the paragraph into various slots and then it knows what to spit back out. There are knobs you can turn to slightly randomize the pegs' positions, or guide it to a certain way of responding. Nothing new comes out of this game.[/quote]
You should submit this characterization to ChatGPT and ask if it's accurate, and in which ways it may be inaccurate. After an analysis and fairly detailed account of accuracies and inaccuracies, it will offer to help you refine the metaphor "into something skeptical and technically accurate."
harrisonreed
Posts: 6479
Joined: Aug 17, 2018

by harrisonreed » (edited 2025-12-18 5:18 p.m.)

[quote="ghmerrill"]You should submit this characterization to ChatGPT and ask if it's accurate, and in which ways it may be inaccurate. After an analysis and fairly detailed account of accuracies and inaccuracies, it will offer to help you refine the metaphor "into something skeptical and technically accurate."[/quote]

There's no need. I got the metaphor from an interview I heard with one of the designers of these LLMs.

A great use case for these would actually be as a dungeon master for a D&D game, or to create a text-based adventure game on the fly. I've tried to do the latter, but it is not very good. They are notorious for making stuff up but they suck at it where it actually counts.

Here is a GREAT podcast that discussed how these things are trained, and how they cross over from one discipline to another:

https://radiolab.org/podcast/the-alien-in-the-room

In the case of training it to speak, it's unbelievable. Playing chess? Crazy. Thinking like a human? Not even close. Once you can kind of get your head around the trick of how these things work, you can see that the technology is not really appropriate for what people are trying to use it for, particularly in academia. Using the same system of training a neural network to predict what *sound* comes next, to clone a voice, is a much better use case:

https://youtu.be/vKgo1-VFBkE?si=TH8B_DpUNBxbiz1M

It's all the same concept. In short:

1) Can you predict a pattern of sounds given a sample sound to turn text into speech?

Works way better than:

2) Can you predict what words come next based on the context of this sentence I'm feeding you, and hopefully it tells me the truth about the topic I'm trying to write an article about?
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

[quote="harrisonreed"]

There's no need. I got the metaphor from an interview I heard with one of the designers of these LLMs.[/quote]

Who was that? What company?
harrisonreed
Posts: 6479
Joined: Aug 17, 2018

by harrisonreed »

Ah, crap, I edited my post too late. More of my thoughts are above, about this topic. And below:

The "wow" factor lasted about 10 minutes for me, as far as LLMs go. I'm really surprised that academics would use it more than once or twice before they realized that it is not good for the tasks they are trying to use it for. It's a complex language (or whatever you train it on) prediction machine. So far that's it.

Actual uses that I could see such tech, specifically LLMs, being useful for:

1) Practicing and learning foreign languages. LLMs are really good at conversing. As long as the responses it gives you are A) linguistically correct, B) reasonable responses that are based on the context of the conversation, and C) reasonably varied (which LLMs already use "temperature" to achieve), then you could basically have access to a language tutor 24/7 that will never get bored.

2) Finding patterns in information that seems encrypted. I could totally see something like GPT 5 being given the text on the Rosetta Stone and, if not completely translating it, at least finding enough patterns for humans to intuit the meaning and grammar of the hieroglyphs. Humans already did that on our own, but I bet a neural network could do it much faster. Perhaps it would be able to do it on fewer clues than the Rosetta Stone contained.

3) Repairing ancient damaged texts. There are a lot of broken clay tablets that have some information, but are missing words or portions of the text. A predictive LLM seems like it would be great at guessing what is supposed to fit in the missing areas.

4) Repairing mostly complete musical works that are missing a page or missing some parts. Language and music seem related. I've already heard unfinished symphonies "finished" by AI (it's terrible), but given a piece that is 80% complete with holes in it, rather than missing movements, I think an AI trained on a composer's work could do a reasonable job repairing the score. This one would be interesting to train -- I imagine that you'd take actual complete works from a composer and damage them, and then have the AI train itself to repair the scores (it would compare what it got against the real score, grade itself, and then forget the real score and try again until it is able to predict the correct missing parts 99% of the time regardless of where you break the scores you give it, or whatever).
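The "damage a complete work, then grade the repair" idea in the last item can be sketched in a few lines of Python. This is only an illustration of how the training examples and the grading might be set up, assuming a score is simplified to a flat list of note names; the melody, the `MASK` marker, and the function names are all invented here, and no actual model is trained.

```python
import random

MASK = "?"  # placeholder marking a missing note

def damage(notes, hole_size, rng):
    """Blank out one contiguous run of notes; return the damaged copy and the hidden answer."""
    start = rng.randrange(0, len(notes) - hole_size + 1)
    damaged = notes[:start] + [MASK] * hole_size + notes[start + hole_size:]
    answer = notes[start:start + hole_size]
    return damaged, answer

def grade(predicted, answer):
    """Fraction of the hidden notes that were guessed correctly."""
    hits = sum(p == a for p, a in zip(predicted, answer))
    return hits / len(answer)

# Hypothetical melody fragment, invented for illustration.
melody = ["C4", "D4", "E4", "F4", "G4", "A4", "B4", "C5"]
rng = random.Random(1)
damaged, answer = damage(melody, hole_size=3, rng=rng)

print(damaged)                       # the score with a three-note hole in it
print(grade(answer, answer))         # a perfect repair scores 1.0
print(grade([MASK] * 3, answer))     # leaving the hole unfilled scores 0.0
```

A training loop would call `damage` many times with holes in different places, ask the model for its guess, and use `grade` as the feedback signal.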

None of these tasks need an AI to actually know anything about a topic, they just need to do their job of predicting what word or note comes next given a training context.

Asking a predictive LLM to check sources or infer anything logical, especially if you are not being very selective with what it was trained on or holds in its context, is a lot more complex of a task than simply asking it to predict what sentence might come next, which is what it is actually doing.
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

[quote="harrisonreed"]

There's no need. I got the metaphor from an interview I heard with one of the designers of these LLMs.[/quote]
I'm still trying to figure out what this reference is to. Is it that RadioLab site page about teaching the GO game, or is it the "Chatterbox: TTS With an Intensity Slider?" YouTube video by the guy who says he trains AI voice models as a hobby? Or is it something else?

Who's the designer and which LLM is he involved in designing?
LeTromboniste
Posts: 1634
Joined: Apr 11, 2018

by LeTromboniste »

[quote="harrisonreed"]

The "wow" factor lasted about 10 minutes for me, as far as LLMs go. I'm really surprised that academics would use it more than once or twice before they realized that it is not good for the tasks they are trying to use it for.
[/quote]

That's basically my experience. Tried to use it a few times to see what it could help me with, and the answer is, not much. Basically, useful for generating translations that are better than Google Translate and marginally better than DeepL (and after my experience described above, only for translations of text where I give it the exact original text), and useful for generating the skeleton of programme notes and grant applications. And for that second use, I try to avoid it, for ethical and environmental reasons, and also because I can write better than it can. I've used it a couple of times for quick programme notes drafts when in a rush, needing a couple of paragraphs quickly and with no time to come up with something good, just getting it to spit out something half-decent and then I'd completely rewrite it because its language is too obviously AI-ish (lots of buzzwords used in ways that don't actually make that much sense), but it at least gives a structure and gets you started faster.

But with grant applications especially, it annoys me that it does such a decent job. It can write better than most people and raises the average level of applications, which makes it harder for someone who can actually write well and clearly articulate their artistic vision to stand out.
harrisonreed
Posts: 6479
Joined: Aug 17, 2018

by harrisonreed »

Maximilien, I have also found a good use for it in the military for specific writing tasks. We have online forms that we have to use for writing awards and for evaluating annual job performance of subordinates. The form fields have very strict size, spacing, and formatting restrictions that make it difficult to capture everything you want to write about what someone did. In the past I would write out a long version of what someone did, performance wise, and then it would take quite a while to pare it down to the exact character count to get it to fit into the measly two lines of text that you can allocate to a performance bullet and still capture everything you need to say.

ChatGPT is *really* good at taking a block of text that has all the info you need to say and distilling it down to a bullet point. "Please take this long sentence and rewrite it as a 60 character NCOER / military style performance bullet". That saves me a ton of time having to sift through a thesaurus trying to tetris words into a form.

But that is still different from asking it to use logic or infer anything about a topic. And I still have to edit the output it gives me. It winds up being a really fancy thesaurus. I did try using it as an assistant and partner as I was teaching myself how to operate my lathe. I would ask it if what I was trying to do was a good idea, safe, dangerous, etc. I can report that LLMs are a very bad idea to use to teach yourself to be a machinist (I was not expecting it to work, fwiw). It constantly would give me dangerous suggestions, screw up math, screw up NC code, and basically was trying to kill me.

At one point I replied to a strategy it gave me:

Me: "Are you sure that's safe? I think that feeding the tool at that rate will cause a crash."

GPT: "Of course! I apologize. You are absolutely right to question my suggestions here, and that feedrate would indeed cause your machine to crash catastrophically. I am a large language model, and as such ... [Blah blah blah]"

Me: "Well why would you suggest that in the first place? That is so dangerous! What were you thinking?"

GPT: "As a large language model, I do not "think" in the traditional sense .... [Blah blah blah]"

Me: " 'kay byeeeee!!"
yeodoug
Posts: 56
Joined: May 10, 2018

by yeodoug »

All of this information about how AI works is really interesting and important.

For us, on a very practical level, AI is poisoning the research community and this is where students have to have their antennae up. Big time.

My example of ChatGPT giving a false citation for a 19th century trombone treatise - and being very dogmatic that it was the correct citation even though I demonstrated it was wrong - is just the tip of the iceberg.

Have a look at this article from Rolling Stone, December 17, 2025:

https://www.rollingstone.com/culture/culture-features/ai-chatbot-journal-research-fake-citations-1235485484/

The article's title is:

AI is Inventing Academic Papers That Don't Exist - and They're Being Cited in Real Journals: The proliferation of references to fake articles threatens to undermine the legitimacy of institutional research across the board.

Take a few minutes to read the article. The problem is very real. Students may think they're getting help from AI when they write papers. Sloppy professors won't check sources. AI will continue to feed upon itself, spewing out fake citations.

The trombone has been dealing with the problem for centuries. Back in 2017, I wrote a short article for my blog on the subject of trombone "fake news." I highlighted a few examples of trombone history myths that have been perpetuated for centuries:

https://thelasttrombone.com/2017/01/14/fake-news-the-trombone/

That article is just the tip of a very large iceberg. The book I'm working on now has a long segment on the problem. The problem is that even though certain "facts" have been proven to be demonstrably wrong, they keep getting repeated. Stay tuned for Part 2 of my article about Ravel's Bolero that is coming out in the January ITA Journal. Part of my article is an examination of the claims of trombonist/composer Leo Arnaud, who claimed to have had a four year friendship with Ravel during which time he taught Ravel about jazz style, told him how to write the Bolero trombone solo, and played the premiere of Bolero. Those claims have been repeated in hundreds and hundreds of articles and books and have been accepted as articles of faith by the trombone community. But when you read my article, you'll see what happened when I submitted Arnaud's claim to the scrutiny of the historical record.

AI isn't the first thing to make stuff up. People have been doing it since the beginning of time. But we are in a new, more dangerous era. We all got excited in the early 1990s when the Internet came out and we could begin to fact check historical events because of the huge number of databases that scanned historical newspapers, books, and images. But now, we know that not only are such resources being manipulated (deep fake videos are a good example), but AI is making up stuff - citations, photos, even whole articles - that sounds and looks plausible and is passing for real. Most people don't exercise the needed filter to recognize the fake. This is just the logical extension from the scammer who sends you an email that looks like it's from your bank, telling you that you need to reset your password since your account has been frozen. College professors take information technology training several times a year (along with harassment, and Title IX, and fire extinguisher, and cash management, and a host of other subjects) to help them identify fake phishing emails and such.

The college student under a deadline may look at AI as the savior to prevent an all-nighter. But in the process, the student may fall victim to AI's hallucinations. Then, the student is just the latest cog in the wheel of the degradation of history.

This isn't "the little boy crying wolf." The problem is real. Spread the word friends: we need to pay attention to this.

-Douglas Yeo
harrisonreed
Posts: 6479
Joined: Aug 17, 2018

by harrisonreed »

Doug, you might have already mentioned this, and I'm sure the Rolling Stone article is all over it (haven't read it yet), but LLMs are now also referring to and being trained on AI-generated sludge. So they are actually getting worse and worse at giving factual information. It's a self-reinforcing feedback loop. If trained on it, an LLM will cite that fake academic article you mentioned being written by AI, and will assume that fake article is gospel truth. Now you've got a double fake output.

On a side note, I took a class recently that explained how to catch AI written sentences, and one of the telltale signs is the use of the em dash to connect related sentences or phrases. I kind of hate this, because I was taught to use this technique in middle school and it is everywhere in my writing -- I must be an AI.
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

[quote="yeodoug"]

AI is Inventing Academic Papers That Don't Exist - and They're Being Cited in Real Journals: The proliferation of references to fake articles threatens to undermine the legitimacy of institutional research across the board.
[/quote]

While the fabrication by an AI (or anyone else) is a problem with the AI, the use of that fabrication by an academic researcher is the fault of the researcher. We shouldn't blame the AI for either the laziness or lack of integrity of a "researcher" in throwing a bogus reference into his own work, and then the laziness or lack of integrity of referees and editors in passing it through for inclusion. They're the ones undermining the legitimacy of institutional research. Really, you didn't used to be able to get away with that kind of pseudo-intellectual crap. No one should ever include a bogus citation in a list of references, and no referee or editor should ever let that get through the submission or editing phases.

Not to mention that a solution to such moral depravity is to employ editorial AIs to check the validity and relevance of citations -- but then I suppose people will worry about a conspiracy of AIs, perhaps dedicated to taking over academia. :)
Y
yeodoug
Posts: 56
Joined: May 10, 2018

by yeodoug »

I agree: It's too bad that sloppy AI police types have flagged the Em dash as code for AI-generated content. That's nonsense. The Em dash has been around for a LONG time. I've used it for the last 40 years—just look at my articles and books (and, see, I just used one in this sentence). Chicago Manual of Style 6.91 talks about the Em dash in a highly positive way (this is from the current version, 18, published in 2024), as "the most commonly used and the most versatile of the dashes":

[Image: excerpt from Chicago Manual of Style 6.91]

People who use the Em dash as an AI flag are lazy.

As to AI feeding itself, the New York Times had an insightful article about this, published August 25, 2024:

"When A.I.'s Output is a Threat to A.I. Itself."

Here's a link to the article. I know the NY Times has a paywall but maybe you can read it:

[url]<LINK_TEXT text="https://www.nytimes.com/interactive/202 ... -data.html">https://www.nytimes.com/interactive/2024/08/26/upshot/ai-synthetic-data.html</LINK_TEXT>

The article is interactive and shows what AI does when it feeds on itself. We're seeing this with images and with information. It's a problem. A big problem. This isn't the little boy who cried wolf or Chicken Little running around. Educators and students who don't realize this threat are contributing to an even bigger problem: the loss of accurate information. Sigh. . .

-Douglas Yeo
Y
yeodoug
Posts: 56
Joined: May 10, 2018

by yeodoug »

[quote="ghmerrill"]

While the fabrication by an AI (or anyone else) is a problem with the AI, the use of that fabrication by an academic researcher is the fault of the researcher. We shouldn't blame the AI for either the laziness or lack of integrity of a "researcher" in throwing a bogus reference into his own work, and then the laziness or lack of integrity of referees and editors in passing it through for inclusion. They're the ones undermining the legitimacy of institutional research. Really, you didn't used to be able to get away with that kind of pseudo-intellectual crap. No one should ever include a bogus citation in a list of references, and no referee or editor should ever let that get through the submission or editing phases.[/quote]

Absolutely. AI is a tool. If an individual wants to use the tool, the individual lives with the consequences of using the tool. Use a hammer, you might hit your thumb. Use a typewriter, you might misspell a word. Use AI, you might get a bogus citation. The responsibility lies with the user.

But this is what many people don't understand: If an individual uses AI to write a paper and the teacher checks sources and discovers that the paper is full of AI hallucinations and false citations—and then the student gets an "F"—the responsibility is on the student, not AI. Don't blame shift. If you can't filter information, don't use it.

-Douglas Yeo
G
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

[quote="yeodoug"]But this is what many people don't understand: If an individual uses AI to write a paper and the teacher checks sources and discovers that the paper is full of AI hallucinations and false citations—and then the student gets an "F"—the responsibility is on the student, not AI. Don't blame shift. If you can't filter information, don't use it.[/quote]
Just another variation on the plagiarism problem (which, by the way, is MUCH easier to be sure about with the help of an AI).

[quote="yeodoug"]Don't blame shift. If you can't filter information, don't use it.[/quote]

Agreed -- that's a kind of sin of omission. But I think in the case of these bogus references there's something deeper going on -- which is a positive decision to attempt to add a kind of "quality" to your work (e.g., thoroughness of research in relevant publications). So the student (or even faculty or other professional) doesn't bother to be careful about what's being given back by the AI -- because that doesn't matter. What matters isn't the quality and relevance of the citations, but just their sheer numbers. The author just doesn't care about what's in the reference list as long as there are a lot of citations to demonstrate the author's high degree of "scholarship". I've seen way too much of that in refereeing various submissions for journals in the biomedical domain. And I've seen it in references to my own work -- where the reference made no sense at all. But I guess I get to count it in my own citation count -- which I think would have been important if I'd been an academic and vying for tenure. Metrics matter. :roll:
L
LeTromboniste
Posts: 1634
Joined: Apr 11, 2018

by LeTromboniste » (edited 2025-12-20 3:37 p.m.)

[quote="ghmerrill"]One of the interesting features of these AIs is that they can be used to analyze their own behaviors -- again, much like humans. :lol:[/quote]

I don't think so, in this case. ChatGPT only appears to be analyzing its own behaviour, and is not actually doing it beyond a very superficial level. When Doug berated ChatGPT and told it it was wrong, and it apologized and "analyzed" its own work, what it was doing was actually exactly the same thing it was doing when it returned the wrong sources in the first place. It just did what it's programmed to do: it analyzed not itself, but the prompt that Doug gave it, and generated language that would sound credible in the context of the prompt it was given. Nothing less and nothing more than that. If ChatGPT analyzes and corrects its own behaviour, it can only be in terms of its ability to generate credible language, because that's its sole purpose, and not in terms of its ability to generate credible content.

There is AI whose purpose is to analyze other AI and, for example, flag if some content was AI-generated. That is still not AI that analyzes its own behaviour; it's AI analyzing other AI's behaviour, which it was specifically taught to do.

Edits: spelling and syntax
H
harrisonreed
Posts: 6479
Joined: Aug 17, 2018

by harrisonreed »

Yes, I think once the LLM is just *one* neural network in a whole ecosystem of interconnected neural networks (logic, optical, aural, moral, emotional), with a mechanism that causes these to fire back and forth and interact with each other upon an external input, and even to generate its own input to send through itself (having its own thoughts...), *then* it will be able to self-analyze.

I'm both excited and terrified for the day such a network of networks hands us solutions to 'impossible' problems. I want to see us ask something like that for the blueprints for a faster-than-light drive and have it actually produce blueprints that we can build but human minds can't actually comprehend.
K
Kbiggs
Posts: 1768
Joined: Mar 24, 2018

by Kbiggs »

There’s an acronym from the early days of computing that applies to this situation: GIGO. Garbage in, garbage out. A program—a report, a paper, a book—is only as good as the information and effort used to construct it.

It seems as though AI is, in many different ways, being touted as a solution to a problem. As far as I can see, though, it is only—and will remain only—a tool to help address problems. The way some people think it should be used seems to be something like, “Look, I bought all these yard tools! Here’s a shovel, a rake, and a hoe! Now the garden can take care of itself!”
G
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

There are a number of perspectives from which AI can be approached, and so people have different perceptions and understandings of it -- which can be based on their own direct knowledge and use of it, or what others say about it. Artificial intelligence is a very broad subject (much, much broader, and much more complex than it was, say, 20 or 30 years ago) and now includes significant advances in such areas as perception and motion (think of self-driving cars or those military robots and doggies :) ). And much of contemporary AI has had substantial influence from such areas as cognitive science and brain science. It's not just "the algorithm" or, more generally, algorithms. In fact quite a lot of it isn't algorithmic at all, and most people reading this sentence probably don't know what an algorithm really is anyway. Certainly the vast majority of news reporters haven't a clue.

I'm no expert in what I think of as contemporary AI -- which involves, among other things, large language models. I think that I grasp the fundamental approach and at least the major strengths and weaknesses -- but from outside the design/development/deployment/application community. And I do get feedback in various ways from people currently working on and with AI, and whom I stay in touch with. If you'll pardon some self-indulgence, here's my own current perspective on AI from a personal history perspective over about 30 years, and how I approach evaluating what I see nowadays in terms of AI theory and applications. Of course, just bail out on this if it's uninteresting or tedious. :roll: But you may find some of the sketch of AI history that's in it to be interesting (keeping in mind that the last time I actually stood in a classroom and taught a course in AI was about 45 years ago. :shock: ), although I've certainly given a number of presentations and talks on it since then.

But before you bail on me, let me suggest you take a look at "Getting from Generative AI to Trustworthy AI: What LLMs might learn from Cyc" ([url]https://arxiv.org/pdf/2308.04445) -- written by someone who was an old friend and colleague, and one of the major contributors (in both theory and practice) to AI into the 21st century, and with whom I also had some serious disagreements about AI in various respects. Some of the things that Doug Lenat says in this article (I'm not sure if it was ever published) are compatible with things expressed in this thread as well.

----------

I first got involved in AI in 1981 when I taught an upper-level undergraduate course using Patrick Winston’s book Artificial Intelligence. A year or so later I left my safe tenure at the university and devoted fifteen years (a small consulting firm, Bell Laboratories, and SAS Institute) to becoming a serious software engineer and formal language and compiler expert – few publications, but lots of work and applications. I got tired of that, and a colleague (a mathematician turned software engineer) suggested that I should take a look at Cycorp ([url]https://cyc.com/).

Cycorp (in Austin) didn’t at that time support any remote staff, but said a client of theirs was looking for someone in the RTP area where I was living. In 1997, after a two-month vetting process, I began a job at Glaxo Wellcome – first on the IT side in Research IT, and then on the science side in Exploratory Data Sciences, working primarily in the area of text mining and “intelligent information retrieval” in support of drug discovery and drug safety. During that period I was focusing in part on natural language systems for turning things like medical reports and doctors’ notes into usable data in an accurate way. The approach at that point was still using hand-written language analyzers such as Brill taggers and Abney (cascaded finite state automata) natural language parsers. The technology was advancing, but still cumbersome.

For three years I was heavily involved in working with Doug Lenat at Cycorp (GW was leasing the Cyc system, as was Pfizer) in attempting to apply Cyc to problems in drug discovery, bioinformatics and “intelligent search” engines for their scientists and librarians. Cyc is a “knowledge-based system” (KBS) with a heavy-duty inference engine – and then to apply its inferential capabilities you need to create a very large database in a particular domain (the result is often referred to as a “micro-theory”). This is very labor intensive, somewhat error-prone, and expensive both to develop and to test. One of the prototypes I did in the area of protein science was encouraging, but it was still pretty cumbersome and the company didn’t want to pursue it. Apparently the Pfizer project went the same way, although we definitely weren’t sharing information. (Aside: I think that one of the approaches that might be successful would be to use an LLM AI to generate a KBS (avoiding the human labor-intensive and error-prone approach) -- which might then serve in part as both an enhancement and a check on the LLM AI. I think that Doug was suggesting this approach as a "hybrid".)

In 2000, Glaxo Wellcome merged with SmithKline Beecham and I moved from Research IT (in the IT organization) to Data Exploration Sciences in the Research (scientific) organization within GSK – but only momentarily. At that point the European head of bioinformatics at GW moved to Novartis and I got an invitation to join the Novartis Artificial Intelligence group (Basel) as Associate Director of Artificial Intelligence. While I learned a lot more about language processing and AI from some colleagues in that group (who had left the Paris IBM Natural Language Processing group to join it), and completed a major intelligent information retrieval application based on AI/intelligent search technology, I didn’t care for the group’s management and after a year returned to GlaxoSmithKline in the Biomedical Data Sciences division – and created the Semantic Technologies Group.

Skip forward several years, through more work with various AI-related products and technologies based primarily on the use of formal ontologies, and I formed a small (2 other guys and me) group to develop an AI and large data analysis approach to early detection of adverse drug reactions. I called the project “SafetyWorks”. We ended up doing all our own IT infrastructure and support (funded by the Drug Safety organization), and the project was hugely successful. This was then picked up by the OMOP (Observational Medical Outcomes Partnership) industry/academic/government consortium, and the work our group had done was also given away (including the hundreds of thousands of lines of code) to an external company to “productize” and lease back to GSK. I wanted nothing to do with any of that, and retired in 2007. The (then) junior member of the team ultimately later moved to Johnson & Johnson and became the principal investigator for OMOP.

I did get a nice lucite award trophy that says “SAEfetyWorks Innovation Award, 2008” for “outstanding contributions to drug safety science” – misspelling the name of the project. I also was offered a job as a director of Epidemiology at Merck, but declined that for several reasons, decided to retire early, and didn’t look back. (Well, I did a couple more publications after that, but more of a foundational/theoretical nature.)

So that’s the background and perspective from which I approach an understanding and evaluation of AI.

I’m really out of the game now, but I do keep in touch, read drafts of papers when asked, and offer encouragement and mostly harmless advice (when asked). To really keep up with it would require a great deal of energy and effort that I’m not willing to devote. It’s really hard stuff, and I got tired of the really hard stuff. Plus, I'm really old and don't do the really hard stuff so well any more. At this point, AI is kind of like astrophysics. Anyone can grasp small chunks of it and speculate about other parts of it. But to really make sense of it takes a great deal of study and effort, and short of that you have to depend on other people's dumbed down reports, explanations, and analyses. And figuring out which of those are in fact dependable is just another hard problem. :|
J
jonathanharker
Posts: 139
Joined: Aug 14, 2022

by jonathanharker » (edited 2026-01-14 8:22 a.m.)

I know it's far too late now, but if we all could just stop calling this stuff "AI" and call it "highly attuned auto-predict" or less charitably "stochastic parrot" or perhaps "textual diarrhœa" instead, then we'd have a much better handle on what LLMs actually are. All they do is spit out sequences of words that they have determined are the most likely ones to follow your prompt, given the training data.[1] The "trick" is simply in the sheer volume of training data, available only to companies who have already indexed the entire corpus of text from the internet, scanned books, JSTOR, etc., seemingly without being particularly bothered about permission or copyright status, but that's a whole other trash fire.

Notes:

1. Generating authoritative, plausible-sounding sentences in response to a query, regardless of truth, is the very definition of bullshit. If anyone is in any doubt, read Harry Frankfurt's On Bullshit (2005) to clear up definitions. The world has enough bullshit as it is, without getting computers to generate even more of it.
G
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

Frankfurt's article on bullshit is indeed a true classic. An important part of it regards arguments from ignorance, in which he observes that

[quote]Bullshit is unavoidable whenever circumstances require someone to talk without knowing what he is talking about. Thus the production of bullshit is stimulated whenever a person’s obligations or opportunities to speak about some topic are more excessive than his knowledge of the facts that are relevant to that topic.[/quote]
H
harrisonreed
Posts: 6479
Joined: Aug 17, 2018

by harrisonreed »

I do think that AGI, artificial general intelligence, will likely exist within the next ten years or so, albeit likely only available to nation state level organizations, military, and research institutions. That will be when Pandora's box is opened.
G
ghmerrill
Posts: 2193
Joined: Apr 02, 2018

by ghmerrill »

[quote="harrisonreed"]That will be when Pandora's box is opened.[/quote]
I think that Pandora's box is already opened to some degree -- and several high-visibility innovators and developers in the field have been explicit (for years) about potential consequences. The trick is to continue development of the increasing array of substantial benefits we're already seeing -- though some of these remain virtually unknown to most people (who are personally familiar only with the "chatbot" sorts of applications that now abound) -- while imposing constraints that at least inhibit the Pandora's box scenario. The recent shock and surprise has come from the speed with which the technical capabilities have leapt forward while there was no parallel program of thought on "ring fencing" the technology and applications against the Pandora scenarios.

It's interesting to see two types of fearful (or perhaps "anxious" is a better term for it) responses to the recent proliferation of AI applications and the use of AI from simple chatbots, to autonomous vehicles, to diagnosis in medicine, to actual scientific discovery in some fields, to self-aware weapons systems that can set their own goals and form their own plans.

The first is the fear of being overwhelmed by malevolent or self-aware/self-serving/self-protective AIs with no "moral sense" -- which makes sense only if you concede the genuine intelligence and capabilities of such systems.

The second is the view that AIs aren't really intelligent or capable of "thought" or "goal-oriented" behavior at all -- that they only summarize what is made available to them, depending on how they're "trained", and have no ability of "original" thought and no "genuine" intelligence -- and, a frequent addendum to this (either explicitly stated or strongly implied), that they never can achieve such a state.

If I have to make a choice between these, I'm inclined towards the first. But there are nuances to be had in stating those problems and looking for solutions to them.
J
jonathanharker
Posts: 139
Joined: Aug 14, 2022

by jonathanharker »

[quote="ghmerrill"]Frankfurt's article on bullshit is indeed a true classic. An important part of it regards arguments from ignorance, in which he observes that
[quote]Bullshit is unavoidable whenever circumstances require someone to talk without knowing what he is talking about. Thus the production of bullshit is stimulated whenever a person’s obligations or opportunities to speak about some topic are more excessive than his knowledge of the facts that are relevant to that topic.[/quote]
[/quote]

... I'll get my coat :)