What LLMs Can & Can't Do // BrXnd Dispatch vol. 014
On ChatGPT, the internet, and its lack of connection to it.
You’re getting this email as a subscriber to the BrXnd Dispatch, a bi-weekly email at the intersection of brands and AI. On May 16th, we’re holding the BrXnd Marketing X AI Conference. Early bird tickets are 20% off until 4/1. If you’d like to learn more or sponsor the event, please be in touch.
I’m in Australia this week attending Canva’s big Create event, so I’m a bit all over the place, but I wanted to get a few thoughts shared and keep up the weekly cadence.
Let’s start with this:
Rafat is very smart and knows a lot about the internet and technology. He corrected this Tweet a bit after, acknowledging that ChatGPT was hallucinating, not reading the text of a page, but the fact that this is confusing for people as internet-savvy as Rafat is a big issue.
What’s happening here, as programmer Simon Willison excellently outlines in “ChatGPT can’t access the internet, even though it really looks like it can,” is that the model is simply generating a completion based on the context clues in the URL. To prove this, he offers a few fun prompts whose URLs look real but lead to no actual story. I took one, “Summarize this story: https://www.nytimes.com/2021/03/10/business/angry-fans-demand-nickelback-refunds.html”, and adapted it slightly to show the absurdity.
Prompt 1: Summarize this story: https://www.nytimes.com/2021/03/10/business/angry-german-fans-demand-nickelback-refunds.html
Title: Angry German Fans Demand Nickelback Refunds
Date: March 10, 2021
In Germany, Nickelback fans were left frustrated and angry after attending a concert where the band performed for only 30 minutes before walking off stage. The band claimed technical difficulties as the reason for the abrupt end, but many fans believed they were intoxicated. Disappointed concertgoers demanded refunds for their tickets, with some even accusing the band of fraud. This incident has prompted criticism of the band and their professionalism, further tarnishing their already divisive reputation.
Prompt 2: Summarize this story: https://www.nytimes.com/2021/03/10/business/angry-antarctica-fans-demand-nickelback-refunds.html
Title: Angry Antarctica Fans Demand Nickelback Refunds
In March 2021, fans in Antarctica demanded refunds for a concert featuring the Canadian rock band Nickelback. The band was scheduled to perform at McMurdo Station, the largest scientific outpost on the continent. However, due to an equipment malfunction, the concert was cut short after just three songs. Attendees were upset and demanded refunds for their tickets, arguing that they did not get their money's worth. The situation highlighted the challenges of organizing and executing live events in the harsh conditions of Antarctica.
I still can’t believe the epic Nickelback Antarctica show never went ahead.
In all seriousness, the problem here is a misunderstanding of how these things work. My suspicion is that this misunderstanding has been exacerbated by the growth of the chat interface and the rush to incorporate these things in search. People believe these models are databases where you ask questions and look up answers. But they’re not. The explanation I’ve been using for folks lately is that while these models are great at writing code, you should never ask them for the proper dosage of an over-the-counter drug. The latter has a single, FDA-approved answer. And while the model would almost certainly offer you the correct dosage, it’s just making it up. If you’re asking it about Tylenol, it’s almost definitely going to be accurate, and OpenAI, at least, has put some safeguards in place around health information. However, it’s still just probabilistically writing an answer.
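Going back to the Nickelback URLs for a second: it helps to see how much information is sitting in the link text itself. Here is a minimal Python sketch that pulls the "context clues" out of a URL without making any network request. The function name and the assumed path layout (/YYYY/MM/DD/section/slug.html) are mine, chosen to match the example URLs above. This is not how ChatGPT works internally; it only shows that a plausible headline, date, and section are all recoverable from the string alone, which is plenty for a model to write a convincing "summary" around.

```python
from urllib.parse import urlparse


def url_context_clues(url: str) -> dict:
    """Extract the human-readable hints contained in a URL string.

    No network request is made; everything comes from the text of the URL.
    Assumes a path shaped like /YYYY/MM/DD/section/slug.html (an assumption
    that matches the example URLs in this post, not a general rule).
    """
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    year, month, day, section, slug = parts
    # Drop the file extension, then split the slug into words.
    words = slug.rsplit(".", 1)[0].split("-")
    return {
        "site": parsed.netloc,
        "date": f"{year}-{month}-{day}",
        "section": section,
        "headline_guess": " ".join(w.capitalize() for w in words),
    }


clues = url_context_clues(
    "https://www.nytimes.com/2021/03/10/business/"
    "angry-antarctica-fans-demand-nickelback-refunds.html"
)
for key, value in clues.items():
    print(f"{key}: {value}")
```

The guessed headline comes out identical to the "Title" ChatGPT produced for the Antarctica prompt, and the date lines up with the "March 2021" in its summary. Everything else in that summary is filler the model wrote to fit those clues.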
Clearly, we will need some serious literacy around this stuff.
A big thank you to LinkedIn, Redscout, Nova, EZ Newswire, Horizon Big, Otherward, and Persistent Productions for sponsoring the upcoming BrXnd conference and my work. I have various sponsor levels available for the event if you want to support us. Be in touch (or reply to this newsletter), and I’m happy to send over the details.
New BrXndscape Companies
New companies listed on BrXndscape, a landscape of marketing AI companies (writeup in case you missed it). If I missed anything, feel free to reply or add a company. (The companies are hand-picked, but the descriptions are AI-generated—part of an automated pipeline that grabs pricing, features, and use cases from each company’s website and one of many experiments I’ve got running at the moment.)
[Synthetic Data] Hazy: Hazy is a proven partner in delivering innovative synthetic data technology in the most complex enterprise environments. It offers features such as reducing risk, streamlining transformation, and is built for the enterprise.
I think that’s about it for this week. Get your conference tickets! It’s shaping up to be great. I also have several sponsor spots available, and I would hugely appreciate you sharing the opportunity with folks who might be interested (reply to this email or get in touch, and I’ll share the details). I’m paying for this whole thing out of my pocket and am just trying not to lose any money on this first one.
Thanks, and join us on Discord!
I’ve noticed that they seem to have been changing the behavior of GPT3.5 and GPT4 so that they explicitly state they can’t access websites or APIs, vs GPT3.