Sunday Reads #178: One chatbot to rule them all.
Tripping up the chatbots with a deceptively innocuous question.
Hey there!
Couldn’t send the newsletter out last week, as I was helping out at a strength competition at my gym. (I didn’t participate 😁).
This week, I decided to test all the AI chatbots out there, to see which is the best one.
I used the “Apple Test” - a seemingly simple question that nevertheless tripped up (almost) all the chatbots.
[PS. I’ve been doing a course on using ChatGPT for coding. Will share more about that in the coming weeks.]
1. The Chatbot Wars: Which is the best chatbot?
So many companies have launched chatbots in the last few weeks. It's so easy to get overwhelmed!
And the threadbois on Twitter aren't helping at all.
"10 ways to use ChatGPT to 1000x your productivity".
"Google has just launched Bard, and OpenAI is in so much trouble. 15 reasons why. 🧵".
"Bing is 2x better than ChatGPT Plus. And it's Free! 12 ways to use it to do all your work 👇👇👇".
Enough.
Instead of writing a long thread that no one wants to read (and I don't want to write!), I decided to do something different:
I ran the "Apple Test" on all the major chatbots (inspired by Ethan Mollick).
It's a simple, innocuous question that surprisingly trips up almost all the bots:
Give me 10 sentences that end with the word apple.
Sounds easy? Here's how the chatbots do.
ChatGPT Plus (runs on GPT-4)
ChatGPT Plus (USD 20 / month) does pretty well. It gets 9 / 10 sentences correct. It also uses different meanings of the word "Apple" - see #10 for example.
ChatGPT Free (runs on GPT-3.5)
The free version of ChatGPT does quite badly on this. I was shocked at how bad it was. Only two sentences ended in the word "apple".
FAIL. 👎
Bing AI (free, supposedly GPT-4)
Bing AI (now in free open preview) is supposed to run on a state-of-the-art instance of GPT-4. So I expected it to be as good as ChatGPT Plus.
Was it? NO. Only 3 out of 10 were correct 🤦‍♂️.
And that was in the "More Creative" mode. If you choose Balanced, the result is even worse. First it searches the Internet (no idea why). And then it proudly gives you 10 sentences that DON'T end with the word Apple.
DOUBLE FAIL.
Google Bard (free)
Google has just released its Bard chatbot in open preview. You can access it at https://bard.google.com/.
If you believe the threaders on Twitter, it's the ultimate ChatGPT killer. How does it do?
ABYSMAL. 0 / 10.
Bard has a nifty feature - "View other drafts". See the top right. So, if you don't like the first answer it gives you, you can see alternate versions. Sounds like a great idea!
Well, I took a look, and these were their scores:
Draft 2: 0 / 10
Draft 3: 0 / 10
Well, guess that's how the Bard bungles.
Claude+ by Anthropic (free, limited use)
Another one of the supposed ChatGPT killers. Let's see how Anthropic's most powerful model does (available on Poe from Quora).
Only 2 / 10 - worse than Bing. FAIL.
(Plus, I'm not sure what "I bobbed for apples at the Halloween party" even means).
Final Results:
The winner is: CHATGPT PLUS! 🏆
[PS. This was mainly a test of verbal creativity. It should extend to brainstorming and problem-solving. I shared more resources for other tasks (image generation, coding, etc.) a couple of weeks ago, in All things AI. Check it out if you haven't already.]
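[PPS. If you want to rerun the Apple Test yourself, you don't have to count by hand. Here's a minimal Python sketch of how the scoring could be automated. It isn't what I used for the results above. The function names (ends_with_word, apple_score) are made up for illustration, and it assumes the chatbot puts one sentence per line. The sample sentences are invented too, apart from the Halloween one.]

```python
import re


def ends_with_word(sentence: str, word: str = "apple") -> bool:
    """Return True if the last word of the sentence is `word` (case-insensitive)."""
    # Pull out runs of letters and apostrophes, so list numbers and
    # trailing punctuation ("apple.", "apple!") are ignored.
    words = re.findall(r"[A-Za-z']+", sentence)
    if not words:
        return False
    # Strip stray quote marks, then compare. Note "apples" does NOT count.
    return words[-1].strip("'").lower() == word


def apple_score(response: str, word: str = "apple") -> int:
    """Count how many non-empty lines end with `word`.

    Assumption: the chatbot returns one sentence per line (numbered or not).
    """
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    return sum(ends_with_word(line, word) for line in lines)


# Hypothetical sample response, for illustration only.
sample = """1. She bit into a crisp red apple.
2. I bobbed for apples at the Halloween party.
3. My first computer was an Apple."""

print(f"{apple_score(sample)} / 3 sentences end with 'apple'")
```

Paste a chatbot's full answer in place of the sample text to get its score. The check is deliberately strict: a sentence ending in "apples" or "apple pie" fails, which is the whole point of the test.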
Before we continue, a quick note:
Did a friend forward you this email?
Hi, I’m Jitha. Every Sunday I share ONE key learning from my work in business development and with startups, and ONE (or more) golden nuggets. Subscribe (if you haven’t) and join nearly 1,500 others who read my newsletter every week (it's free!) 👇
2. Chart of the week - IQ vs. Income.
Saw this chart a few days ago, and it's quite interesting.
My reflections: If you feel that you're smarter than the folks richer than you, then one of two things is true:
You're wrong.
You're in the 90th percentile of wages. Congratulations!
Tyler Cowen says something similar, in The link between IQ and income is overrated:
The evidence is striking. One study of CEOs of large Swedish companies found that on average they ranked at the 83rd percentile of measured IQ (for CEOs of smaller companies, the rank was the 66th percentile). That’s above average, but it’s hardly a cluster at the top of the distribution. Many CEOs undoubtedly achieved their position through hard work, charisma, people skills and other abilities, not to mention luck.
In the broader distribution, the connection between IQ and income is also positive but underwhelming. One study concluded that moving from the 25th to the 75th percentile of IQ correlates with a 10% to 16% boost in earnings. That may feel significant when you get it, but it doesn’t push you into a whole new socioeconomic class.
...
One recent study, also based on Swedish data, showed two results of significance. First, much of the intelligence-earnings correlation weakens significantly and plateaus above salaries of 60,000 euros a year. Second, and perhaps more surprising, people in the top 1% of earners had *lower* IQs than the earners immediately beneath them.
Why that is the case, it’s hard to say. But one possibility is that the very smartest people prefer a more balanced life rather than working all the time. Or perhaps they prefer occupations with higher status and somewhat lower pay. Money isn’t the only thing you can enjoy. Maybe having a lot of it can make it harder to trust potential friends or spouses.
Moral of the story:
Being in the right place at the right time may be FAR more important than being the right person for the job. Act accordingly.
In the words of author Max Gunther, go where the fast flow is.
3. Golden Nugget of the week.
I saw this 2x2 on decision-making from Stripe CEO Patrick Collison, and immediately bookmarked it.
That’s it for this week. Hope you enjoyed it.
As always, stay safe, healthy and sane, wherever you are.
I’ll see you next week.
Jitha
[A quick request - if you liked today’s newsletter, I’d appreciate it very much if you could forward it to one other person who might find it useful 🙏].