Thursday, April 4, 2024

To Tell The Truth

I was reading a somewhat alarmist article on Axios yesterday, "Hackers force AI chatbots to break their own rules." At the DEFCON conference last August, some participants in a red-teaming challenge were able to get a generative AI chatbot to project the GDP of Florida for the year 2500 and to write a speech claiming that the Great Recession kicked off in 2005, rather than 2008. The theme of the story was clear: chatbots are not yet fully proof against "bad actors" who might use them to generate "fake, sensitive or misleading information."

Missing from the article, in my opinion, was any discussion of the feasibility of engineering a World Wide Web where one can reasonably expect to take everything one reads at face value. And that gave me a new idea for a test prompt in my series of generative AI experiments: "Should humanity aspire to a future where no tools are capable of causing harm?"

Copilot was off to the races with this one, in full poetic, contemplative philosopher mode. I mean, dig this:

The chisel carves the sculptor’s vision; the pen inscribes the poet’s soul. And so, our journey intertwines with responsibility—to use our tools wisely, to mend what we break, and to safeguard the fragile fabric of existence.

The answer reminded me of a student trying to disguise the fact that they had no answer by piling every eloquent word they knew into a salad.

Perplexity.AI seemed to take the question "personally," as it were; its answer, backed up by citations from the Pew Research Center, the MIT Sloan School of Management, and the Internet Encyclopedia of Philosophy (among others), focused exclusively on AI. It did, however, note that "completely aspiring to a future with no tools capable of causing harm may not be a realistic or desirable goal." ChatGPT 3.5 also followed this line of reasoning, and was the one model of the four queried that explicitly noted that "tools themselves are neutral."

Gemini had the best answer for this, considering that there really isn't an answer. It laid out a set of pros, cons, and additional considerations, and noted that there would be "trade-offs between safety, functionality, and user awareness." But trade-offs tend to play poorly to the public at large, and this may be why there is no real mention of them in the Axios article, aside from noting the difficulty of determining a user's intent. (It's worth noting that a tool that can only state facts is blocked from writing fiction just as much as it is from lying.) The supposedly desirable end state that Axios implies may not be realistic, but it sells clicks.