How ChatGPT Could Embed a ‘Watermark’ in the Text It Generates
When artificial intelligence software like ChatGPT writes, it considers many options for each word, taking into account the response it has written so far and the question being asked. It assigns each option a score that quantifies how likely that word is to come next, based on the vast amount of human-written text it has analyzed.
ChatGPT, which is built on what is known as a large language model, then chooses a word with a high score and moves on to the next one. The model’s output is often so sophisticated that it can seem as if the chatbot understands what it is saying, but it does not. Every choice it makes is determined by complex math and huge amounts of data, so much so that it often produces text that is both coherent and accurate. But when ChatGPT says something that is untrue, it does not realize it.
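Here is a toy sketch of that selection step, in Python. The candidate words and their scores are invented for illustration; a real model weighs tens of thousands of options at every step, not four.

```python
import random

def choose_next_word(candidates):
    """Pick the next word, favoring higher-scored options.

    random.choices samples in proportion to the weights, so a
    high-scoring word is chosen most often but not every time."""
    words = list(candidates)
    scores = [candidates[w] for w in words]
    return random.choices(words, weights=scores, k=1)[0]

# Hypothetical scores a model might assign after "The cat sat on the"
candidates = {"mat": 0.61, "floor": 0.21, "couch": 0.13, "moon": 0.05}
print(choose_next_word(candidates))
```

Because the sampling is weighted rather than fixed, the same prompt can yield different wordings from one run to the next.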
It may soon become common to encounter a tweet, essay or news article and wonder whether it was written by artificial intelligence software. There could be questions over the authorship of a given piece of writing, as in academic settings, or over the veracity of its content, in the case of a news article. There could also be questions about authenticity: If a misleading idea suddenly appears in posts across the internet, is it spreading organically, or have the posts been generated by A.I. to create the appearance of real traction?
Tools to identify whether a piece of text was written by A.I. have started to emerge in recent months, including one created by OpenAI, the company behind ChatGPT. That tool uses an A.I. model trained to spot differences between generated and human-written text. When OpenAI tested the tool, it correctly identified A.I. text in only about half of the generated writing samples it analyzed. The company said at the time that it had released the experimental detector “to get feedback on whether imperfect tools like this one are useful.”
Identifying generated text, experts say, is becoming increasingly difficult as software like ChatGPT continues to advance and turns out text that is more convincingly human. OpenAI is now experimenting with a technology that would insert special words into the text that ChatGPT generates, making it easier to detect later. The technique is known as watermarking.
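In the watermarking scheme described in the research paper mentioned below, the ‘special words’ are not a fixed list. At each step, the previous word is used to deterministically split the vocabulary into a ‘green’ half and a ‘red’ half, and the scores of green words are quietly boosted, so that green words appear more often than chance alone would produce. Here is a minimal Python sketch of the embedding side; the splitting rule, the scores and the boost factor are all invented for illustration, and none of this is OpenAI’s actual system.

```python
import hashlib
import random

def is_green(previous_word, word):
    """Deterministically mark roughly half of all words 'green',
    depending on the word that came before. A cryptographic hash
    makes the split look random but fully reproducible."""
    digest = hashlib.sha256(f"{previous_word}|{word}".encode()).hexdigest()
    return int(digest, 16) % 2 == 0

def watermarked_choice(previous_word, scores, boost=4.0):
    """Sample the next word after boosting the scores of green words.
    The boost factor of 4.0 is an invented illustration."""
    words = list(scores)
    weights = [scores[w] * (boost if is_green(previous_word, w) else 1.0)
               for w in words]
    return random.choices(words, weights=weights, k=1)[0]

# Hypothetical scores a model might assign after the word "the"
print(watermarked_choice("the", {"cat": 0.5, "dog": 0.3, "mat": 0.2}))
```

Because the boost merely tilts the odds, the text still reads naturally; the bias becomes detectable only when many words are tallied.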
The watermarking method that OpenAI is exploring is similar to one described in a recent paper by researchers at the University of Maryland and OpenAI.
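Detection then amounts to a statistical test. Anyone who knows the splitting rule can recount how many of a passage’s words fall on the green list for their position: human-written text should land there only about half the time, while watermarked text lands there far more often. Below is a sketch of such a detector, using the same splitting rule as above; the two-standard-deviation threshold is an assumption for illustration.

```python
import hashlib
import math

def is_green(previous_word, word):
    """The same deterministic split used when the text was generated."""
    digest = hashlib.sha256(f"{previous_word}|{word}".encode()).hexdigest()
    return int(digest, 16) % 2 == 0

def looks_watermarked(text, z_threshold=2.0):
    """Flag text whose green-word count is far above the roughly 50%
    expected by chance, measured in standard deviations (a z-score)."""
    words = text.split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return False
    hits = sum(1 for prev, cur in pairs if is_green(prev, cur))
    n = len(pairs)
    z = (hits - 0.5 * n) / math.sqrt(0.25 * n)
    return z > z_threshold
```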