Google has made its watermarking technology for AI-generated text, called SynthID Text, generally available through its updated Responsible Generative AI Toolkit and Hugging Face, a repository of open-source AI tools.
Developers can now use SynthID Text to determine whether text has come from their own large language models with the goal of making it easier to build AI responsibly, said Pushmeet Kohli, Google DeepMind's vice president of research.
SynthID detects AI text by observing a series of words. LLMs use tokens to process information and generate output. These tokens can be a single character, word, or phrase, and LLMs can predict which token will most likely follow another, one at a time.
The tool assigns each token a score based on its probability of appearing in the output for a prompt. It will also "embed imperceptible watermarks" directly into the text while it is being generated. To verify a piece of text, SynthID compares its pattern of scores against the patterns expected for watermarked and unwatermarked text, then determines whether an AI tool generated the text or whether it came from another source.
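To make the idea concrete, here is a minimal sketch of score-based watermarking in Python. It is not Google's algorithm: it swaps in a much simpler keyed-hash scheme, and the vocabulary, key, bias, threshold, and function names are all hypothetical. A toy "model" picks tokens at random; when watermarking is on, tokens that a secret key scores as 1 are made more likely, and the detector later checks whether the average score is higher than chance.

```python
import hashlib
import random

VOCAB = [f"w{i}" for i in range(50)]  # toy vocabulary (assumption)
KEY = "demo-key"                      # hypothetical secret watermarking key


def g_score(prev_token, token, key=KEY):
    """Keyed pseudo-random score (0 or 1) for a token given the previous token."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] & 1


def generate(length, rng, watermark, bias=4.0):
    """Sample tokens from a uniform toy model, optionally favoring score-1 tokens."""
    tokens = ["<s>"]
    for _ in range(length):
        weights = [
            bias if (watermark and g_score(tokens[-1], t)) else 1.0
            for t in VOCAB
        ]
        tokens.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return tokens[1:]


def mean_score(tokens):
    """Average keyed score over the text; about 0.5 is expected without a watermark."""
    prev, total = "<s>", 0
    for t in tokens:
        total += g_score(prev, t)
        prev = t
    return total / len(tokens)


def looks_watermarked(tokens, threshold=0.65):
    """Flag text whose average score is well above the unwatermarked baseline."""
    return mean_score(tokens) > threshold


rng = random.Random(0)
print("watermarked:", round(mean_score(generate(400, rng, watermark=True)), 2))
print("unmarked:   ", round(mean_score(generate(400, rng, watermark=False)), 2))
```

Anyone without the key sees ordinary-looking text, while the key holder can recompute the scores and spot the skew. A real system has to work with a model's actual token probabilities and keep output quality intact, which this sketch ignores.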
It does have limitations, though. The tech requires at least three sentences for detection, and its robustness and accuracy increase with longer text. It's also less effective on factual text and on AI-generated text that's been thoroughly rewritten or translated.
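Why length matters can be illustrated with a generic statistical sketch, again in Python and again not SynthID's actual scoring. Assume a hypothetical detector that counts how many tokens carry a "watermark-like" score, where unmarked text hits 50% of the time and watermarked text hits 60% of the time. The same 60% rate is weak evidence in a short passage and strong evidence in a long one, because the z-score grows with the square root of the token count.

```python
import math


def z_score(num_tokens, hit_rate, null_rate=0.5):
    """Standard deviations between the observed hit count and the unmarked expectation."""
    expected = null_rate * num_tokens
    std = math.sqrt(num_tokens * null_rate * (1 - null_rate))
    return (hit_rate * num_tokens - expected) / std


# Same hypothetical 60% hit rate, increasing text length.
for n in (25, 100, 400):
    print(n, "tokens -> z =", round(z_score(n, 0.6), 2))
```

Under these assumed rates, 25 tokens give a z-score of about 1, which is indistinguishable from chance, while 400 tokens give about 4, which is very unlikely to occur in unmarked text.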
"SynthID Text is not designed to directly stop motivated adversaries from causing harm," Google says. "However, it can make it harder to use AI-generated content for malicious purposes, and it can be combined with other approaches to give better coverage across content types and platforms."
SynthID Text is part of a larger family of tools Google has created to detect AI-generated outputs. Last year, the company released a similar tool to watermark AI images.
Google's AI-text detection tool comes at a time when AI-powered misinformation is on the rise, as are false-positive detections. About two-thirds of teachers reportedly use AI detection tools on student assignments and essays, and students who speak English as a second language have been victims of false detection.


