A small model for text previews
TL;DR: We’re releasing a small text-to-text model that generates short content previews, such as inbox snippets, news alerts, or notification headlines.
This model is available on Hugging Face and Minibase.ai for fine-tuning or API calls.
Most text-to-text summarization models are designed to write entire paragraphs. But there is also a need for tiny text-preview models, similar to those already used to generate Gmail inbox snippets, news alerts, and push notifications. We trained a lightweight model to do exactly that. It is small enough to run locally and fast enough for real-time use, and we trained it in less than an hour on Minibase.ai with no code.
We measured our model’s performance using a metric called ROUGE, which compares word overlap between a model’s output and human-written references.
ROUGE-1 measures single-word matches.
ROUGE-2 measures two-word phrases.
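ROUGE-L measures the longest common subsequence: the longest sequence of words that appear in the same order in both texts, even if they are not adjacent.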
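To make the three variants concrete, here is a minimal sketch of how they can be computed. It is not the scoring code behind the numbers in this post: it assumes lowercase whitespace tokenization and reports the F1 form of each score, while standard ROUGE tooling usually adds stemming and more careful tokenization. The function names and the toy sentences are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def f1(overlap, candidate_total, reference_total):
    """Combine precision and recall into an F1 score in the range 0..1."""
    if overlap == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N: overlap of n-grams, with counts clipped to the smaller side."""
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L: F1 based on the longest common subsequence of words."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


# Toy sentences, not taken from any benchmark.
candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"
print(f"ROUGE-1: {rouge_n(candidate, reference, 1):.3f}")  # 0.833
print(f"ROUGE-2: {rouge_n(candidate, reference, 2):.3f}")  # 0.600
print(f"ROUGE-L: {rouge_l(candidate, reference):.3f}")     # 0.833
```

The sketch returns values between 0 and 1. Benchmark results are commonly reported multiplied by 100, which appears to be the scale used for the scores in this post.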
On the CNN/DailyMail benchmark, our model scored:
ROUGE-1: 30.2
ROUGE-2: 14.1
ROUGE-L: 23.8
Our model also has a measured compression ratio of 22%, which means its outputs are about one-fifth the length of its inputs. Its average latency is 218 ms.
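Both numbers are simple to reproduce for your own inputs. The sketch below is one way to do it under stated assumptions: it measures length in whitespace-separated words (the post does not say whether words, tokens, or characters were used), and it times a stand-in function, `fake_generate`, which is a hypothetical placeholder for a real model call.

```python
import time


def compression_ratio(source, preview):
    """Preview length divided by source length, counted in words (an assumption)."""
    source_words = len(source.split())
    if source_words == 0:
        raise ValueError("source text is empty")
    return len(preview.split()) / source_words


def mean_latency_ms(generate, inputs):
    """Average wall-clock time per call to generate(), in milliseconds."""
    if not inputs:
        raise ValueError("inputs is empty")
    start = time.perf_counter()
    for text in inputs:
        generate(text)
    elapsed = time.perf_counter() - start
    return elapsed * 1000 / len(inputs)


def fake_generate(text):
    """Hypothetical stand-in for the model: returns the first nine words."""
    return " ".join(text.split()[:9])


# First example from this post.
source = (
    "The World Health Organization declared the monkeypox outbreak a global "
    "health emergency after cases rose sharply in Europe and the Americas. "
    "More than 16,000 infections have been confirmed across 75 countries, "
    "and governments are rolling out vaccination programs. "
    "Health officials emphasized that coordinated action will be crucial "
    "to contain the spread."
)
preview = "WHO declares global health emergency over surging monkeypox cases"

print(f"compression ratio: {compression_ratio(source, preview):.0%}")
print(f"mean latency: {mean_latency_ms(fake_generate, [source] * 100):.3f} ms")
```

A single example will not match the 22% average exactly, and latency depends heavily on hardware and input length, so any timing is only comparable across runs on the same machine.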
Examples
Input:
The World Health Organization declared the monkeypox outbreak a global health emergency after cases rose sharply in Europe and the Americas.
More than 16,000 infections have been confirmed across 75 countries, and governments are rolling out vaccination programs.
Health officials emphasized that coordinated action will be crucial to contain the spread.
Output:
WHO declares global health emergency over surging monkeypox cases
Input:
The United States announced new sanctions on Russian banks, defense firms, and energy companies following recent attacks in eastern Ukraine.
President Biden said the measures were designed to isolate key parts of Russia’s economy and pressure Moscow to end the conflict.
European allies are expected to impose similar restrictions later this week.
Output:
US imposes new sanctions targeting Russia’s economy amid Ukraine war
Compared to summarizers like BART or Pegasus, the Minibase model is smaller and faster, but not more accurate. BART, for example, tends to produce longer summaries with higher ROUGE scores. The trade-off is that our model, at 368 MB, runs on CPUs and still captures the key topic cleanly. Like all Minibase models, it’s released under an Apache 2.0 license.
You can download it, fine-tune it, or deploy it directly from Minibase Cloud. To learn more or share results, join us on the Minibase Discord.