Optimizing AI Summarization with Smart Data Selection

10.08.23 07:53 AM · By Ines Almeida

Summarization is a key capability for many AI systems, from meeting note takers to research assistants. However, the models behind these systems often produce low-quality or inaccurate summaries. New research from Columbia University and Microsoft explores how better selection of training data can improve the relevance and accuracy of summaries.


Key findings:

  • Training only on human reference summaries is not enough; AI models need to learn from their own outputs too.
  • Adding a second training stage using the model's own generated summaries leads to major gains.
  • The key is picking the right model outputs to include in this extra training data.
  • For improving relevance, the most effective extra training data consists of the model's own summaries that score highly on automatic relevance metrics. Fine-tuning on these nudges the model toward outputs it can already produce, but that clear a higher bar for relevance.
  • For improving accuracy, the most effective training data pairs model-generated summaries that contain factual errors with human-rewritten versions that accurately convey the original text. This exposes the model to its own mistakes and shows it how to correct them. (A sketch of both selection strategies follows this list.)
  • These trends hold across scientific topics like biology, chemistry, and medicine.
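The two selection strategies can be sketched in a few lines of code. The Python below is a minimal illustration under stated assumptions, not the authors' actual pipeline: the function name build_second_stage_data, the threshold value, and the generate, relevance, is_factual, and correct callables are all hypothetical stand-ins supplied by the caller (for example, a ROUGE-style score for relevance and human annotators for the rewrites).

```python
from typing import Callable, List, Tuple

def build_second_stage_data(
    documents: List[str],
    generate: Callable[[str], List[str]],    # model's candidate summaries for a document
    relevance: Callable[[str, str], float],  # automatic relevance metric (assumed)
    is_factual: Callable[[str, str], bool],  # factual-consistency check (assumed)
    correct: Callable[[str, str], str],      # human rewrite of a flawed summary (assumed)
    threshold: float = 0.8,                  # illustrative cutoff, not from the paper
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str, str]]]:
    """Assemble second-stage training data from the model's own outputs.

    Returns (relevance_examples, correction_examples):
      - relevance_examples: (document, high-scoring self-generated summary)
      - correction_examples: (document, flawed summary, human rewrite)
    """
    relevance_examples = []
    correction_examples = []
    for doc in documents:
        for summary in generate(doc):
            if relevance(doc, summary) >= threshold:
                # Strategy 1: keep self-generated summaries that clear a
                # high bar on the automatic metric, so fine-tuning favors
                # the model's most relevant outputs.
                relevance_examples.append((doc, summary))
            elif not is_factual(doc, summary):
                # Strategy 2: pair a factually flawed output with a
                # human-rewritten version, exposing the model to its own
                # mistakes alongside the correction.
                correction_examples.append((doc, summary, correct(doc, summary)))
    return relevance_examples, correction_examples
```

Keeping the generation and scoring functions as parameters reflects the paper's broader point: the quality of the resulting training set hinges on how well the automatic metrics track human judgment.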


Why it matters:


AI-generated text risks being irrelevant, repetitive or just plain wrong. Smarter training data selection could make AI assistants more useful.

The findings provide a blueprint for building better training sets, one that can extend beyond summarization to any AI system that generates text.


However, current automatic evaluation methods are limited. More research is needed into quality metrics closely matched to human judgments.


This work contributes to larger efforts to align AI outputs with human preferences, which will be crucial as text generation spreads to areas like science, medicine, and engineering.


In summary, strategic selection of training examples is emerging as a powerful technique for improving AI language systems. But human evaluation remains essential to guide research and measure true progress.


Sources:

arxiv.org
