<?xml version="1.0" encoding="UTF-8" ?><!-- generator=Zoho Sites --><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><atom:link href="https://www.nownextlater.ai/Insights/tag/summarization/feed" rel="self" type="application/rss+xml"/><title>Now Next Later AI - Blog #summarization</title><description>Now Next Later AI - Blog #summarization</description><link>https://www.nownextlater.ai/Insights/tag/summarization</link><lastBuildDate>Wed, 26 Nov 2025 21:23:46 +1100</lastBuildDate><generator>http://zoho.com/sites/</generator><item><title><![CDATA[Optimizing AI Summarization with Smart Data Selection]]></title><link>https://www.nownextlater.ai/Insights/post/optimizing-ai-summarization-with-smart-data-selection</link><description><![CDATA[Summarization is a key capability for many Generative AI systems. New research explores how better selection of training data can enhance the relevance and accuracy of summaries.]]></description><content:encoded><![CDATA[<div class="zpcontent-container blogpost-container "><div data-element-id="elm_6A4DOO9sRBueZ4e8ABBw4A" data-element-type="section" class="zpsection "><style type="text/css"></style><div class="zpcontainer-fluid zpcontainer"><div data-element-id="elm_VBX7_EqSSLCCvLhgXG2I2g" data-element-type="row" class="zprow zprow-container zpalign-items- zpjustify-content- " data-equal-column=""><style type="text/css"></style><div data-element-id="elm_NKlpYzYdSeWnhI5zyJG1Xw" data-element-type="column" class="zpelem-col zpcol-12 zpcol-md-12 zpcol-sm-12 zpalign-self- "><style type="text/css"> [data-element-id="elm_NKlpYzYdSeWnhI5zyJG1Xw"].zpelem-col{ border-radius:1px; } </style><div data-element-id="elm_Z3xlqexMThGa8ksZsFJKMA" data-element-type="text" class="zpelement zpelem-text "><style> [data-element-id="elm_Z3xlqexMThGa8ksZsFJKMA"].zpelem-text { border-radius:1px; } </style><div class="zptext zptext-align-center " data-editor="true"><div 
style="color:inherit;text-align:left;"><p>Summarization is a key capability for many AI systems, from meeting note takers to research assistants. However, these AI models often produce low-quality or inaccurate summaries. New research from Columbia University and Microsoft explores how better selection of training data can enhance the relevance and accuracy of summaries.</p><p><br></p><p>Key findings:</p><ul><li>Training on human reference summaries alone is not enough: AI models need to learn from their own outputs too.</li><li>Adding a second training stage that uses the model's own generated summaries leads to major gains.</li><li>The key is picking the right model outputs to include in this extra training data.</li><li><span style="color:inherit;">For improving relevance, </span><span><span style="font-size:14px;"><span style="color:inherit;"><span style="font-weight:400;text-indent:0px;">the model was taught to produce summaries similar to those it could already generate, then nudged toward greater relevance as measured by automatic<span>&nbsp;</span>metrics.</span></span></span></span></li><li><div style="color:inherit;"><p><span style="color:inherit;">For improving accuracy, the most effective training data pairs model-generated summaries that contain factual errors with human-rewritten versions that accurately convey the original text. This exposes the model to its own mistakes and shows it how to correct them.</span></p></div></li><li>These trends hold across scientific domains such as biology, chemistry and medicine.</li></ul><p><br></p><p>Why it matters:</p><p><br></p><p>AI-generated text risks being irrelevant, repetitive or just plain wrong. Smarter training data selection could make AI assistants more useful.</p><p>The findings provide a blueprint for building better training sets. The approach may extend beyond summarization to other AI systems that generate text.</p><p><br></p><p>However, current automatic evaluation methods are limited. 
More research is needed into quality metrics that closely match human judgments.</p><p><br></p><p>This work contributes to broader efforts to align AI outputs with human preferences, which will be crucial as text generation spreads to areas like science, medicine and engineering.</p><p><br></p><p>In summary, strategic selection of training examples is emerging as a powerful technique for improving AI language systems. But human evaluation remains essential to guide research and measure true progress.</p><p><br></p><p>Sources:</p><p><a href="https://arxiv.org/pdf/2305.07615.pdf" title="arxiv.org" rel="">arxiv.org</a><br></p><p></p></div><p></p></div>
</div><div data-element-id="elm_WVixqQRiQq2d3ZvqN_gAHA" data-element-type="image" class="zpelement zpelem-image "><style> @media (min-width: 992px) { [data-element-id="elm_WVixqQRiQq2d3ZvqN_gAHA"] .zpimage-container figure img { width: 500px ; height: 500.00px ; } } @media (max-width: 991px) and (min-width: 768px) { [data-element-id="elm_WVixqQRiQq2d3ZvqN_gAHA"] .zpimage-container figure img { width:500px ; height:500.00px ; } } @media (max-width: 767px) { [data-element-id="elm_WVixqQRiQq2d3ZvqN_gAHA"] .zpimage-container figure img { width:500px ; height:500.00px ; } } [data-element-id="elm_WVixqQRiQq2d3ZvqN_gAHA"].zpelem-image { border-radius:1px; } </style><div data-caption-color="" data-size-tablet="" data-size-mobile="" data-align="center" data-tablet-image-separate="false" data-mobile-image-separate="false" class="zpimage-container zpimage-align-center zpimage-size-medium zpimage-tablet-fallback-medium zpimage-mobile-fallback-medium "><figure role="none" class="zpimage-data-ref"><a class="zpimage-anchor" href="/introduction-to-large-language-models-for-business-leaders-book" target="" rel=""><picture><img class="zpimage zpimage-style-none zpimage-space-none " src="/12.png" width="500" height="500.00" loading="lazy" size="medium" alt="Introduction to LLMs for Business "/></picture></a></figure></div>
</div></div></div></div></div></div> ]]></content:encoded><pubDate>Thu, 10 Aug 2023 07:53:06 +1000</pubDate></item></channel></rss>