
What sort of model performs best in the single document case? And in the multi document case? What are the most used/largest data sets? Is sentence extraction the best currently used approach, or are there others? What is the state of the art in targeted summarization, as well?

asked Jul 07 '10 at 10:16

Alexandre Passos ♦

edited Dec 03 '10 at 07:14


I'd be interested in learning about this as well. I hope people have finally moved beyond mere sentence extraction, which is such an embarrassing hack. How is pulling out incoherent sentences here and there a "summarization"?

(Jul 08 '10 at 00:08) Frank

It is "summarization" in the loose sense that, by reading the summary, you can get a generally ok idea of what the original text collection was talking about.

I can think of ways of going beyond sentence extraction that shouldn't be very hard (chunk extraction, for example, mixed with a language model to cut out really nonsensical sentences), but I'm not sure if they're worth it/make sense.

I really hope someone in the area answers this question.

(Jul 08 '10 at 06:59) Alexandre Passos ♦

I agree it gives the reader an ok idea about the document(s). But it's like calling a simple word-for-word dictionary lookup a "translation". It does convey some idea of what the input document is about, but it does so by just concatenating incoherent stuff and is far from solving the task in a satisfactory way.

(Jul 08 '10 at 08:51) Frank

While not current state of the art, there are some interesting works that go beyond sentence extraction. Jing and McKeown's Cut-and-Paste summarization comes to mind (see also Hongyan Jing's PhD thesis).

I wonder why people are not doing more of that.

(Jul 23 '10 at 01:22) yoavg

Text summarization is still missing from the ACL wiki, so if anyone wants to contribute there ...

http://www.aclweb.org/aclwiki/index.php?title=State_of_the_art

(Jul 28 '10 at 09:16) zeno

5 Answers:

ROUGE is based on n-gram overlap between human and automatic summaries. Since good summaries can use radically different wording, ROUGE has been (rightly) criticized as being pretty flawed. For example, Liu and Liu found poor correlation between ROUGE and human ratings of summarization quality.
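
To make the n-gram overlap concrete, here is a minimal sketch of ROUGE-N recall. It assumes pre-tokenized input and a single reference; the official ROUGE toolkit adds stemming, stopword options, and multiple references, so treat this as an illustration rather than the real scorer.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=2):
    """Fraction of reference n-grams that also appear in the candidate
    (clipped counts), i.e. ROUGE-N recall against a single reference."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    overlap = sum((cand & ref).values())
    total = sum(ref.values())
    return overlap / total if total else 0.0

# toy example
reference = "the cat sat on the mat".split()
candidate = "the cat lay on the mat".split()
print(rouge_n_recall(candidate, reference, n=2))  # 0.6
```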

That said, you're right that people still use ROUGE and compare against SOTA using it. The best place to look for current SOTA results is on the DUC data sets. (Haghighi and Vanderwende use DUC-2007 for evaluation.)

To address some of these problems, Nenkova and Passonneau developed Pyramid scoring. Pyramid uses the presence of Summary Content Units (SCUs) to evaluate summary quality. SCUs are subsentential clauses that carry some propositional content. One limitation of this method (and it's a big one) is that, to date, SCUs need to be manually annotated.

In summarization, the trend has turned from extractive summarization toward abstractive summarization (including paraphrase generation, sentence simplification, and natural language generation). Two major frustrations with extractive summarization were that 1) the summaries were constrained to sentences from the source documents, so they weren't especially good, and 2) it was really hard to beat top-N sentence baselines (for news documents, anyway). There's a nice paper by Carenini comparing abstractive to extractive approaches.

answered Jul 08 '10 at 07:29

Andrew Rosenberg

edited Jul 15 '10 at 16:54

ogrisel

An excellent place to go for the most current methods is the Text Analysis Conference (TAC) summarization track, which took over from the Document Understanding Conference. Papers from 2008 and 2009 are available at http://www.nist.gov/tac/publications/index.html and the 2010 conference will take place this fall.

While the Haghighi and Vanderwende paper mentioned above presents what seems to be a very powerful method, it, like almost every system presented at TAC, is still a sentence extraction algorithm. Many approaches add a simplistic form of sentence compression, where redundant information is removed or other material is condensed so that the main idea is preserved, but that is still sentence extraction at heart. Beyond that, almost every machine learning algorithm out there has been tried, from the simple to the complex, yet the simplest is still extremely powerful and has been in use since the 1950s: plain term frequency metrics.

I'd be very interested in hearing what other people are up to, as this is my graduate research area. I'm currently looking into approaches similar to H&V's, using topic models and other generative document models to learn the most salient sentences in a corpus.
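
For concreteness, here is roughly what that term-frequency baseline looks like. The regex tokenizer and the stopword list are toy stand-ins, and real systems add redundancy checks and length limits, so this is only a sketch.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "that", "for", "on", "it"}

def tf_extract(sentences, n_select=3):
    """Score each sentence by the average corpus frequency of its
    non-stopword tokens and return the top n in document order."""
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    tf = Counter(w for toks in tokenized for w in toks if w not in STOPWORDS)

    def score(toks):
        content = [w for w in toks if w not in STOPWORDS]
        return sum(tf[w] for w in content) / len(content) if content else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_select])]
```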

answered Jul 15 '10 at 20:34

Will Darling

Just to get the ball rolling, the state-of-the-art evaluation metric seems to be ROUGE. And as far as Bayesian models for multi-document summarization go, the state of the art seems to be Haghighi and Vanderwende's paper.

answered Jul 07 '10 at 12:23

Alexandre Passos ♦

edited Jul 07 '10 at 12:24

An interesting and rather straightforward way to select sentences based on semantics was proposed in this paper, which used latent semantic analysis (LSA) for summarization. There are also simple variants of this approach that can be applied to multi-document summarization and update summarization (see this paper).
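
A rough sketch of the basic LSA selection recipe (in the spirit of Gong and Liu's method, not necessarily the exact procedure in the papers linked above): build a term-by-sentence matrix, take its SVD, and pick one sentence per leading latent topic. Term weighting and the refinements from the papers are omitted.

```python
import re
import numpy as np

def lsa_extract(sentences, n_select=3):
    """Pick, for each of the top right singular vectors of the
    term-by-sentence count matrix, the sentence that loads most
    heavily on it (roughly one sentence per latent topic)."""
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}

    # term-by-sentence count matrix A (|V| x |S|)
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            A[index[w], j] += 1.0

    # rows of Vt give each latent topic's weight on every sentence
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    chosen = []
    for k in range(min(n_select, Vt.shape[0])):
        j = int(np.argmax(np.abs(Vt[k])))
        if j not in chosen:
            chosen.append(j)
    return [sentences[j] for j in sorted(chosen)]
```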

answered Aug 13 '10 at 20:14

spinxl39

edited Aug 13 '10 at 20:14


These papers can be seen as interesting "ideological parents" of the Haghighi & Vanderwende paper I linked in my answer. In the H&V paper, LSA is replaced by a more specialized version of LDA tuned to the summarization problem and by a content model (a sentence-topic HMM, as in Barzilay & Lee's "Catching the Drift..."), and the search for diversity in the summaries (which I think is one of the main contributions of the paper) was replaced by searching for the set of sentences that minimizes the KL divergence between the summary model parameters and the document collection model parameters. It's nice to see how most intuitions about summarization fall naturally out of this framework.
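
For intuition, the greedy KL-based selection step looks roughly like this over plain unigram distributions. The actual H&V model works with learned topic/content distributions and is more careful about smoothing and the direction of the KL term, so this is only a sketch.

```python
import math
import re
from collections import Counter

def unigram_dist(tokens, vocab, smoothing=1e-3):
    """Smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + smoothing * len(vocab)
    return {w: (counts[w] + smoothing) / total for w in vocab}

def kl(p, q):
    """KL(p || q) over a shared vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def klsum(sentences, max_words=100):
    """Greedily add the sentence whose inclusion makes the summary's
    unigram distribution closest (in KL) to the collection's."""
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    doc_tokens = [w for toks in tokenized for w in toks]
    vocab = set(doc_tokens)
    p_doc = unigram_dist(doc_tokens, vocab)

    summary, summary_tokens = [], []
    while sum(len(tokenized[i]) for i in summary) < max_words:
        best, best_kl = None, float("inf")
        for i, toks in enumerate(tokenized):
            if i in summary:
                continue
            cand = unigram_dist(summary_tokens + toks, vocab)
            d = kl(p_doc, cand)
            if d < best_kl:
                best, best_kl = i, d
        if best is None:
            break
        summary.append(best)
        summary_tokens += tokenized[best]
    return [sentences[i] for i in sorted(summary)]
```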

(Aug 13 '10 at 20:26) Alexandre Passos ♦

Yeah, indeed! :)

(Aug 13 '10 at 20:38) spinxl39

Perhaps not state of the art, but a simple and easy-to-understand approach that serves as a good starting point and a very reasonable baseline to compare against other techniques (and produces very reasonable results) is H.P. Luhn's approach in "The Automatic Creation of Literature Abstracts", dating back 50+ years. It's along the lines of what Will Darling mentioned in an earlier answer: you basically find important sentences with a simple form of feature detection that takes into account the distances between the most frequent words. You can find the original paper online, or check out an implementation of the algorithm presented in Mining the Social Web from its GitHub repository.
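
The scoring step looks something like the sketch below. The stopword list, the top-k cutoff, and the gap threshold are the usual textbook choices rather than Luhn's exact parameters.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "that", "it", "for"}

def luhn_scores(sentences, top_k=100, max_gap=4):
    """Luhn-style scoring: a sentence's score is the best
    (significant words)^2 / (cluster span) over clusters of significant
    words separated by at most max_gap insignificant words."""
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    freqs = Counter(w for toks in tokenized for w in toks if w not in STOPWORDS)
    significant = {w for w, _ in freqs.most_common(top_k)}

    scores = []
    for toks in tokenized:
        positions = [i for i, w in enumerate(toks) if w in significant]
        best = 0.0
        if positions:
            cluster = [positions[0]]
            for pos in positions[1:] + [None]:   # None flushes the last cluster
                if pos is not None and pos - cluster[-1] - 1 <= max_gap:
                    cluster.append(pos)
                else:
                    span = cluster[-1] - cluster[0] + 1
                    best = max(best, len(cluster) ** 2 / span)
                    if pos is not None:
                        cluster = [pos]
        scores.append(best)
    return scores

# A summary is then just the top-N scoring sentences, kept in document order.
```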

answered Feb 11 '11 at 18:37

ptwobrussell
