“Modeling the Lifespan of Discourse Entities with Application to Coreference Resolution” by Marie-Catherine de Marneffe, Marta Recasens and Christopher Potts

“A discourse typically involves numerous entities, but few are mentioned more than once. Distinguishing those that die out after just one mention (singleton) from those that lead longer lives (coreferent) would dramatically simplify the hypothesis space for coreference resolution models, leading to increased performance. To realize these gains, we build a classifier for predicting the singleton/coreferent distinction…”

“Identifying Aspects for Web-Search Queries” by F. Wu, J. Madhavan and A. Halevy

“Many web-search queries serve as the beginning of an exploration of an unknown space of information, rather than looking for a specific web page. To answer such queries effectively, the search engine should attempt to organize the space of relevant information in a way that facilitates exploration.

We describe the Aspector system that computes aspects for a given query. Each aspect is a set of search queries that together represent a distinct information need relevant to the original search query. To serve as an effective means to explore the space, Aspector computes aspects that are orthogonal to each other and to have high combined coverage…”

“Automating the Data Scientists” by Tom Simonite

Software could automate some of the work performed by data scientists, making sophisticated data skills more widely available.

Computers run complex mathematical operations on large collections of data, and selling data analysis software is a growing business. But human creativity and expertise are still needed to choose and deploy the methods that can explain the patterns in a data set.

The automatic statistician is one of a handful of tools being built to automate some of that expertise.

“Efficient Planning under Uncertainty with Macro-actions” by R. He, E. Brunskill and N. Roy

“Deciding how to act in partially observable environments remains an active area of research. Identifying good sequences of decisions is particularly challenging when good control performance requires planning multiple steps into the future in domains with many states. Towards addressing this challenge, we present an online, forward-search algorithm called the Posterior Belief Distribution (PBD). PBD leverages a novel method for calculating the posterior distribution over beliefs that result after a sequence of actions is taken, given the set of observation sequences that could be received during this process. This method allows us to efficiently evaluate the expected reward of a sequence of primitive actions, which we refer to as macro-actions…”

“Multiagent Learning in Large Anonymous Games” by I. A. Kash, E. J. Friedman and J. Y. Halpern (2011)

“In large systems, it is important for agents to learn to act effectively, but sophisticated multi-agent learning algorithms generally do not scale. An alternative approach is to find restricted classes of games where simple, efficient algorithms converge. It is shown that stage learning efficiently converges to Nash equilibria in large anonymous games if best-reply dynamics converge. Two features are identified that improve convergence. First, rather than making learning more difficult, more agents are actually beneficial in many settings. Second, providing agents with statistical information about the behavior of others can significantly reduce the number of observations needed.”
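
To make the idea in the abstract above concrete, here is a toy sketch of stage-based learning in an anonymous game. It is a simplified illustration under stated assumptions, not the authors' algorithm: the coordination payoff, the function names (`payoff`, `stage_learning`), and all parameter values are hypothetical. Each agent mostly plays its current action, occasionally explores, and at the end of each stage switches to the action with the best average reward it observed.

```python
import random


def payoff(action, fractions):
    # Anonymous game: the reward depends only on the agent's own action and
    # on the fraction of the population playing each action, not on who plays.
    # Assumption: a simple coordination payoff, chosen so that best-reply
    # dynamics converge (everyone moves to the majority action).
    return fractions[action]


def stage_learning(n_agents=200, n_actions=2, n_stages=30,
                   rounds_per_stage=20, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Assumed starting point: 60% of agents on action 0, the rest on action 1.
    policy = [0 if i < int(0.6 * n_agents) else 1 for i in range(n_agents)]

    for _ in range(n_stages):
        # Per-agent reward totals and counts, reset at the start of each stage.
        totals = [[0.0] * n_actions for _ in range(n_agents)]
        counts = [[0] * n_actions for _ in range(n_agents)]

        for _ in range(rounds_per_stage):
            # With probability epsilon an agent explores a uniformly random
            # action; otherwise it plays its current action.
            played = [
                rng.randrange(n_actions) if rng.random() < epsilon else policy[i]
                for i in range(n_agents)
            ]
            fractions = [played.count(a) / n_agents for a in range(n_actions)]
            for i, a in enumerate(played):
                totals[i][a] += payoff(a, fractions)
                counts[i][a] += 1

        # End of stage: each agent adopts the action with the highest average
        # observed reward. Actions it never tried are treated as unknown.
        for i in range(n_agents):
            estimates = [
                totals[i][a] / counts[i][a] if counts[i][a] else float("-inf")
                for a in range(n_actions)
            ]
            policy[i] = max(range(n_actions), key=lambda a: estimates[a])

    return policy


if __name__ == "__main__":
    final = stage_learning()
    print("agents on action 0:", final.count(0), "of", len(final))
```

In this toy game every agent only uses its own observations. Handing each agent the population fractions directly would remove the need to explore before estimating rewards, which is one way to read the abstract's remark about statistical information reducing the number of observations needed.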