# ICML / Peer Review 2013

Open for public discussion

ICML / Peer Review 2013 operates on an open reviewing model. Please feel free to engage in public discussion of the submitted papers during and after the review period.

#### Accepted for Oral Presentation

Fairness in Assignment Markets with Dual Decomposition
Bert Huang
07 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
In this work, we present a market design for assignment problems that computes a globally optimal solution by adjusting incentives. Such markets can help in settings such as the assignment of peer-reviewers to submitted academic articles, assignment of tutors to students, or online matchmaking services. In these settings, each assignment has some reward value, and existing strategies for achieving high global reward involve either adjustments to greedy choices by agents or global optimization of estimated reward values. The benefit of maintaining a market is that we combine benefits from these methods in a principled way. The agents make incentivized greedy decisions, which is ideal because they understand their reward functions best, while the incentives push their decisions toward the global optimum. We update the incentives by relating the assignment market to the standard dual of the ($b$-) matching linear programming relaxation. We evaluate our proposed system on simulations, demonstrating that the market quickly improves the global reward.
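The price-adjustment idea in this abstract can be illustrated with a minimal sketch. It is not the author's algorithm: it assumes a plain subgradient update on item prices (the dual variables of the matching relaxation), a fixed step size, and a hypothetical `run_market` helper with toy reward values chosen for illustration.

```python
def run_market(rewards, capacity, step=0.5, rounds=20):
    """Toy incentive market (illustrative assumption, not the paper's method).

    rewards[i][j] is agent i's reward for item j; capacity[j] is how many
    agents item j can accept. Prices play the role of dual variables.
    """
    n_items = len(rewards[0])
    prices = [0.0] * n_items
    choices = []
    for _ in range(rounds):
        # Each agent greedily picks the item with the best reward minus price.
        choices = [
            max(range(n_items), key=lambda j: row[j] - prices[j])
            for row in rewards
        ]
        # Count how many agents asked for each item.
        demand = [choices.count(j) for j in range(n_items)]
        # Subgradient step: raise prices of over-demanded items, lower
        # prices of under-demanded ones, and keep prices non-negative.
        prices = [
            max(0.0, p + step * (d - c))
            for p, d, c in zip(prices, demand, capacity)
        ]
    return choices, prices


# Both agents prefer item 0 at zero prices, but the best global
# assignment gives item 0 to agent 0 and item 1 to agent 1.
rewards = [[5, 1], [4, 3]]
choices, prices = run_market(rewards, capacity=[1, 1])
total = sum(rewards[i][j] for i, j in enumerate(choices))
```

In this toy instance the price of item 0 rises until agent 1 switches to item 1, after which demand matches capacity and the prices stop changing. A fixed step size is the simplest choice; it is not guaranteed to settle on every instance.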

The Toronto Paper Matching System: An automated paper-reviewer assignment system
Laurent Charlin, Richard S. Zemel
14 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
One of the most important tasks of conference organizers is the assignment of papers to reviewers. Reviewers' assessments of papers are a crucial step in determining the conference program and, in a certain sense, in shaping the direction of a field. However, this is not a simple task: large conferences typically have to assign hundreds of papers to hundreds of reviewers, and time constraints make the task impossible for one person to accomplish. Furthermore, other constraints, such as reviewer load, have to be taken into account, preventing the process from being completely distributed. We built the first version of a system to suggest reviewer assignments for the NIPS 2010 conference, followed, in 2012, by a release that better integrated our system with Microsoft's popular Conference Management Toolkit (CMT). Since then our system has been widely adopted by the leading conferences in both the machine learning and computer vision communities. This paper provides an overview of the system, a summary of learning models and methods of evaluation that we have been using, as well as some of the recent progress and open issues.

Taking Advantage of Out-of-Corpus Information for Citation Network Clustering
Steven H Lee, Taesun Moon, Hal Daume III
07 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
In this paper we explore the use of several popular clustering and graph partitioning algorithms as a method of generating clusters of related scientific documents and suggest a simple graph augmentation technique for taking advantage of external information. We show that by hallucinating nodes for scientific documents that are cited but not present in the original data set, we can improve performance of clustering algorithms.
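A minimal sketch of the graph augmentation described above, assuming the corpus is given as a mapping from each in-corpus document to the identifiers it cites. The `augment_citation_graph` helper and the document identifiers are hypothetical, and the clustering step itself is left out.

```python
def augment_citation_graph(citations):
    """Build an undirected citation graph that also contains nodes for
    documents that are cited but absent from the corpus.

    citations: dict mapping each in-corpus document id to the list of
    ids it cites (hypothetical input format).
    """
    in_corpus = set(citations)
    adjacency = {doc: set() for doc in in_corpus}
    hallucinated = set()
    for doc, cited in citations.items():
        for target in cited:
            if target not in in_corpus:
                # Cited document is missing from the data set: add a
                # placeholder node so shared citations become visible.
                hallucinated.add(target)
                adjacency.setdefault(target, set())
            adjacency[doc].add(target)
            adjacency[target].add(doc)
    return adjacency, hallucinated


# "A" and "B" never cite each other, but both cite the external "X".
citations = {"A": ["X"], "B": ["X"], "C": []}
adjacency, hallucinated = augment_citation_graph(citations)
```

Without the placeholder node, "A" and "B" would be isolated; with it, they are two hops apart, which gives a clustering or partitioning algorithm a signal that they are related.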

EJMS (Electronic Journal Management System)
Valdas Dičiūnas, Miroslav Šeibak, Vidas Daudaravičius, Valentinas Kriaučiukas
07 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
EJMS is a web-based manuscript submission and peer-review system for scholarly societies and institutions. Its advantages are concise menu windows and a preliminary manuscript evaluation function. The latter allows a journal editor to ask editorial board members or referees for their first-glance opinion on a submission. EJMS may be integrated into a large package of vendor services that help authors improve the quality of their manuscripts and offer publishers an innovative production workflow.

The ACL Anthology Network Corpus as a Resource for NLP-based Bibliometrics
16 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
We introduce the ACL Anthology Network (AAN), a comprehensive, manually curated networked database of citations, collaborations, and summaries in the field of Computational Linguistics. In addition, we present a number of studies and applications that use AAN. We also present several ideas for how these applications can be applied to aid peer reviewing processes.

The Benefits of Double-Blind Review
Hanna Wallach
21 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
In this talk, I will discuss some of the reasons why moving to completely open peer review systems, abolishing double-blind reviewing procedures, may not be advantageous for women, minorities, and researchers at less prestigious institutions. Specifically, I will provide a high-level overview of recent research on double-blind reviewing, implicit bias, and stereotype threat.

Open Scholarship and Peer Review: a Time for Experimentation
David Soergel, Adam Saunders, Andrew McCallum
13 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
Across a wide range of scientific communities, there is growing interest in accelerating and improving the progress of scholarship by making the peer review process more open. Multiple new publication venues and services are arising, especially in the life sciences, but each represents a single point in the multi-dimensional landscape of paper and review access for authors, reviewers and readers. In this paper, we introduce a vocabulary for describing the landscape of choices regarding open access, formal peer review, and public commentary. We argue that the opportunities and pitfalls of open peer review warrant experimentation in these dimensions, and discuss desiderata of a flexible system. We close by describing OpenReview.net, our web-based system in which a small set of flexible primitives support a wide variety of peer review choices, and which provided the reviewing infrastructure for the 2013 International Conference on Learning Representations. We intend this software to enable trials of different policies, in order to help scientific communities explore open scholarship while addressing legitimate concerns regarding confidentiality, attribution, and bias.

Large-scale author coreference via hierarchical entity representations
Michael L Wick, Ari Kobren, Andrew McCallum
07 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
Large-scale author coreference, the problem of ascribing research papers to real-world authors in bibliographic databases, is critical for mining the scientific community. However, traditional pairwise approaches, which measure coreference similarity between pairs of author mentions, scale poorly to large databases; and streaming approaches, which lack the ability to retroactively correct errors, can suffer from chronically low accuracy. In this paper we present a hierarchical model for solving author coreference that overcomes these issues. First, our model enables scalability over rich entity representations by compactly organizing the mentions of each author into trees. Second, we employ Markov chain Monte Carlo (MCMC) inference which is able to retroactively correct existing coreference errors when processing new mentions. We validate these two properties empirically, and demonstrate further scalability through asynchronous parallel MCMC (allowing us to scale to all 150,000,000 author mentions in Web of Science).

A New Dataset for Fine-Grained Citation Field Extraction
Sam Anzaroot, Andrew McCallum
09 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
Citation field extraction entails segmenting a citation string into its constituent parts, such as title, authors, publisher and year. Despite the importance of this task, there is a lack of well-annotated citation data. This paper presents a new labeled dataset for citation extraction that, in comparison to the previous standard dataset, contains more than four times as much data, supplies detailed nested labels rather than coarse-grained flat labels, and is derived from four different academic fields rather than one. We describe our new dataset in detail, and provide baseline experimental results from a state-of-the-art extraction method.

Two Publication Models
David McAllester
20 May 2013 ICML 2013 Workshop on Peer Reviewing and Publishing Models
This white paper presents two publication models: the cycle model used in ICML 2013 and the blog model, which can be implemented independently of formal conferences and journals.