Here’s a list of papers accepted to our workshop. Our OpenReview venue can be accessed here, where reviews and meta-reviews of accepted papers are available.

In defense of the paper ORAL
Owen Lockwood
The root cause of hindrances in the accessibility of machine learning research lies not in the paper workflow but within the misaligned incentives behind the publishing and research processes.
The machine learning publication process is broken, of that there can be no doubt. Many of these flaws are attributed to the current workflow: LaTeX to PDF to reviewers to camera-ready PDF. This has understandably resulted in the desire for new forms of publications, ones that can increase inclusivity, accessibility and pedagogical strength. However, this venture fails to address the origins of these inadequacies in the contemporary paper workflow. The paper, being the basic unit of academic research, is merely how problems in the publication and research ecosystem manifest, but is not itself responsible for them. Not only will simply replacing or augmenting papers with different formats not fix existing problems; when used as a band-aid without systemic changes, doing so will likely exacerbate the existing inequities. In this work, we argue that the root cause of hindrances in the accessibility of machine learning research lies not in the paper workflow but within the misaligned incentives behind the publishing and research processes. We discuss these problems and argue that the paper is the optimal workflow. We also highlight some potential solutions for the incentivization problems.
Curating Publications as Artefacts — Exploring Machine Learning Research in an Interactive Virtual Museum ORAL
Beatrice Gobbo, Mennatallah El-Assady
Envisioning a Machine Learning Papers Museum as a metaphorical solution for exploring and disseminating scientific publications.
The need for innovating scientific publications is felt across various research fields. Over the last thirty years, the publishing process has been accelerating, and new research is coming out every day. In Machine Learning, some attempts have been made to keep track of recent publications and to enhance open-access articles by allowing authors to integrate multimedia content. However, it is still hard to compare selected articles exhaustively. Thus, we envision a Machine Learning Papers Museum as a metaphorical solution for providing a digital space where users can explore publication collections in guided and serendipitous ways.
Machine learning research communication via illustrated and interactive web articles ORAL
J Alammar
We describe a workflow for creating a spectrum of machine learning research communication artifacts optimized to maximize the clarity of scientific communication, advance the fronts of explainability and interpretability, as well as empower the community to reproduce research software.
The recent explosion in machine learning research activity poses challenges both for researchers who aim to disseminate their work widely and for readers who find it challenging to keep up with the onslaught of new research ideas. In this paper, we describe a workflow for creating a spectrum of machine learning research communication artifacts optimized to maximize the clarity of scientific communication, advance the fronts of explainability and interpretability, as well as empower the community to reproduce research software. The workflow describes creating a spectrum of communication artifacts including visuals, animations, interactive explorables, reproducible notebooks, open-source software, and software packages. The articles produced by this workflow have explained cutting-edge ML research to a large audience and were read over three million times.
I❤️LA -- Compilable Markdown for Linear Algebra ORAL
Yong Li, Shoaib Kamil, Alec Jacobson, Yotam Gingold
We propose an exhibit on our new programming language I❤️LA which compiles markdown-like linear algebra code to C++, Python or LaTeX.
I❤️LA is a compilable markdown for math. It can generate working code in C++ and Python (and more to come). The same I❤️LA code can also generate LaTeX which is in turn rendered as beautifully typeset math. The I❤️LA code creates a publishable artifact and reference implementations. We believe I❤️LA can improve expositional clarity, code reproducibility and interoperability, and scientific education. We focus our initial efforts on mathematical expressions found in the wild within the Computer Graphics community, but plan to extend our work to the larger Machine Learning and Computer Science community. In our proposed Rethinking ML Papers exhibit, we will demonstrate results of our large-scale study and an interactive demo application.
You Only Write Thrice -- Creating Documents, Computational Notebooks and Presentations From a Single Source ORAL
Kacper Sokol, Peter Flach
How to automatically generate interactive documents, slides and computational notebooks from a single markdown source.
Academic trade requires juggling multiple variants of the same content published in different formats -- manuscripts, presentations, posters and computational notebooks. The need to track versions to accommodate the write--review--rebut--revise life-cycle adds another layer of complexity. We propose to significantly reduce this burden by maintaining a single source document in a version-controlled environment (such as git), adding functionality to generate a collection of output formats popular in academia. To this end, we utilise various open-source tools from the Jupyter scientific computing ecosystem and operationalise selected software engineering concepts. We offer a proof-of-concept workflow that composes Jupyter Book (an online document), Jupyter Notebook (a computational narrative) and reveal.js slides from a single markdown source file. Hosted on GitHub, our approach supports change tracking and versioning, as well as a transparent review process based on the underlying code issue management infrastructure.
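The single-source idea can be illustrated with a toy sketch. This is not the authors' toolchain: the helper names (`split_sections`, `to_document`, `to_slides`) and the sample text are hypothetical, and the sketch assumes that every line starting with `#` is a heading (so it would mis-split Markdown containing code comments). It derives two outputs from one Markdown string, a continuous document and a slide deck whose slides are separated by `---` lines.

```python
# Toy single-source pipeline: one Markdown string, two derived outputs.
# All names here are hypothetical and only illustrate the idea.

SOURCE = """# Title

Intro paragraph.

## Method

Details of the method.

## Results

Summary of results.
"""


def split_sections(markdown: str) -> list[str]:
    """Split Markdown into chunks, starting a new chunk at each heading line."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks


def to_document(markdown: str) -> str:
    """The document output keeps the source as one continuous text."""
    return markdown.strip() + "\n"


def to_slides(markdown: str) -> str:
    """The slide output places each section on its own slide."""
    return "\n\n---\n\n".join(split_sections(markdown)) + "\n"


print(to_slides(SOURCE))
```

Because both outputs are pure functions of the same source, a change tracked in version control propagates to every format when the outputs are regenerated.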
Exhibit -- Converting my PhD thesis to HTML
Damien Desfontaines
Converting large LaTeX docs to HTML is hard and it really shouldn't be
I converted my PhD thesis, titled Lowering the cost of anonymization, into HTML. The motivation and process are described in a blog post, and I was told that this work could constitute an interesting exhibit submission to the 2021 Rethinking ML Papers workshop at ICLR.
Diagrammatic summaries for neural architectures
Guy Clarke Marshall, Caroline Jay, André Freitas
Diagrams are effective for communicating about ML systems and should be placed centrally within the publication process
This paper advocates for diagrammatic summary publications for machine learning system architecture papers. We review existing diagram-centric scholarly practices, and summarise relevant studies on neural network system architecture diagrams. We subsequently propose three opportunities: diagram guidelines, diagrammatic system summary publications, and the community creation of formal diagram standards, which could be integrated with existing LaTeX + PDF publication processes.
ModulOM -- Disseminating Deep Learning Research with Modular Output Mathematics
Maxime Istasse, Kim Mens, Christophe De Vleeschouwer
Solving a task with a deep neural network requires an appropriate formulation of the underlying inference problem. A formulation defines the type of variables output by the network, but also the set of variables and functions, denoted output mathematics, needed to turn those outputs into task-relevant predictions. Despite the fact that the task performance may largely depend on the formulation, most deep learning experiment repositories do not offer a convenient solution to explore formulation variants in a flexible and incremental manner. Software components for neural network creation, parameter optimization or data augmentation, in contrast, offer some degree of modularity that has proved to facilitate the transfer of know-how associated with model development. But this is not the case for output mathematics. Our paper proposes to address this limitation by embedding the output mathematics in a modular component as well, by building on multiple inheritance principles in object-oriented programming. The flexibility offered by the proposed component and its added value in terms of knowledge dissemination are demonstrated in the context of the Panoptic-Deeplab method, a representative computer vision use case.
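A minimal sketch of how multiple inheritance can make a formulation modular is given below. It is not the ModulOM API: every class, method and dictionary key is hypothetical, and the "network outputs" are hand-written numbers for a single pixel. Each class contributes one piece of output mathematics, and a variant is explored by adding one small class rather than editing the existing ones.

```python
# Hypothetical sketch of modular "output mathematics" via multiple inheritance.
# None of these names come from ModulOM; the outputs are made-up numbers.


class OutputMath:
    """Base component: stores raw network outputs by name."""

    def __init__(self, outputs: dict):
        self.outputs = outputs


class SemanticHead(OutputMath):
    """Turns per-class logits into a class index."""

    def semantic_label(self) -> int:
        logits = self.outputs["semantic_logits"]
        return max(range(len(logits)), key=logits.__getitem__)


class CenterHead(OutputMath):
    """Decides whether a pixel is an instance center."""

    def has_center(self, threshold: float = 0.5) -> bool:
        return self.outputs["center_score"] > threshold


class PanopticLike(SemanticHead, CenterHead):
    """Combines both pieces into one task-level prediction."""

    def prediction(self) -> tuple:
        return (self.semantic_label(), self.has_center())


class StrictCenterHead(CenterHead):
    """Formulation variant: a stricter default threshold for centers."""

    def has_center(self, threshold: float = 0.9) -> bool:
        return super().has_center(threshold)


class StrictPanoptic(StrictCenterHead, PanopticLike):
    """Reuses PanopticLike.prediction but resolves has_center to the variant."""


outputs = {"semantic_logits": [0.1, 2.0, -1.0], "center_score": 0.7}
print(PanopticLike(outputs).prediction())    # center accepted at threshold 0.5
print(StrictPanoptic(outputs).prediction())  # center rejected at threshold 0.9
```

The variant works because Python's method resolution order places `StrictCenterHead` ahead of `CenterHead`, so `prediction` picks up the stricter rule without being rewritten.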
SPICES -- Survey papers as interactive cheatsheet embeddings
Vinay Uday Prabhu, Matthew McAteer, Ryan Teehan
This paper presents a procedure and a gallery demonstrating formatting of survey papers as interactive cheat-sheet embeddings
Papers are hard to write. Survey papers are just that much harder. From the authors' perspective, challenges include the responsibility to not erase important work being done by (sometimes) adversarially aligned research groups, finding the right semantic clustering to sub-categorize individual contributions, controlling for the verbosity and length of the final paper, ensuring an optimal mixing of personal opinion and the innate narratives in the paper(s) being cited, version controlling, ease of updating, and also the aesthetics of presentation. From the reader's viewpoint, challenges include ease of reading, single-snapshot summarizability, portability, and being given the agency to edit or fork their own copies. Taking cues from the emergence of the cheat-sheet culture in machine learning and the virtues of living editable documentation and version control, we propose an interactive and live SVG-format-based methodology that we term SPICE -- Survey Papers as Interactive Cheat-sheet Embedding. We cover the technical details behind constructing SPICEs and present an example gallery covering 'hot button' areas in machine learning such as out-of-distribution detection, the 'All you need' histrionics and Transformer architectures.
On augmenting the references section with a citation network visualization
Putra Manggala, Tigran Atoyan, Gracia Samosir, Jan Varsava, Johannes Ruf
Augmenting the references section with a citation network visualization, based on the outcomes of a user experience study.
Researchers often find relevant articles by looking at the references section. We conducted user interviews with researchers about their workflow and their needs when carrying out literature research. Based on this study, we identify a set of problems encountered by researchers. We then propose to embed classified citation networks into articles as a solution, and to complement this graph with optional comments about references by authors. We demonstrate this idea by implementing it for this article. We argue that our solution helps increase inclusivity and improves the efficiency of reading scientific articles.
Fairness and Friends
Falaah Arif Khan, Eleni Manis, Julia Stoyanovich
Fairness and Friends is the second volume of the 'Data, Responsibly' comic series, covering issues of bias in algorithmic systems, fairness in machine learning and broader doctrines of equality of opportunity and justice from political philosophy.
Recent interest in codifying fairness in Automated Decision Systems (ADS) has resulted in a wide range of formulations of what it means for an algorithm to be "fair". Most of these propositions are inspired by, but inadequately grounded in, scholarship from political philosophy. This comic aims to correct that deficit. We begin by setting up a working definition of an 'Automated Decision System' (ADS) and explaining 'bias' in outputs of an ADS. We then critically evaluate different definitions of fairness as Equality of Opportunity (EOP) by contrasting their conception in political philosophy (such as Rawls's fair EOP and formal EOP) with the proposed codification in Fair-ML (such as statistical parity, equality of odds and accuracy) to provide a clearer lens with which to view existing results and to identify future research directions. We use this framing to reinterpret the impossibility results as the incompatibility between different EOP doctrines and demonstrate how political philosophy can provide normative guidance as to which notion of fairness is applicable in which context. We conclude by highlighting justice considerations that the fair-ML literature currently overlooks or underemphasizes, such as Rawls's broader theory of justice, which supplements his EOP principle with a principle guaranteeing equal rights and liberties to all citizens in a free and democratic society.
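To make two of the codified notions concrete, the sketch below computes them on made-up data. It assumes the common definitions: statistical parity compares the rate of positive decisions across groups, and equality of odds compares true-positive and false-positive rates across groups. The function names and the eight data points are hypothetical and are not taken from the comic.

```python
# Toy illustration of two Fair-ML criteria on invented data.
# Function names and data are hypothetical.


def positive_rate(preds, keep):
    """Fraction of positive predictions among entries where keep is True."""
    kept = [p for p, k in zip(preds, keep) if k]
    return sum(kept) / len(kept)


def group_rates(y_true, y_pred, group, name):
    """Selection rate, true-positive rate and false-positive rate for one group."""
    in_group = [g == name for g in group]
    positives = [g and t == 1 for g, t in zip(in_group, y_true)]
    negatives = [g and t == 0 for g, t in zip(in_group, y_true)]
    return {
        "selection": positive_rate(y_pred, in_group),
        "tpr": positive_rate(y_pred, positives),
        "fpr": positive_rate(y_pred, negatives),
    }


# Invented labels and decisions for two groups of four individuals each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group = ["a"] * 4 + ["b"] * 4

rates_a = group_rates(y_true, y_pred, group, "a")
rates_b = group_rates(y_true, y_pred, group, "b")

parity_gap = abs(rates_a["selection"] - rates_b["selection"])
tpr_gap = abs(rates_a["tpr"] - rates_b["tpr"])
fpr_gap = abs(rates_a["fpr"] - rates_b["fpr"])

print("statistical parity gap:", parity_gap)
print("equality of odds gaps (TPR, FPR):", tpr_gap, fpr_gap)
```

In this toy case both groups receive positive decisions at the same rate, so statistical parity holds, yet their error rates differ, so equality of odds does not. It is only a small numerical instance of criteria disagreeing, not a proof of the impossibility results.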
Scientific dissemination via comic strip -- A case study with SacreBLEU
Matt Post
Consider using a comic strip for your paper's poster presentation.
Comic strips are a naturally appealing medium which provides a visually attractive means for situating scientific results within a narrative. Although they may not be relevant to all situations and can be time-consuming to produce, they also provide unique opportunities for humor and levity that may be an important tool in disseminating a paper and convincing an audience of its merits. Furthermore, their decomposition into panels makes it easy to annotate them using standard accessibility tools for images. This paper presents the case for presenting scientific posters as comic strips, using the author's 2018 SacreBLEU poster as a motivating example.
Convolution Can Incur Foveation Effects
Jun Yuan, Bilal Alsallakh, Narine Kokhlikyan, Vivek Miglani, Orion Reblitz-Richardson
An interactive visualization to illustrate potential foveation effects incurred during convolution
This exhibit demonstrates how boundary treatment in convolutional networks can incur foveation effects: impacted pixels have fewer ways to contribute to the computation than central pixels. Different padding mechanisms can either eliminate or aggravate these effects, which is made obvious by an interactive visualization.
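The effect can be reproduced with a simple counting model. The sketch below is not the exhibit's code: `contribution_counts` is a hypothetical helper that assumes a 1-D signal, a stride-1 convolution and zero padding, and counts how many output positions read each input position.

```python
import numpy as np


def contribution_counts(size: int, kernel: int, padding: int = 0) -> np.ndarray:
    """For each input position, count the stride-1 windows that cover it."""
    counts = np.zeros(size, dtype=int)
    n_out = size + 2 * padding - kernel + 1
    for out in range(n_out):
        start = out - padding  # leftmost input index covered by this window
        for idx in range(start, start + kernel):
            if 0 <= idx < size:  # positions outside the signal are padding
                counts[idx] += 1
    return counts


valid = contribution_counts(7, 3, padding=0)  # no padding
same = contribution_counts(7, 3, padding=1)   # output length equals input length
full = contribution_counts(7, 3, padding=2)   # every partial overlap is kept

print("valid:", valid)
print("same: ", same)
print("full: ", full)
```

Under these assumptions, boundary positions are read by fewer windows than central ones with no padding, the imbalance shrinks with one pixel of padding, and it disappears once padding reaches kernel size minus one. For a square kernel in 2-D, the count map under the same model is the outer product of the 1-D counts, for example `np.outer(valid, valid)`.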
Open-source blogging with Automunge
Nicholas Teague
Exhibits demonstrating best practices for multimedia explanatory communications from the Automunge library.
The developers of the Automunge open source platform for tabular data preprocessing have taken a somewhat unorthodox approach to documentation and communications, making use of multimedia, blogging, tweets, Jupyter notebooks, as well as music and photography in publication. This submission will offer an exhibited excerpt of such communication practices, featuring elements of multimedia videos with narration, accompanied by hand-drawn slides and transcript, presented as both a brief introduction and extended walkthrough. We believe this form of presentation is a low-cost option for communicating complex subject matter in a concise and accessible form. Further examples are also provided in the references.
The Mimosa Manifesto
Lana Sinapayen
I present a web platform ("Mimosa") for open collaboration in science, designed to ultimately replace journal papers and peer review.
Science is a debate. Debates happen where there is wiggle room for interpretation. There is no debate when all parties agree, or when all parties know why they disagree. Scientific debates can be settled by agreeing on an experimental protocol. Good protocols identify wiggle room and preemptively get rid of it, by fixing the interpretation of experimental results before the experiment proceeds. “Are doctors transmitting deadly illnesses from cadavers to birthing mothers? Have some doctors wash their hands after autopsies. Let us agree that if their patients have better survival rates than usual, it means that infections travel on the hands of doctors (Carter 1985).” Experimental results might tell you which way the settlement goes, but ideally the debate itself ends with the protocol. From this point of view, Science is the art of defining convincing protocols -- scientific papers are more interesting and more rigorous when they are written by two people who start out genuinely disagreeing. Mimosa is an attempt at harnessing both support and disagreement in science into a productive, collaborative format. Mimosa also tries to address many of the numerous recognised issues within the current format for sharing science, born at a different time and for the wrong reasons. When it first started, Wikipedia was greeted with suspicion. It is now a major platform for finding information, used by all demographics. Wikipedia has a famous rule -- “No original research.” Mimosa aspires to be that free, open-collaborative online platform created and maintained by a community of volunteer contributors, dedicated to original research.
Interactive Media for Understanding ML Methods -- A Case-Study on Graph Neural Networks
Ameya Daigavane, Balaraman Ravindran, Gaurav Aggarwal
Interactive media can be very useful for understanding the fundamentals of graph neural networks.
We demonstrate the advantages of an interactive medium for explaining the key principles and mathematical machinery behind graph neural networks. We discuss the challenges we faced while creating an expository article on this topic using interactive elements.