Immanuel Trummer

I am an assistant professor of computer science at Cornell University and head of the Cornell Database Group. My research focuses on making data analysis more efficient (e.g., by leveraging reinforcement learning for query planning or approximate processing methods) and on making data access more user-friendly (e.g., via voice interfaces or special-purpose natural language query interfaces).

I am always looking for outstanding students who combine a strong formal background with excellent implementation skills. If interested, send an email with your CV to itrummer@cornell.edu.

Current Projects

I am currently working on the following projects:

SkinnerDB
We exploit reinforcement learning to select (near-)optimal query plans. We use neither cost or cardinality models nor data statistics. Instead, we divide query execution into micro episodes and try different plans in different episodes. By measuring evaluation progress per time unit and optimally balancing exploration and exploitation in plan selection, we guarantee near-optimal expected execution cost for queries on large data. See here for more details, talks, and source code.
CiceroDB
We are working on several research projects around voice-based data access. Those projects fall into three broad categories: research on how to interpret user speech input more reliably, research on optimally summarizing trends in query results via voice output ("data vocalization"), and research on specializing query processing to voice interfaces (e.g., by interleaving system speaking time with processing time). Research outcomes are integrated into CiceroDB, a novel DBMS designed from the ground up for voice-based data access. See here for more details, talks, and publications.
AggChecker
Data are often summarized via text documents; examples include newspaper articles by data journalists, business reports, and scientific papers. Mistakes in data summaries often go unnoticed as there is no time to verify each claim. In collaboration with Google NYC, we created a system, similar to a spell checker, that verifies consistency between natural language claims and relational data sets. See here for a live demo, talks, and publications.
Optimizer Testing
This line of work aims to support developers of query optimizers. To assess the quality of query optimizers (and to identify candidate areas for improvement), it is useful to compare plans produced by an optimizer to guaranteed optimal plans. Guaranteed optimal plans are difficult to find since we cannot rely on the optimizer's cost or cardinality model. We are developing approaches to find guaranteed optimal plans via offline optimization. This requires executing plans or plan fragments to obtain guaranteed bounds on execution costs and intermediate result sizes. See here for more details, talks, and publications.
This list does not include several projects (in the areas of deterministic approximation and automated fact checking) that we started recently.
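
To illustrate the idea behind the SkinnerDB description above (trying different plans in different micro episodes and balancing exploration against exploitation), here is a minimal sketch in Python. It is not the SkinnerDB implementation: it swaps in a plain UCB1 bandit over a small fixed set of plans, and the per-episode progress is simulated rather than measured. The names `ucb1_select`, `run_episodes`, and `progress_rates` are hypothetical and exist only for this example.

```python
import math
import random


def ucb1_select(counts, rewards, total):
    """Return the index of the plan with the highest UCB1 score.

    Plans that have never been tried are selected first.
    """
    for i, c in enumerate(counts):
        if c == 0:
            return i
    return max(
        range(len(counts)),
        key=lambda i: rewards[i] / counts[i]
        + math.sqrt(2 * math.log(total) / counts[i]),
    )


def run_episodes(progress_rates, episodes, seed=0):
    """Simulate micro episodes and return how often each plan was chosen.

    progress_rates: hypothetical mean progress per episode for each plan,
    each in [0, 1]. A real system would measure progress instead.
    """
    rng = random.Random(seed)
    counts = [0] * len(progress_rates)
    rewards = [0.0] * len(progress_rates)
    for t in range(1, episodes + 1):
        plan = ucb1_select(counts, rewards, t)
        # Simulated reward: 1 if the episode made progress, else 0.
        reward = 1.0 if rng.random() < progress_rates[plan] else 0.0
        counts[plan] += 1
        rewards[plan] += reward
    return counts


if __name__ == "__main__":
    # Three hypothetical plans; the second makes progress most often.
    print(run_episodes([0.1, 0.8, 0.3], episodes=2000))
```

Under these assumptions, every plan is tried at least once, and over many episodes most of the execution time goes to the plan with the highest simulated progress rate.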

PhD Students

I am currently working with the following PhD students:

  • Saehan Jo
  • George Karagiannis
  • Jialing Pei
  • Ziyun Wei

Teaching

I regularly teach the following courses at Cornell:

  • CS 4320: Introduction to Database Systems
  • CS 4321: Practicum in Database Systems
  • CS 6320: Advanced Database Systems
  • CS 7390: Seminar in Database Systems

More details about those courses can be found here.

Service

I serve(d) in the following capacities:

  • I recently started serving as associate editor for SIGMOD Record
  • I serve(d) as reviewer for SIGMOD 2018, 2019, and 2020
  • I serve(d) as reviewer for VLDB 2018, 2019, and 2020

Publications

2020

  • CIDR 2020 BitGourmet: deterministic approximation via optimized bit selections. Saehan Jo, Immanuel Trummer.

2019

  • VLDB 2019 AggChecker: a fact-checking system for text summaries of relational data sets. Saehan Jo, Immanuel Trummer, Weicheng Yu, Xuezhi Wang, Cong Yu, Daniel Liu, Niyati Mehta.
  • SIGMOD 2019 SkinnerDB: Regret-bounded query evaluation via reinforcement learning. Immanuel Trummer, Junxiong Wang, Deepak Maram, Samuel Moseley, Saehan Jo, Joseph Antonakakis.
  • SIGMOD 2019 A holistic approach for query evaluation and result vocalization in voice-based OLAP. Immanuel Trummer, Yicheng Wang, Saketh Mahankali.
  • SIGMOD 2019 Exact cardinality query optimization with bounded execution cost. Immanuel Trummer.
  • SIGMOD 2019 Verifying text summaries of relational data sets. Saehan Jo, Immanuel Trummer, Weicheng Yu, Xuezhi Wang, Cong Yu, Daniel Liu, Niyati Mehta.
  • CIDR 2019 Data Vocalization with CiceroDB. Immanuel Trummer.

Funding

I gratefully acknowledge funding from the following sources:

  • NSF-1910830: regret-bounded query evaluation via reinforcement learning
  • Google Faculty Research Award for data vocalization
  • Google Faculty Research Award for building an "anti-knowledge base"
  • Support by Huawei for research on deterministic approximation

News 2019

News 2018

  • Our paper "Data vocalization with CiceroDB" by Immanuel Trummer was accepted at CIDR 2019!
  • Our paper "Verifying text summaries of relational data sets" by Saehan Jo et al. was accepted at SIGMOD 2019!
  • Our paper "SkinnerDB: regret-bounded query evaluation via reinforcement learning" by Immanuel Trummer et al. was accepted at SIGMOD 2019!
  • Our paper "A holistic approach for query evaluation and result vocalization in voice-based OLAP" by Immanuel Trummer et al. was accepted at SIGMOD 2019!
  • Our paper "Exact cardinality query optimization with bounded execution cost" by Immanuel Trummer was accepted at SIGMOD 2019!
  • Congrats to Samuel Moseley for winning a VLDB 2018 NSF Travel Grant!
  • Congrats to Mark Bryan for winning an honorable mention for the 2018 CRA Outstanding Undergraduate Researcher Award!
  • Our paper "Vocalizing Large Time Series Efficiently" was accepted at VLDB 2018.
  • Our paper "SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning" was accepted at VLDB 2018.
  • A pre-print of our paper on computational fact checking is online.
  • Congrats to Mark Bryan for winning the JP Morgan BOOM Award 2018 for CiceroDB!