Poonam Kumari

Online Data Interaction Lab
University at Buffalo

Contact:
12 Flickinger ct, Apt F, Amherst, NY, 14228
poonamku at buffalo dot edu

Research

My research focuses on visualizing uncertainty in incomplete databases to help users make informed decisions. This page describes my current research directions.

I am fortunate to work with exceptional collaborators Oliver Kennedy, William Spoth, Gourab Mitra, and Lisa Lu.

Current Research

Visualizing uncertainty in incomplete databases

Going from a raw dataset to an analytical answer involves extensive data preparation, during which analysts identify problems or unexpected structural features of the data. Some 'optimistic' data analytics tools (e.g., Pandas) automate aspects of this process through simple heuristics optimized for the common case (e.g., ignore malformed source data). Ironically, this automation often requires more from users, as they must now manually inspect the source data for potential errors that might be obscured by the system's heuristics. In this paper, we explore the design of an interface that carries the benefits of both optimistic systems (i.e., easy access to answers assuming common-case heuristics are valid) and pessimistic systems (i.e., greater trust knowing that values shown to the user are error-free). We fit the resulting interface into a recently proposed data preparation and exploration system called Vizier, which links documentation (e.g., of potential errors) to fragments of affected data. This additional information reduces the analyst's workload by focusing attention on assumptions relevant to the analysis, but risks biasing any decision the user makes.
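To make the contrast concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea, not Vizier's interface or Pandas' implementation: the names `optimistic_mean`, `annotated_mean`, and `AnnotatedResult`, as well as the sample readings, are hypothetical. The first function applies the "ignore malformed source data" heuristic silently; the second applies the same heuristic but keeps a note for every affected row, so the answer stays easy to obtain while the assumptions behind it remain visible.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedResult:
    """An answer plus notes on the rows the heuristic had to skip (hypothetical type)."""
    value: float
    caveats: list = field(default_factory=list)  # (row_index, raw_value, note)


def optimistic_mean(raw_values):
    """Optimistic: silently ignore anything that does not parse as a number."""
    parsed = []
    for raw in raw_values:
        try:
            parsed.append(float(raw))
        except (TypeError, ValueError):
            continue  # the malformed value disappears without a trace
    return sum(parsed) / len(parsed)


def annotated_mean(raw_values):
    """Same heuristic, but every skipped row is recorded alongside the answer."""
    parsed, caveats = [], []
    for index, raw in enumerate(raw_values):
        try:
            parsed.append(float(raw))
        except (TypeError, ValueError):
            caveats.append((index, raw, "skipped: not a number"))
    return AnnotatedResult(sum(parsed) / len(parsed), caveats)


# Hypothetical sample data: two of the five readings are malformed.
readings = ["10", "20", "n/a", "30", ""]
plain = optimistic_mean(readings)
annotated = annotated_mean(readings)

print(plain)
print(annotated.value)
for row, raw, note in annotated.caveats:
    print(f"row {row}: {raw!r} {note}")
```

Both functions return the same number; only the second tells the analyst which rows were dropped to produce it. For brevity, the sketch does not handle the case where every value is malformed.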

We explore how different visualization techniques affect perceived data quality, accuracy, and decision confidence through IRB-approved user studies and interviews with users and analysts.

Poonam Kumari. Make Informed Decisions: Understanding Query Results from Incomplete Databases. Proceedings of the VLDB 2019 PhD Workshop.
Poonam Kumari and Oliver Kennedy. The Good and Bad Data. Proceedings of the VLDB Endowment 2017.
Poonam Kumari, Said Achmiz and Oliver Kennedy. Communicating Data Quality in On-Demand Curation. Proceedings of the 11th VLDB Workshop on Quality in Databases 2016, VLDB 2016.

Loki: Streamlining Integration and Enrichment

Data scientists frequently transform data from one form to another while cleaning, integrating, and enriching datasets. Writing such transformations, or "mapping functions," is time-consuming and often involves significant code re-use. Unfortunately, when every dataset is slightly different from the last, finding the right mapping functions to re-use can be equally difficult. In this paper, we propose "Link Once and Keep It" (Loki), a system that maintains a repository of datasets and mapping functions and relates new datasets to datasets it already knows about, helping a data scientist quickly locate and re-use mapping functions she developed for other datasets in the past. Loki represents a first step towards building and re-using repositories of domain-specific data integration pipelines.
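The idea of a repository that relates a new dataset to known ones can be sketched in a few lines of Python. This is not Loki's actual design: the `MappingRepository` class, the use of Jaccard similarity over column names as the matching signal, the 0.5 threshold, and the example datasets are all assumptions made purely for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of column names."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


class MappingRepository:
    """Hypothetical store of known datasets and the mapping functions written for them."""

    def __init__(self):
        self._entries = []  # (dataset_name, columns, mapping_functions)

    def register(self, name, columns, mappings):
        self._entries.append((name, frozenset(columns), list(mappings)))

    def suggest(self, columns, threshold=0.5):
        """Return (similarity, dataset_name, mappings) for similar known datasets, best first."""
        scored = [
            (jaccard(columns, known), name, mappings)
            for name, known, mappings in self._entries
        ]
        matches = [entry for entry in scored if entry[0] >= threshold]
        return sorted(matches, key=lambda entry: entry[0], reverse=True)


def cents_to_dollars(cents):
    """Example mapping function written for an earlier dataset."""
    return cents / 100


def fahrenheit_to_celsius(temp_f):
    """Example mapping function for an unrelated dataset."""
    return (temp_f - 32) * 5 / 9


repo = MappingRepository()
repo.register("taxi_2018", ["pickup_time", "dropoff_time", "fare"], [cents_to_dollars])
repo.register("weather", ["station", "temp_f"], [fahrenheit_to_celsius])

# A new dataset that differs slightly from one seen before.
matches = repo.suggest(["pickup_time", "dropoff_time", "fare", "tip"])
for similarity, name, mappings in matches:
    print(name, similarity, [fn.__name__ for fn in mappings])
```

Here the new dataset shares three of four column names with `taxi_2018`, so that dataset's mapping functions are suggested for re-use, while the unrelated `weather` entry falls below the threshold. A real system would likely need a richer notion of similarity than column names alone.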

William Spoth, Poonam Kumari, Oliver Kennedy and Fatemeh Nargesian. Loki: Streamlining Integration and Enrichment. HILDA: Workshop on Human-In-the-Loop Data Analytics, 2020.