Research Interests

My research focuses broadly on data-oriented systems and the way they drive computing. Recently this includes distributed programming models, serverless computing, distributed consistency and isolation, data management for machine learning and data science, interactive data visualization and transformation, and query processing. My research is driven by collaborations with colleagues in a wide variety of fields including Programming Languages, Human-Computer Interaction, AI, Networking, Security, and Theoretical Computer Science.

Current Projects

Distributed Systems: The Hydro project is developing new techniques for the programmable cloud. Sub-projects include:
Data Management for Machine Learning: The machine learning lifecycle presents many data management problems.
Interactive Data Visualization: Data visualization systems merge language design, data processing and asynchronous event processing in service of human-centric data interaction. Current projects include:
Past Projects

BOOM and : Orders Of Magnitude simpler code for the Cloud.
d^p ("deep"): Data to the People, led to Trifacta, Captricity and MADlib.
BayesStore: Probabilistic data management.
Declarative Networking and the P2 system.
Querying, monitoring, and networking using wireless sensor networks.
PIER: A peer-to-peer query engine based on distributed hash table (DHT) overlay technologies.
Telegraph: An Adaptive Dataflow System for networked data and services.
TinyDB: A query processing engine for ad-hoc wireless sensor networks.
CONTROL: Interactive Analysis of Massive Datasets, including online aggregation, online data cleaning (Potter's Wheel), online data mining and scalable spreadsheets.
GiST: Generalized Indexing (GiST for PostgreSQL, libgist), Access Method Profiling and Debugging (amdb), and Indexability.

Open Source Software