Agenda Presto Community Roadmap Discussion 3.9.2016
Martin Traverso edited this page Apr 15, 2016
- Facebook to summarize their roadmap for 2016
- Teradata to summarize their roadmap for 2016
- What would the community like to see that hasn't been mentioned in roadmap plans?
- Discuss how to provide a unified roadmap for which the community can collaborate
- Can we get the Pull Requests to a manageable number?
- Need the community to help go through and do initial reviews, suggest stale PRs to close, etc.
- Should we have a recurring meeting?
- Should we try communication methods other than IRC and the mailing list?
- Meetups
Features:
- Grouping sets, aggregation
- Additional data type support: int, smallint, ...
- How to run UDFs that people wrote for Hive in Presto
- Improve performance: we don’t see the gains we expected vs. Hive.
- Custom version of ORC writer: fast implementation
- Scalability issues, stability issues
- Improve resource management: clients, users
- Implement optimizations that allow large workloads to run
- Internal projects: migrate pipelines to Presto — based on Presto running internal data stores
- Materialized query tables (can be viewed as materialized views) to speed up queries
- Use cases to run on Raptor
- Make Presto more robust, stable and scalable
- Make query engine understand the physical layout of the data so that queries can run efficiently
Teradata
In Progress:
- Decimal
- Kerberos
- Non-equi joins
- More subquery support
- Grant privileges
- JDBC version 1 (4.0, 4.1, 4.2)
- ODBC version 2
2016:
- Spill to disk
- Community Continuous Integration
- BI Tool Integration / Certification
- Broader SQL Support -- e.g., correlated subqueries, TPC-DS and TPC-H
- Performance
- Security
Uber
- Geospatial Functions
- Nested Schema Evolution for Parquet
- New Parquet Reader
- Projection Pushdown for Structs in Parquet
- Upgrade to Parquet 1.8.1
Netflix
- Better resource management & scheduling: Workload isolation between interactive & non-interactive queries
- BI support: Tableau, MicroStrategy, etc. Happy to help with improving the support.
- Improve S3 integration: Assume role support.
- Improve Parquet support
Amazon
- Put up sandbox because of customers’ support
- Authorization
- Parquet reader improvements (e.g., filter pushdown for nested datasets)
- Support for LZO/Thrift
- Create Wiki Page for Presto developers. High level roadmap and "epics" linked to GitHub issues to make it easier to see where to get started. DONE: https://github.com/prestodb/presto/wiki/Roadmap