2024 is almost over. Time to define the 2025 features. Here are some that come to mind. Feel free to add more or comment.
For more information about what was achieved on the 2024 roadmap, please see #4709.
Velox backend:
- Spark 4.0 support, with JDK 11 as the minimum supported JDK, in Gluten 1.5; deprecate Spark 3.2 in Gluten 1.4 and JDK 8 in Gluten 1.5 ([CORE] Spark 4.0 support #8852)
- Spark 3.3.x/3.4.x/3.5.x upgrade
- Switch to upstream Velox's official release ([VL] upstream OAP/Velox commits to upstream #8782)
- Stage-level resource management (Stage level resource management to handle offheap/onheap memory conflict #4392)
- Spill enhancement and performance ([VL] Spill related issues tracker #3030)
- Hash table broadcast support in BHJ (https://docs.google.com/document/d/1upEby9aJnBcMKul7HS5agSr1zwFR63Iq4QL1nYxuFvU/edit?usp=sharing)
- Dictionary support in shuffle ([VL] Dictionary support in shuffle #8855)
- Multiple cores per task ([VL] Use multiple threads in the same executor #7810)
- PySpark support, including Python UDF/Arrow UDF
- Support for other accelerators such as GPU and FPGA ([Core] Add GPU support in Gluten #8851)
- ARM architecture support
- Stability improvements, full fuzzer support, and result mismatch fixes ([VL] Result mismatch issues tracker #4652)
- Spark coverage, including operators and functions
- Full data lake support
- File format support (JSON)
- Performance improvement: Spill, Sort, HashAgg, HashJoin, ColumnarShuffle, and others
- Complex type in R2C ([VL] row2VeloxColumnar bad performance #7223)
- Gluten Indicator
- Gluten + Flink + Velox POC (experimental)
- ANSI support
- Query trace
- Pushdown scan and filter to remote storage
Community:
- Major releases: 1.4/1.5/1.6/1.7 in Q1/Q2/Q3/Q4 2025
- Minor releases: on demand
- Apache TLP submission in Q4 2025