Predicate pushdown


Predicate pushdown is a query optimization technique in database management that improves performance by applying filters (also called selections) while data is being read from the storage layer. In contrast to traditional filtering, where a system loads an entire dataset and then removes irrelevant rows in memory, predicate pushdown applies the filter as close to the data source as possible. As a result, less data has to be transferred across the network and held in system memory.^[1]^[2]^[3]

Overview

In database management systems, a predicate is an expression that evaluates to a Boolean value (e.g., age > 21 in the clause WHERE age > 21).^[4] Early work on query optimization recognized that the primary bottleneck in query execution was often the latency of moving data from the physical disk into RAM.

With the advent of distributed computing and cloud-native storage (such as Amazon S3), this bottleneck was magnified because the entire dataset now had to travel across a network interface before it could be processed. Predicate pushdown addresses this by shifting the computational burden of filtering to the storage engine or remote node, ensuring only the relevant subset of data consumes bandwidth and memory.
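The difference between filtering after the read and filtering at the source can be sketched in a few lines of Python. This is a minimal illustration, not how any particular engine works: the CountingSource class, its scan method, and the sample rows are all hypothetical, and the "storage node" is just an in-memory list that counts how many rows it hands to the caller.

```python
from typing import Callable, Dict, Iterator, List, Optional

Row = Dict[str, object]


class CountingSource:
    """Hypothetical storage node that counts the rows it sends to the caller."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = list(rows)
        self.rows_sent = 0

    def scan(self, predicate: Optional[Callable[[Row], bool]] = None) -> Iterator[Row]:
        # When a predicate is supplied, it is evaluated here, at the source,
        # so rows that fail it are never sent.
        for row in self._rows:
            if predicate is None or predicate(row):
                self.rows_sent += 1
                yield row


def is_adult(row: Row) -> bool:
    return row["age"] > 21


people = [
    {"name": "Ada", "age": 36},
    {"name": "Bo", "age": 19},
    {"name": "Cy", "age": 52},
    {"name": "Di", "age": 21},
]

# Without pushdown: every row is sent, then filtered by the caller.
source = CountingSource(people)
late_filtered = [row for row in source.scan() if is_adult(row)]
rows_sent_without_pushdown = source.rows_sent

# With pushdown: the source applies the predicate before sending anything.
source = CountingSource(people)
early_filtered = list(source.scan(predicate=is_adult))
rows_sent_with_pushdown = source.rows_sent
```

Both approaches return the same rows; only the number of rows that leave the source differs, which is the quantity pushdown is meant to reduce.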

This technique applies predicates as early as possible in query execution and is closely associated with data skipping, where metadata is used to bypass irrelevant data blocks entirely.^[5]

Implementation

The effectiveness of a pushdown depends on the ability of the underlying storage format to understand and execute the predicate.

Metadata and data skipping

Modern columnar formats like Apache Parquet and ORC support pushdown by storing summary statistics in file metadata (for example, the Parquet file footer) for units such as whole files, Parquet row groups, or ORC stripes. These statistics, often called zone maps, record the minimum and maximum values of each column within a block.^[6] If a user queries WHERE price > 100 and a block's zone map shows a maximum price of 80, no row in that block can match, so the entire block is skipped without being read.
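The skipping decision for WHERE price > 100 can be sketched as follows. This is a simplified model under stated assumptions: the Block class, make_block, and scan_price_greater_than are hypothetical names, each block holds a single price column in memory, and real formats keep these statistics in file metadata rather than next to the values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical data block with a zone map (min/max) for one column."""
    prices: List[float]
    min_price: float
    max_price: float


def make_block(prices: List[float]) -> Block:
    # The statistics are computed once, when the block is written.
    return Block(prices=list(prices), min_price=min(prices), max_price=max(prices))


def scan_price_greater_than(blocks: List[Block], threshold: float) -> Tuple[List[float], int]:
    """Return matching prices and the number of blocks actually read."""
    matches: List[float] = []
    blocks_read = 0
    for block in blocks:
        # If even the largest value fails the predicate, no row can match,
        # so the block is skipped using its metadata alone.
        if block.max_price <= threshold:
            continue
        blocks_read += 1
        matches.extend(p for p in block.prices if p > threshold)
    return matches, blocks_read


blocks = [
    make_block([10, 45, 80]),     # max 80: skipped for price > 100
    make_block([95, 120, 150]),   # max 150: must be read
    make_block([30, 60, 101]),    # max 101: must be read
]
matches, blocks_read = scan_price_greater_than(blocks, 100)
```

Note that a zone map can only rule a block out. A block whose range overlaps the predicate still has to be read and filtered row by row, as the third block shows.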

Join and Bloom filter pushdown

In queries involving joins, systems may use Bloom filters. A Bloom filter built from the join keys on one side of the join is "pushed" down to the scan of the other table. This allows the system to discard rows that cannot satisfy the join condition before they are sent over the network for shuffling (a common operation in Apache Spark). Because a Bloom filter can return false positives but never false negatives, the join itself still verifies the surviving rows.
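A rough sketch of this idea is shown below. The BloomFilter class, its sizing parameters, and the customers and orders tables are all illustrative assumptions; production engines use tuned filter sizes and faster hash functions than SHA-256.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Small illustrative Bloom filter backed by a Python int used as a bitset."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key: object) -> Iterator[int]:
        # Derive several bit positions per key by salting the hash input.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: object) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: object) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


# Small side of the join: customer_id -> name.
customers = {1: "Ada", 2: "Bo", 3: "Cy"}
# Large side of the join: (order_id, customer_id).
orders = [(order_id, order_id % 50) for order_id in range(200)]

# Build the filter from the small side and push it to the scan of the large side.
bloom = BloomFilter()
for customer_id in customers:
    bloom.add(customer_id)

candidate_orders = [order for order in orders if bloom.might_contain(order[1])]

# The exact join still runs, which removes any false positives.
joined = [(oid, customers[cid]) for oid, cid in candidate_orders if cid in customers]
```

The filter is much smaller than the key set it summarizes, which is why it is cheap to ship to the nodes scanning the larger table.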

Limitations

Pushing a predicate down is not always possible. Complex user-defined functions (UDFs) are often opaque to the query optimizer, which prevents it from translating the filter into a form the storage layer can execute directly. Such predicates have to be evaluated after the data has been read.^[7]^[8]
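One way to picture this limitation is a planner that separates predicates it can describe to the storage layer from those it cannot. The sketch below is a toy model, not any real optimizer: Comparison, Udf, split_predicates, storage_scan, and run_query are hypothetical names, and the "storage layer" only understands simple column comparisons.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, List, Tuple, Union

Row = Dict[str, Any]

# Operators the hypothetical storage layer knows how to evaluate.
SUPPORTED_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
                 "<=": operator.le, "==": operator.eq}


@dataclass(frozen=True)
class Comparison:
    """Structured predicate: the planner can inspect column, op, and value."""
    column: str
    op: str
    value: Any


@dataclass(frozen=True)
class Udf:
    """Opaque predicate: the planner only sees a callable."""
    func: Callable[[Row], bool]


Predicate = Union[Comparison, Udf]


def split_predicates(predicates: List[Predicate]) -> Tuple[List[Comparison], List[Udf]]:
    pushed = [p for p in predicates if isinstance(p, Comparison)]
    residual = [p for p in predicates if isinstance(p, Udf)]
    return pushed, residual


def storage_scan(rows: List[Row], pushed: List[Comparison]) -> Iterator[Row]:
    # Only structured comparisons are evaluated at the source.
    for row in rows:
        if all(SUPPORTED_OPS[p.op](row[p.column], p.value) for p in pushed):
            yield row


def run_query(rows: List[Row], predicates: List[Predicate]) -> Tuple[List[Row], int]:
    pushed, residual = split_predicates(predicates)
    scanned = list(storage_scan(rows, pushed))
    # UDFs run in the engine, on rows that have already left storage.
    result = [row for row in scanned if all(p.func(row) for p in residual)]
    return result, len(scanned)


def name_is_palindrome(row: Row) -> bool:
    name = row["name"].lower()
    return name == name[::-1]


people = [
    {"name": "Ada", "age": 36},
    {"name": "Bob", "age": 19},
    {"name": "Cy", "age": 52},
    {"name": "Hannah", "age": 30},
]
result, rows_scanned = run_query(
    people, [Comparison("age", ">", 21), Udf(name_is_palindrome)]
)
```

Here the age comparison reduces the rows leaving storage from four to three, while the UDF can only trim the result afterwards, so it saves no transfer or memory at the source.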

References

[1] "Predicate Pushdown". QuestDB. Retrieved 8 May 2026.
[2] "Demystifying Predicate Pushdown: A Guide to Optimized Database Queries | Airbyte". airbyte.com. Retrieved 8 May 2026.
[3] "The power of predicate pushdown". www.pola.rs. Retrieved 8 May 2026.
[4] "Predicates - SQL Server". learn.microsoft.com. Retrieved 8 May 2026.
[5] Ta-Shma, Paula; Khazma, Guy; Lushi, Gal; Feder, Oshrit (10 December 2020). "Extensible Data Skipping". 2020 IEEE International Conference on Big Data (Big Data). pp. 372–382. arXiv:2009.08150. doi:10.1109/BigData50022.2020.9377740. ISBN 978-1-7281-6251-5.
[6] Kuiper, Laurens (22 January 2025). "Query Engines: Gatekeepers of the Parquet File Format". DuckDB. Retrieved 8 May 2026.
[7] Braams, Boudewijn (December 2018). Predicate Pushdown in Parquet and Apache Spark (PDF) (Masters thesis). Universiteit van Amsterdam. Retrieved 8 May 2026.
[8] Yan, Cong; Lin, Yin; He, Yeye (20 June 2023). "Predicate Pushdown for Data Science Pipelines". Proc. ACM Manag. Data. 1 (2): 136:1–136:28. doi:10.1145/3589281. Retrieved 8 May 2026.

