Apache Parquet Java Vulnerability Enables Remote Code Execution via Avro Schema 

Security Advisory Summary:

A high-severity remote code execution (RCE) vulnerability has been identified in Apache Parquet Java, specifically within the parquet-avro module. Discovered by Apache contributor Gang Wu, it affects all versions up to and including 1.15.1 and can allow attackers to execute arbitrary code when a system processes a specially crafted Parquet file. The issue is fixed in version 1.15.2.

OEM: Apache
Severity: High
CVSS Score: Not Available
CVEs: CVE-2025-46762
Actively Exploited: No
Exploited in Wild: No
Advisory Version: 1.0

Overview 

Apache Parquet is an open-source, columnar storage format designed for efficient data processing, widely used by big data platforms and organizations engaged in data engineering and analytics.

Vulnerability Name: Remote Code Execution vulnerability
CVE ID: CVE-2025-46762
Product Affected: Apache Parquet Java
Severity: High
Fixed Version: 1.15.2

Technical Summary 

CVE-2025-46762 arises from insecure schema parsing logic in the parquet-avro module of Apache Parquet Java. When the application uses the “specific” or “reflect” Avro data models to read a Parquet file, malicious actors can inject specially crafted metadata into the Avro schema portion of the file.

Upon deserialization, the system may inadvertently execute code from Java classes listed in the default trusted packages (e.g., java.util), resulting in remote code execution. The vulnerability is not present when using the safer “generic” Avro model. 
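To illustrate why the contents of a trusted-package list matter, the following is a minimal sketch of a package allowlist gate. It is not the parquet-avro implementation: the class `TrustedPackageCheck`, the method `isTrusted`, and the prefix-matching rule are all assumptions made for illustration. Only the idea that a list of trusted packages (for example java.util) governs which classes may be loaded comes from the advisory.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a trusted-package gate; NOT parquet-avro's actual code.
public class TrustedPackageCheck {

    // Returns true when className sits in one of the trusted packages
    // (assumed rule: the package itself or any of its subpackages).
    public static boolean isTrusted(String className, List<String> trustedPackages) {
        for (String pkg : trustedPackages) {
            if (!pkg.isEmpty() && className.startsWith(pkg + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> trusted = Arrays.asList("java.util");
        List<String> none = Collections.emptyList();

        // A class inside a trusted package passes the gate.
        System.out.println(isTrusted("java.util.ArrayList", trusted));   // true
        // A class outside the list is rejected.
        System.out.println(isTrusted("com.example.Payload", trusted));   // false
        // With an empty list, nothing passes at all.
        System.out.println(isTrusted("java.util.ArrayList", none));      // false
    }
}
```

The last case shows the point of the sketch: under this assumed model, an empty list rejects every class name, whereas a broad default such as java.util admits every class in that package.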

CVE ID: CVE-2025-46762
System Affected: Apache Parquet Java ≤ 1.15.1
Vulnerability Details: Insecure deserialization in the parquet-avro module allows execution of arbitrary Java classes when processing Parquet files with embedded malicious Avro schemas. The issue is exploitable only when using the “specific” or “reflect” data models, and relies on the presence of pre-approved trusted packages like java.util.
Impact: Remote Code Execution (RCE), potential supply chain compromise, unauthorized code execution.

Conditions for Exploitation: 

  • Applications must use parquet-avro to read Parquet files. 
  • The Avro “specific” or “reflect” deserialization models are used (not “generic”). 
  • Attacker-supplied or untrusted Parquet files are processed by the system. 

This creates significant risk in data processing environments such as Apache Spark, Flink, and Hadoop, where external Parquet files are commonly ingested. 
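The conditions above can be turned into a rough triage check when auditing services. The sketch below is hypothetical: the class `ParquetAvroExposure`, its method names, and the use of a plain string for the data model are assumptions for illustration, not part of any Parquet API. The version boundary (1.15.1 as the last affected release) and the three conditions are taken from this advisory.

```java
// Hypothetical triage helper built from the advisory's stated conditions.
public class ParquetAvroExposure {

    // Parses "major.minor.patch" into three ints. Any qualifier after a '-'
    // (for example "-SNAPSHOT") is ignored, which is a simplification.
    static int[] parseVersion(String version) {
        String core = version.split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // True when the version is 1.15.1 or older (the affected range per the advisory).
    public static boolean isAffectedVersion(String version) {
        int[] v = parseVersion(version);
        int[] lastAffected = {1, 15, 1};
        for (int i = 0; i < 3; i++) {
            if (v[i] != lastAffected[i]) {
                return v[i] < lastAffected[i];
            }
        }
        return true; // exactly 1.15.1
    }

    // All three conditions must hold for the service to count as exposed.
    public static boolean isExposed(String version, String dataModel, boolean readsUntrustedFiles) {
        boolean riskyModel = "specific".equalsIgnoreCase(dataModel)
                || "reflect".equalsIgnoreCase(dataModel);
        return isAffectedVersion(version) && riskyModel && readsUntrustedFiles;
    }

    public static void main(String[] args) {
        System.out.println(isExposed("1.15.1", "reflect", true));  // true
        System.out.println(isExposed("1.15.1", "generic", true));  // false
        System.out.println(isExposed("1.15.2", "specific", true)); // false
    }
}
```

A check like this only helps prioritise work. A service that reports false because it reads trusted files today can become exposed as soon as its inputs change, so upgrading remains the safer course.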

Remediation

  • Upgrade to Apache Parquet Java version 1.15.2: This version addresses the vulnerability by tightening controls around trusted packages and blocking unsafe deserialization. 
  • For users unable to upgrade immediately: set the following JVM system property to an empty value so that no packages are treated as trusted for deserialization:

-Dorg.apache.parquet.avro.SERIALIZABLE_PACKAGES=""
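Where the JVM launch flags cannot easily be changed, the same property can in principle be set programmatically at startup. The sketch below is a minimal example under one important assumption: that the library reads the property when its classes are first loaded, so the call must run before any parquet-avro class is touched. That timing is not confirmed by this advisory and should be verified against the version in use. The class name `DisableTrustedPackages` is hypothetical; the property name is the one given above.

```java
// Hypothetical startup hook; only the property name comes from the advisory.
public class DisableTrustedPackages {

    static final String PROP = "org.apache.parquet.avro.SERIALIZABLE_PACKAGES";

    // Sets the trusted-package property to the empty string.
    // Assumption: this runs before any parquet-avro class is loaded.
    public static void apply() {
        System.setProperty(PROP, "");
    }

    public static void main(String[] args) {
        apply();
        // Application code that reads Parquet files would start only after this point.
        System.out.println(PROP + "=\"" + System.getProperty(PROP) + "\"");
    }
}
```

The command-line flag is the more robust of the two options, because it takes effect before any application code runs and does not depend on class-loading order.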

Conclusion: 
CVE-2025-46762 presents a significant RCE threat within big data ecosystems that use Apache Parquet Java with the parquet-avro module. Systems relying on unsafe deserialization patterns are especially at risk. Prompt patching or configuration hardening is strongly recommended to safeguard against exploitation. 
