Apache Parquet Java Vulnerability Enables Remote Code Execution via Avro Schema

Security Advisory Summary:

A high-severity remote code execution (RCE) vulnerability has been identified in Apache Parquet Java, specifically within the parquet-avro module. Discovered by Apache contributor Gang Wu, it affects all versions up to and including 1.15.1 and can allow attackers to execute arbitrary code when a system processes a specially crafted Parquet file. The issue is fixed in version 1.15.2.

OEM	Apache
Severity	High
CVSS Score	Not Available
CVEs	CVE-2025-46762
Actively Exploited	No
Exploited in Wild	No
Advisory Version	1.0

Overview

Apache Parquet is an open-source, columnar storage format designed for efficient data processing, widely used by big data platforms and organizations engaged in data engineering and analytics.

Vulnerability Name	CVE ID	Product Affected	Severity	Fixed Version
Remote Code Execution vulnerability	CVE-2025-46762	Apache Parquet Java	High	1.15.2

Technical Summary

CVE-2025-46762 arises from insecure schema parsing logic in the parquet-avro module of Apache Parquet Java. When the application uses the “specific” or “reflect” Avro data models to read a Parquet file, malicious actors can inject specially crafted metadata into the Avro schema portion of the file.

Upon deserialization, the system may inadvertently execute code from Java classes listed in the default trusted packages (e.g., java.util), resulting in remote code execution. The vulnerability is not present when using the safer “generic” Avro model.

CVE ID	System Affected	Vulnerability Details	Impact
CVE-2025-46762	Apache Parquet Java ≤1.15.1	Insecure deserialization in the parquet-avro module allows execution of arbitrary Java classes when processing Parquet files with embedded malicious Avro schemas. The issue is exploitable only when using the “specific” or “reflect” data models, and relies on the presence of pre-approved trusted packages like java.util.	Remote Code Execution (RCE), potential supply chain compromise, unauthorized code execution.

Conditions for Exploitation:

Applications must use parquet-avro to read Parquet files.

The Avro “specific” or “reflect” deserialization models are used (not “generic”).

Attacker-supplied or untrusted Parquet files are processed by the system.

This creates significant risk in data processing environments such as Apache Spark, Flink, and Hadoop, where external Parquet files are commonly ingested.
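The conditions listed above can be combined into a small triage check. The following is a minimal sketch, not part of Apache Parquet: the class `ParquetAvroExposureCheck`, its enum, and its methods are hypothetical names introduced for illustration. Only the version threshold (1.15.1) and the three data model names come from this advisory. The first condition, use of parquet-avro, is assumed whenever a parquet-avro version string is available.

```java
final class ParquetAvroExposureCheck {
    /** Avro data models named in the advisory. */
    enum AvroModel { GENERIC, SPECIFIC, REFLECT }

    /** Parses "1.15.1" or "1.15.1-SNAPSHOT" into {1, 15, 1}; missing parts count as 0. */
    static int[] parseVersion(String version) {
        String core = version.trim().split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < out.length && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** True when the version is at or below 1.15.1, the last affected release. */
    static boolean isAffectedVersion(String version) {
        int[] v = parseVersion(version);
        int[] lastAffected = {1, 15, 1};
        for (int i = 0; i < v.length; i++) {
            if (v[i] != lastAffected[i]) {
                return v[i] < lastAffected[i];
            }
        }
        return true; // exactly 1.15.1
    }

    /** Combines the advisory's conditions: affected version, non-generic model, untrusted input. */
    static boolean isExposed(String version, AvroModel model, boolean readsUntrustedFiles) {
        return isAffectedVersion(version)
                && model != AvroModel.GENERIC
                && readsUntrustedFiles;
    }

    public static void main(String[] args) {
        System.out.println(isExposed("1.15.1", AvroModel.REFLECT, true));
        System.out.println(isExposed("1.15.2", AvroModel.REFLECT, true));
    }
}
```

A check like this only helps prioritize which services to patch first; it does not replace upgrading, since the data model in use or the trust level of input files can change over time.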

Remediation:

Upgrade to Apache Parquet Java version 1.15.2: This version addresses the vulnerability by tightening controls around trusted packages and blocking unsafe deserialization.
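To illustrate what a trusted package list means in general, the sketch below checks a class name against a comma-separated allow-list. It is a conceptual example only and is not the parquet-avro implementation: the class `TrustedPackageSketch` and its methods are hypothetical, and the real library's matching rules may differ. It does show why an empty list leaves no class trusted.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

final class TrustedPackageSketch {
    /** Splits a comma-separated package list; an empty string yields an empty set. */
    static Set<String> parsePackages(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
    }

    /** Returns the package part of a fully qualified class name ("" if none). */
    static String packageOf(String className) {
        int lastDot = className.lastIndexOf('.');
        return lastDot < 0 ? "" : className.substring(0, lastDot);
    }

    /** A class is trusted only if its exact package is on the allow-list. */
    static boolean isTrusted(String className, Set<String> trustedPackages) {
        return trustedPackages.contains(packageOf(className));
    }

    public static void main(String[] args) {
        Set<String> trusted = parsePackages("java.util, java.math");
        System.out.println(isTrusted("java.util.ArrayList", trusted));
        System.out.println(isTrusted("java.util.ArrayList", parsePackages("")));
    }
}
```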

For users unable to upgrade immediately: set the following JVM system property to an empty string, which empties the list of trusted packages (the Apache advisory describes this workaround for version 1.15.1):

-Dorg.apache.parquet.avro.SERIALIZABLE_PACKAGES=""
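Where JVM launch flags cannot be changed, the same property can be set programmatically. The following is a minimal sketch: the class `ParquetAvroHardening` and its method are hypothetical names, and only the property name comes from this advisory. It assumes the library reads the property when its classes are first loaded, which this advisory does not confirm, so the call should run as early as possible in application startup, before any parquet-avro code is touched.

```java
final class ParquetAvroHardening {
    // Property name taken from the advisory text.
    static final String SERIALIZABLE_PACKAGES_PROP =
            "org.apache.parquet.avro.SERIALIZABLE_PACKAGES";

    /**
     * Sets the trusted package list to an empty string.
     * Returns the previous value of the property, or null if it was unset.
     */
    static String disableTrustedPackages() {
        return System.setProperty(SERIALIZABLE_PACKAGES_PROP, "");
    }

    public static void main(String[] args) {
        // Assumption: this runs before any parquet-avro class is loaded.
        disableTrustedPackages();
        System.out.println(SERIALIZABLE_PACKAGES_PROP + "=\""
                + System.getProperty(SERIALIZABLE_PACKAGES_PROP) + "\"");
    }
}
```

The command-line flag remains the more dependable option, because it takes effect before any application code runs.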

Conclusion:
CVE-2025-46762 presents a significant RCE threat within big data ecosystems that use Apache Parquet Java with the parquet-avro module. Systems relying on unsafe deserialization patterns are especially at risk. Prompt patching or configuration hardening is strongly recommended to safeguard against exploitation.

References:

https://lists.apache.org/thread/vr1h7dnr4jp2f1xhzzkwzcw49qgfgsyl

https://thecyberexpress.com/apache-parquet-java-flaw-cve-2025-46762/
