Need to load historical dex trades for backtesting project

Need for Historical DEX Trades | Users Seek Efficient Backtesting Solutions

By

Lucas Fernandez

May 9, 2026, 06:46 AM

Edited By

Olivia Chen

Growing demand for historical decentralized exchange (DEX) trade data is pushing developers to look for practical ways to backtest machine learning projects. Loading data at this scale into storage is proving difficult, and it has set off a heated debate about the best practices and tools available.

Context and Challenges

An effort is underway to collect every trade from Uniswap, Curve, PancakeSwap, and Raydium since 2021 and load it into Snowflake. The volume is daunting: some estimate that backfilling data from an Ethereum archive node could take several weeks. Existing options such as subgraphs do not expose the necessary fields, which has frustrated the community.

As one user pointed out, "The blockchain literally stores all the history. Just pull from RPC." This highlights the reliance on raw data extraction methods to access historical trades.
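One way to make the "just pull from RPC" suggestion concrete is to split the block range into windows and issue one `eth_getLogs` request per window. The sketch below only builds the JSON-RPC payloads and sends nothing over the network. The block numbers, the window size, and the all-zero event topic are placeholders chosen for illustration, not values from the discussion; a real backfill would substitute the swap event topic of the target DEX and a window size the RPC provider accepts.

```python
from typing import Iterator, Optional


def block_ranges(start: int, end: int, step: int) -> Iterator[tuple[int, int]]:
    """Yield inclusive (from_block, to_block) windows covering start..end."""
    if step <= 0:
        raise ValueError("step must be positive")
    lo = start
    while lo <= end:
        hi = min(lo + step - 1, end)
        yield lo, hi
        lo = hi + 1


def get_logs_payload(
    from_block: int,
    to_block: int,
    topic0: str,
    request_id: int = 1,
    address: Optional[str] = None,
) -> dict:
    """Build one eth_getLogs JSON-RPC request body for a block window."""
    log_filter = {
        "fromBlock": hex(from_block),  # JSON-RPC quantities are hex strings
        "toBlock": hex(to_block),
        "topics": [topic0],            # first topic selects the event signature
    }
    if address is not None:
        log_filter["address"] = address  # optionally restrict to one contract
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [log_filter],
    }


# Placeholder topic (assumption): replace with the real event signature hash.
PLACEHOLDER_TOPIC = "0x" + "00" * 32

# Hypothetical block span and window size, for illustration only.
windows = list(block_ranges(100, 349, 100))
payloads = [
    get_logs_payload(lo, hi, PLACEHOLDER_TOPIC, request_id=i)
    for i, (lo, hi) in enumerate(windows, start=1)
]
```

The number of windows is also the number of requests, so `len(windows)` multiplied by the observed time per request gives a rough lower bound on how long a backfill would take.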

Key Discussion Points

The following themes emerged as users shared their experiences and suggestions:

  1. Data Extraction Complexity

    Many participants argue that relying on standard RPC queries for mass data retrieval can be time-consuming. It's clear that extracting data cleanly into a usable format is a priority.

  2. File Format Decisions

    There is an ongoing debate about whether to use Parquet or JSONL formats. Users have varied preferences, pointing out the importance of choosing based on how the data will be utilized. One comment noted, "How to store depends on how you're going to use it."

  3. Third-party Solutions

    Some users are curious about available vendors that provide historical data as columnar dumps directly to S3, avoiding the hassle of building an extraction layer themselves.
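The format question in point 2 largely comes down to row layout versus column layout. The sketch below uses only the standard library to contrast the two: JSONL keeps one full record per line, while a columnar layout (the idea behind Parquet) groups all values of a field together. The trade records and field names are hypothetical, and writing actual Parquet files would need a third-party library, which is not shown here.

```python
import json

# Hypothetical trade records; field names and values are illustrative only.
trades = [
    {"block_number": 100, "pool": "pool_a", "amount_in": 5, "amount_out": 7},
    {"block_number": 101, "pool": "pool_b", "amount_in": 2, "amount_out": 3},
    {"block_number": 102, "pool": "pool_a", "amount_in": 9, "amount_out": 4},
]


def to_jsonl(rows: list[dict]) -> str:
    """Serialize rows as JSONL: one JSON object per line, easy to append to."""
    return "".join(json.dumps(row, sort_keys=True) + "\n" for row in rows)


def from_jsonl(text: str) -> list[dict]:
    """Parse JSONL back into row dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def to_columns(rows: list[dict]) -> dict[str, list]:
    """Pivot row dicts into {field: [values]}, a column-oriented layout."""
    if not rows:
        return {}
    return {key: [row[key] for row in rows] for key in rows[0]}


# Row-oriented read: every record is parsed in full to sum a single field.
jsonl_text = to_jsonl(trades)
row_total = sum(row["amount_in"] for row in from_jsonl(jsonl_text))

# Column-oriented read: only the one field that is needed gets touched.
columns = to_columns(trades)
col_total = sum(columns["amount_in"])
```

Both totals agree, but the access patterns differ. That is the point of the comment "How to store depends on how you're going to use it": a backtest that scans a few fields across many trades tends to favor a columnar layout, while append-heavy ingestion is simpler with line-oriented records.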

User Insights

The exchange among participants reflected a blend of optimism and apprehension:

"That's a lot of data to get from RPC. Will take a lot of time."

This sentiment captures the ongoing struggle for those seeking quick access to large datasets.

What is the most efficient way to handle such vast quantities of trades? Opinions varied, but the need for a more streamlined approach was clear.

Key Points

  • Overwhelming Data Volume: Efficient extraction is critical, given the complexity of data from multiple DEX platforms.

  • Choosing Formats: The choice between Parquet and JSONL remains hotly contested among users.

  • Outsourcing Data: Interest in third-party vendors offering straightforward access to historical data is rising.

In this evolving context, the DEX community continues to push for solutions that minimize hassle and maximize efficiency in trading data retrieval.

Breaking Down the Road Ahead

There's a solid chance that advancements in data handling will emerge in the DEX space as developers respond to the pressing need for historical trade access. Given the current issues with data extraction, experts estimate around 70% of participants might pivot towards third-party solutions within the next year. This could significantly streamline processes and reduce the time it takes to retrieve massive datasets. As the competition heats up among DEX platforms, more robust solutions will likely gain traction, fostering a sense of urgency among developers to innovate or adapt quickly.

Echoes of the Past in Modern Times

A fitting historical parallel can be drawn to the early days of the internet when simple file transfer protocols battled for dominance. Just as some companies struggled with basic data transfers while others streamlined systems for efficiency, today's DEX community faces a similar crossroads. The push for efficiency in accessing historical trades mirrors that formative era, where those who could adapt to change thrived while others fell behind. By embracing collaboration and new approaches in data management, the crypto community can harness lessons from the past to navigate today's challenges.