
Feb 2025 - Present
6min read
Solana Wallet Profitability Analyzer
Copy-trading in cryptocurrencies involves programmatically mirroring the trades of another promising trader, often combined with automated selling strategies. While there are tools that provide statistics on wallets, none offer the ability to retroactively simulate trading strategies on historical data to evaluate profitability.
This project focuses on building a system and data pipeline capable of parsing large volumes of transactions directly from the Solana blockchain in real time, storing processed trades in a database, and using that data to analyze wallet performance. On top of this, it simulates multiple strategy profiles on high-resolution historical data, allowing users to assess whether a wallet is worth copy-trading and which strategy yields the highest return on investment.
How it works
The first stage of the pipeline focuses on collecting and processing raw data from the blockchain for later analysis. The system uses Solana RPC methods to fetch data both in real time and historically, extracting supported trades and other relevant information, which is then stored in a database.
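As a rough illustration of the fetching step, the sketch below builds the JSON-RPC request bodies for two standard Solana RPC methods, getSignaturesForAddress and getTransaction. The write-up does not say which RPC methods the pipeline actually calls, so this pairing is an assumption, and the helper names (rpc_request, signatures_request, transaction_request) are hypothetical. No network call is made; the payloads would be POSTed to an RPC endpoint of your choice.

```python
from typing import Any, Dict, List, Optional


def rpc_request(method: str, params: List[Any], request_id: int = 1) -> Dict[str, Any]:
    """Wrap a method name and params in a JSON-RPC 2.0 envelope."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def signatures_request(wallet: str, limit: int = 1000,
                       before: Optional[str] = None) -> Dict[str, Any]:
    """Request one page of transaction signatures for a wallet.

    Passing the oldest signature of the previous page as `before`
    walks backwards through the wallet's history.
    """
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    config: Dict[str, Any] = {"limit": limit}
    if before is not None:
        config["before"] = before
    return rpc_request("getSignaturesForAddress", [wallet, config])


def transaction_request(signature: str) -> Dict[str, Any]:
    """Request the parsed contents of a single transaction."""
    config = {"encoding": "jsonParsed", "maxSupportedTransactionVersion": 0}
    return rpc_request("getTransaction", [signature, config])
```

A historical backfill would repeat signatures_request with the before cursor until a page comes back empty, then issue one transaction_request per signature and hand the results to the parser.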
The second stage analyzes the parsed trades of a specific trader to determine performance statistics. It calculates metrics such as profit, return on investment, win rate, and trading activity. This is similar to what most public and private analytics platforms offer, but these tools typically stop at basic statistics and do not provide deeper strategic analysis.
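The metrics named above can be sketched in a few lines. This is a minimal example under simplifying assumptions, not the project's actual calculation: the Trade record and wallet_stats function are hypothetical, amounts are denominated in SOL, profit is counted per token as SOL received minus SOL spent, and any position that was never sold is treated as a full loss.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Trade:
    token: str         # token mint address or symbol
    side: str          # "buy" or "sell"
    sol_amount: float  # SOL spent (buy) or received (sell)


def wallet_stats(trades: Iterable[Trade]) -> Dict[str, float]:
    """Aggregate profit, ROI, win rate, and trade count for one wallet."""
    spent: Dict[str, float] = defaultdict(float)
    received: Dict[str, float] = defaultdict(float)
    trade_count = 0
    for t in trades:
        trade_count += 1
        if t.side == "buy":
            spent[t.token] += t.sol_amount
        elif t.side == "sell":
            received[t.token] += t.sol_amount
        else:
            raise ValueError(f"unknown side: {t.side!r}")

    tokens = set(spent) | set(received)
    # Per-token profit; a token counts as a win if it returned more than it cost.
    pnl = {tok: received[tok] - spent[tok] for tok in tokens}
    invested = sum(spent.values())
    profit = sum(pnl.values())
    wins = sum(1 for p in pnl.values() if p > 0)
    return {
        "profit": profit,
        "roi": profit / invested if invested else 0.0,
        "win_rate": wins / len(tokens) if tokens else 0.0,
        "trade_count": float(trade_count),
    }
```

For example, a wallet that buys token A for 1 SOL and sells it for 3, then buys token B for 2 SOL and sells it for 1, ends with a profit of 1 SOL on 3 SOL invested and a win rate of 50%.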
The third and most advanced stage involves simulating strategies on the trader's historical trades. The system processes all tokens the trader interacted with and reconstructs historical price charts using the trader's trade history. This enables simulations that apply strategy rules. These simulations can exclude trades based on conditions, account for price impact and slippage, account for trades delayed by network congestion or competition, and apply automatic sell strategies to evaluate whether different strategies would have been profitable. The system supports simulating multiple strategies at scale.
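To make the idea concrete, here is a minimal sketch of replaying one copied trade against a reconstructed price series. It is an illustration under stated assumptions rather than the project's simulator: the Strategy fields and the simulate function are hypothetical, the price series is a time-sorted list of (timestamp, price) pairs, the delay is modeled as a fixed number of seconds, slippage is a flat fraction applied to both fills, and the automatic sell rule is a simple take-profit / stop-loss pair.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Strategy:
    take_profit: float  # 0.5 means sell once price is 50% above entry
    stop_loss: float    # 0.2 means sell once price is 20% below entry
    delay_s: float      # seconds between the trader's buy and the copied buy
    slippage: float     # fractional cost applied to the entry and exit fills


def simulate(prices: Sequence[Tuple[float, float]],
             trader_buy_ts: float,
             strategy: Strategy) -> Optional[float]:
    """Return the ROI of one copied trade, or None if no entry was possible."""
    entry_ts = trader_buy_ts + strategy.delay_s
    # The copied buy fills at the first price point at or after the delayed time.
    entry_idx = next(
        (i for i, (ts, _) in enumerate(prices) if ts >= entry_ts), None
    )
    if entry_idx is None:
        return None
    entry_price = prices[entry_idx][1] * (1 + strategy.slippage)

    # If neither rule triggers, exit at the last known price.
    exit_price = prices[-1][1]
    for _, price in prices[entry_idx + 1:]:
        hit_tp = price >= entry_price * (1 + strategy.take_profit)
        hit_sl = price <= entry_price * (1 - strategy.stop_loss)
        if hit_tp or hit_sl:
            exit_price = price
            break
    exit_price *= 1 - strategy.slippage
    return exit_price / entry_price - 1
```

Running many Strategy instances over the same price series and summing the per-trade results gives a way to compare strategy profiles for a wallet; excluding trades by condition would amount to filtering the trader's buys before calling simulate.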
Architecture
The system is built as a data pipeline designed for high-throughput blockchain ingestion, storage, and analysis. It separates real-time data collection, parsing, storage, and simulation into independent components to maintain scalability and prevent bottlenecks as volume grows.
Version 1 was a Python-based prototype that sourced transaction data through a third-party aggregator's private API. While functional, this approach introduced dependency risks and resulted in slow fetching times of 30-60 minutes per wallet. Version 2 evolved into a fully on-chain parser, still in Python, directly ingesting transactions from the Solana blockchain and storing them for reuse. These changes reduced data retrieval times to under 10 seconds.
A future Version 3 is planned to focus on performance and memory efficiency by rewriting the computationally expensive simulation phase in a compiled language (currently considering Rust). This upgrade is motivated by the increasing data volume processed by the pipeline and the need for faster simulations at scale. Version 3 also aims to support EVM-based blockchains.
Results & Takeaways
The system proves highly effective at extracting and transforming raw blockchain data into meaningful trading insights. It enables fast performance evaluation of wallets, and more importantly, makes it possible to retroactively simulate trading strategies with high accuracy, providing insights that current analytics tools do not offer.
Key learnings from this project include:
Data engineering on blockchain data: Building a high-throughput pipeline capable of parsing, normalizing, and storing large volumes of on-chain transactions in a storage- and performance-friendly manner.
System evolution & architecture design: Iteratively improving the codebase across versions by removing external dependencies and moving toward fully on-chain parsing for speed and reliability.
Realistic strategy simulation: Implementing accurate historical replays that account for slippage, price impact, competition, network delays, and conditional trading logic.
Performance optimization: Identifying Python bottlenecks and the need for a compiled language (such as Rust) in the next version to support scaling and faster simulations.
Overall, this project taught me how to design robust data pipelines for blockchain analytics, model realistic market conditions for strategy simulation, and evolve systems as performance requirements grow. It showed the importance of architectural foresight, especially when early design decisions must scale to hundreds of millions of transactions.
