🦾 AI market making
How does Reform leverage AI?
In the evolving world of cryptocurrency and market making, artificial intelligence (AI) has emerged as a transformative trend, offering opportunities for innovation and efficiency.
At the forefront of this revolution is Reform DAO, a market making DAO that harnesses the capabilities of AI to redefine the industry.
This report dives into the mechanisms through which Reform uses AI technologies to gain predictive insights and advantages in crypto trading. Through an in-depth exploration of classification, regression, and various machine learning models, we elaborate on the techniques that enable Reform to forecast market trends with precision. We also examine our methods for using AI to train algorithms that navigate the complexities of the crypto market.
Through this report, we aim to provide a comprehensive understanding of Reform's AI utilization, setting the standard for AI integration in the industry and beyond.
Reform DAO uses data-science software that applies artificial intelligence, machine learning, and neural networks to predict variability across key metrics.
It provides automatic classification and regression predictions in the crypto market at different time scales (seconds, minutes, hours, days). More specifically, it uses an innovative, in-house AI voting algorithm to automatically label the predicted prices in client-defined classes (Fibonacci, grid, or custom methods) that indicate the level of increase/decrease of cryptocurrencies, as sketched below. In addition, Reform uses the extracted predictions to suggest cryptocurrency conversions that maximize return.
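As a minimal illustration of the grid case (with hypothetical thresholds and class names, not Reform's production configuration), predicted price changes can be bucketed into classes like so:

```python
import numpy as np

def grid_labels(pct_changes, edges=(-5.0, -1.0, 1.0, 5.0)):
    """Map predicted % price changes onto client-defined grid classes.

    `edges` are hypothetical boundaries (in %); a Fibonacci or custom
    scheme would substitute a different boundary sequence.
    """
    names = ["strong drop", "drop", "flat", "rise", "strong rise"]
    return [names[np.searchsorted(edges, c)] for c in pct_changes]

print(grid_labels([-7.2, -0.3, 2.4]))  # ['strong drop', 'flat', 'rise']
```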
With past data obtained from different sources, the algorithms are dynamically trained and their performance is evaluated. They are then applied to real-time data to predict the level of change in the price of a crypto token.
Crypto prices can be predicted (under normal or anomalous behavior) using a suite of ML models, both linear and non-linear, with moving, sliding, or fixed windows. The selection of the appropriate model setup is case-dependent. Results are presented in fully tailored dashboards with easy-to-read charts, live graphs, and simulations.
Supervised classification stands out as a powerful tool for predictive analytics, particularly when forecasting the prices of selected coins over a specified timeframe.
At the core of this approach is the training phase, where a collection of diverse machine learning models is fed past data, allowing them to discern patterns and associations that are crucial for making accurate predictions. This process involves assigning the relevant price-range labels to given points in time for the chosen cryptocurrencies, effectively setting a foundation for the model to understand the dynamics at play.
Once the initial training phase is completed, the models' robustness is put to the test through a rigorous evaluation process, which involves both previously encountered data and new, unseen datasets.
This crucial step ensures that the models are not only successful at recalling past scenarios but also adaptive to new situations, thereby enhancing their predictive capabilities. This empowers Reform to make informed decisions, leveraging the insights gathered from the models' analysis to navigate the complexities of cryptocurrency management with greater confidence and strategic acumen.
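A minimal sketch of this train-then-evaluate loop, with random placeholders standing in for engineered features and the price-range class labels described above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Placeholder data: in practice the features are engineered from past
# market data and the labels come from the price-range labelling step.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))     # stand-in for engineered features
y = rng.integers(0, 5, size=1000)  # stand-in for price-range class labels

# Hold out later data unseen during training to test adaptivity, not recall.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, shuffle=False)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```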
Regression analysis is used to understand the relationship between independent variables, such as the prices of selected cryptocurrencies, and a dependent variable or outcome. Supervised regression models are a cornerstone technique for predicting continuous values, understanding trends, and making forecasts.
Unlike classification, where the focus is on predicting discrete price range labels, regression aims to predict actual cryptocurrency price fluctuations. The training phase of a regression model involves feeding the system with a comprehensive dataset from the past. This process allows the model to intricately learn the underlying relationships between input variables and the continuous target variable, thereby equipping the model with the ability to make accurate predictions based on new inputs.
Following the training phase, the efficacy of the regression model is thoroughly assessed through a validation process, which tests the model's performance against both known past data and new, unseen data to ensure its predictive reliability and robustness. This step is crucial for confirming that the model can generalize well and is not overfitted to the training dataset. Upon successful validation, the regression model is then capable of predicting upcoming values, such as the future performance of a coin or trend over a given time frame.
This predictive capability is instrumental for Reform algorithms, enabling them to make informed decisions on exchanges between different cryptocurrencies.
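The same loop, sketched for regression with hypothetical lag features and a chronological train/validation split so that later, unseen data tests generalization:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

# Synthetic price series; in practice this is a historical dataset.
prices = 100.0 + np.cumsum(np.random.default_rng(1).normal(size=600))

# Lag features: predict the next price from the previous k prices.
k = 5
X = np.column_stack([prices[i:len(prices) - k + i] for i in range(k)])
y = prices[k:]

# Chronological split: validate on later, unseen data to detect overfitting.
split = int(0.8 * len(X))
model = Ridge(alpha=1.0).fit(X[:split], y[:split])
print("MAE on unseen data:",
      mean_absolute_error(y[split:], model.predict(X[split:])))
```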
The correlation matrix is another feature for inspecting the dataset. It provides the correlation coefficients between variables (over the whole dataset or a part of it), which are then presented as a heatmap matrix.
Agglomerative clustering is used to obtain groups of variables with similarly high correlation values. The added value is twofold: firstly, the analyst can easily screen the different behaviours of the cryptocurrencies by inspecting one per cluster; secondly, by changing the time period, they can assess the impact of a newly launched cryptocurrency on the correlations of the existing ones. Dual heatmaps are provided (are coins correlated or anti-correlated?): one with clustered cryptocurrencies and one with a fixed order, to support different requests.
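A compact illustration of the dual heatmaps, assuming a hypothetical frame of token prices (column names are placeholders; seaborn's clustermap performs the agglomerative grouping):

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical price frame: one column per token (names are placeholders).
rng = np.random.default_rng(2)
prices = pd.DataFrame(100 + rng.normal(size=(500, 6)).cumsum(axis=0) * 0.5,
                      columns=[f"coin_{c}" for c in "abcdef"])
corr = prices.pct_change().dropna().corr()  # pairwise correlation coefficients

sns.heatmap(corr, center=0)     # fixed-order view: correlated vs anti-correlated
sns.clustermap(corr, center=0)  # agglomeratively clustered view
plt.show()
```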
Unlike traditional predictive models that merely capture correlations, causality-focused ML models strive to reveal the underlying mechanisms that drive relationships between variables, offering a deeper and more actionable level of insight.
By employing advanced algorithms to detect and account for confounders (variables that can misleadingly suggest or obscure true causal relationships), these models can isolate genuine causal effects, thereby providing a more robust foundation for decision-making. This approach is not just about predicting outcomes but about understanding the 'why' and 'how' of phenomena. It enables the operators of algorithms to implement strategies informed by a nuanced comprehension of causality, leading to more effective interventions and policies that target root causes rather than surface correlations. A confounder is an extraneous variable whose presence affects the variables being studied, so that the results do not reflect their actual relationship. Reform uses an innovative ML approach to search for confounders.
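Reform's confounder search itself is not detailed in this report, but a minimal synthetic example shows why confounders matter: two tokens driven by a common market factor appear strongly correlated until that factor is controlled for.

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(size=2000)            # hypothetical confounder (a market-wide factor)
x = z + 0.3 * rng.normal(size=2000)  # two tokens driven mostly by z
y = z + 0.3 * rng.normal(size=2000)

print("raw corr(x, y):        ", np.corrcoef(x, y)[0, 1])  # looks strongly related

# Control for z: regress it out of both series and correlate the residuals.
rx = x - np.polyval(np.polyfit(z, x, 1), z)
ry = y - np.polyval(np.polyfit(z, y, 1), z)
print("corr controlling for z:", np.corrcoef(rx, ry)[0, 1])  # near zero
```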
Reform uses a suite of different approaches for fault/anomaly detection that can be used appropriately in a complementary (hybrid) manner to build case-specific rules. The first approach (faults) offers results based on the entire set of variables and records and automatically calculates thresholds that classify whether a record is a fault or not without user intervention. In addition, single variable anomalies are detected based on (linear, non-linear, and ensemble model) predictions of what the upcoming value should be. If the measurement differs unexpectedly from the prediction, then this measurement is flagged as an "anomaly".
The identification of anomalies is acknowledged as an important structural element of the subsequent analysis, thus Reform has developed a novel meta-regressor, namely Regressor Voting, where the decision of whether a price prediction or fair price point is an anomaly or not is the outcome of a weighted voting procedure from different rival regressor models. This approach increases the confidence in the results predicted.
Anomaly for a token: a price point is flagged when the difference between the actual and predicted values exceeds three standard deviations (3·SD) for the majority of ML models.
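A minimal sketch of this voting rule (equal weights are assumed here for simplicity; the Regressor Voting procedure described above is weighted):

```python
def is_anomaly(actual, predictions, residual_sds):
    """Vote across rival regressors: a point is an anomaly when
    |actual - predicted| > 3*SD for the majority of models.
    Equal weights are assumed; the production voting is weighted."""
    votes = [abs(actual - p) > 3.0 * sd
             for p, sd in zip(predictions, residual_sds)]
    return sum(votes) > len(votes) / 2

# Three hypothetical rival models, each with its own residual SD:
print(is_anomaly(105.0, predictions=[100.1, 99.8, 100.4],
                 residual_sds=[1.2, 1.0, 1.5]))  # True
```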
Reform uses a novel precursor event forecasting model based on data mining. It uses data from the period before identified events as training data and predicts the system's future behavior. The output is presented as a unitless signal that increases substantially before an upcoming event for the selected cryptocurrency. This can work in a fully unsupervised manner (no validated failure data are required). Where previous events are available, classification ML approaches are used to predict future events.
Reform also uses a novel price prediction forecasting model based on data mining for its algorithms. It uses data from the period before identification (actual coin and trading/candle data) as training data and predicts future price behavior.
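One way to produce such an unsupervised, unitless precursor signal (an illustrative stand-in, not Reform's proprietary model) is to score sliding windows with an anomaly detector fitted only on normal, pre-event data:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(4)
series = rng.normal(size=1200)
series[1000:] += np.linspace(0, 4, 200)  # hypothetical drift preceding an event

# Score sliding windows with a detector fitted on normal, pre-event data.
win = 20
windows = np.lib.stride_tricks.sliding_window_view(series, win)
iso = IsolationForest(random_state=0).fit(windows[:600])

signal = -iso.score_samples(windows)  # unitless; rises before the event
print(signal[:3].round(2), signal[-3:].round(2))
```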
The F-score or F-measure is a measure of predictive performance. It is calculated from the precision and recall of the test: precision is the fraction of positive predictions that are correct, and recall is the fraction of true instances that are correctly identified. Precision is also known as positive predictive value, and recall is also known as sensitivity in diagnostic binary classification.
The F1 score is the harmonic mean of precision and recall; it thus represents both symmetrically in one metric. The more generic $F_\beta$ score applies additional weights, valuing one of precision or recall more than the other.
The highest possible value of an F-score is 1.0, indicating perfect precision and recall, and the lowest possible value is 0, when either precision or recall is zero. The F-score is used to check the accuracy of the predictions made by the AI.
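For reference, F1 = 2 · (precision · recall) / (precision + recall). A quick check with scikit-learn on hypothetical labels:

```python
from sklearn.metrics import f1_score, fbeta_score

y_true = [1, 1, 0, 1, 0, 0, 1, 0]  # hypothetical "price rose" labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model predictions

print(f1_score(y_true, y_pred))               # harmonic mean of precision and recall
print(fbeta_score(y_true, y_pred, beta=2.0))  # beta > 1 favours recall over precision
```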
Information that can be used to predict the fair price (predicted mid-price) can be obtained from the order book and trades of the coin/token.
Predicting the mid-price, spread, and volatility from the limit order book's bid and ask orders using machine learning involves several steps.
First, the limit order book data needs to be pre-processed, which typically includes cleaning, normalization, and feature engineering. Features could include quantities derived from the order book stream dataset.
Next, machine learning models can be trained on historical data to predict the mid-price (the average of the best bid and ask), the spread (the difference between the best bid and ask), and volatility (the standard deviation of mid-price movements over a given period). It is crucial to split the data into training and testing sets and to use appropriate evaluation metrics to assess model performance. The role of unseen data is carefully studied to keep the predictions unbiased. Additionally, techniques like cross-validation can help optimize model accuracy.
Finally, the trained model can be deployed to make real-time predictions on new limit order book data, with periodic retraining to adapt to changing market conditions.
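A minimal sketch of the feature-engineering step on a hypothetical top-of-book stream (synthetic prices stand in for real exchange data):

```python
import numpy as np
import pandas as pd

# Hypothetical top-of-book stream; real inputs would come from exchange feeds.
rng = np.random.default_rng(5)
mid = 100 + rng.normal(size=5000).cumsum() * 0.01
book = pd.DataFrame({"best_bid": mid - 0.05, "best_ask": mid + 0.05})

book["mid"] = (book.best_bid + book.best_ask) / 2            # mid-price
book["spread"] = book.best_ask - book.best_bid               # spread
book["volatility"] = book["mid"].diff().rolling(100).std()   # SD of mid moves

# Chronological split keeps the test portion truly unseen, as stressed above.
train, test = book.iloc[:4000], book.iloc[4000:]
```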
In addition to the above, determining precursors of the spread involves identifying factors or features in the limit order book data that precede changes in the spread. Again, machine learning models can be trained to analyse historical data and identify patterns or relationships between these precursors and spread movements. Feature-importance techniques can help ascertain which features are most influential in predicting spread changes, and time-series analysis techniques can capture temporal dependencies and dynamics in the data. By leveraging these precursors, machine learning models can provide insight into the drivers of spread dynamics and improve predictive accuracy.
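As an illustration of the feature-importance step, a tree ensemble fitted on hypothetical precursor features ranks which ones drive the next-step spread (synthetic data; feature names are placeholders):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical precursor features; names are illustrative placeholders.
names = ["imbalance", "depth", "trade_rate", "volatility", "last_spread"]
rng = np.random.default_rng(6)
X = rng.normal(size=(3000, 5))
future_spread = 0.8 * X[:, 0] + 0.2 * X[:, 3] + 0.05 * rng.normal(size=3000)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, future_spread)
for name, imp in zip(names, model.feature_importances_):
    print(f"{name}: {imp:.2f}")  # which precursors drive spread changes most
```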
Our business model as a market maker is to profit from the bid-ask spread. Reform provides prices and sizes at which other participants can buy and sell. By doing so, Reform trades and profits when it completes both a buy and a sell trade; the profit comes from the spread between the bid and ask prices that the algorithm quoted. On the other hand, Reform is left holding a position, and holding a position bears risk: after completing only the buy or sell side of a trade, the position could move against the market maker, resulting in a loss.
For that reason, the new algorithms aim to buy below a fair price and sell above that fair price, whatever that fair price may be. For this, a good fair-price calculation/prediction is essential. The ability to predict the next mid-price (or an alternative measure of the price) allows the algorithm to adjust its prices in time and makes it more likely to end up with a position that does not move against us.
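A toy sketch of the quoting logic this implies, with a hypothetical half-spread parameter:

```python
def quotes(fair_price, half_spread=0.001):
    """Quote a bid below and an ask above the predicted fair price, so a
    completed round trip (one buy plus one sell) earns the spread.
    `half_spread` is a hypothetical fraction, not a live parameter."""
    return fair_price * (1 - half_spread), fair_price * (1 + half_spread)

bid, ask = quotes(fair_price=100.0)
print(bid, ask)  # 99.9 100.1 -> 0.2 per unit on a completed round trip
```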
All of these AI-generated datasets give the algorithms the insights they need to be more effective. The machine learning results are delivered in a tailored format, and all conducted analytics are sent to the algorithms for execution.
TBA