Labeling Method
How we create the Label (target prediction)
The primary component of the label is the change in price, that is, whether the price rises or falls.
The second component is trading risk. The price can always deviate significantly from our forecast, so the label also accounts for risk in order to limit potential losses.
We provide the following formula to calculate the label:
where:
r is the return rate of the price.
Pbm is the probability, estimated at time t, of the price occurring at time t+n. We use Brownian motion to estimate this value.
V is the change in price volatility, calculated by dividing the next short-term standard deviation by the previous long-term standard deviation. It measures how price fluctuations change over time.
y is the label.
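The definition of V above can be sketched in a few lines of Python. This is a minimal illustration, not the tournament's implementation: the function name, the window lengths, and the choice to take the standard deviation of raw prices (rather than returns) are all assumptions made for the example.

```python
import statistics


def volatility_change(prices, t, short_window, long_window):
    """Ratio of the next short-term std to the previous long-term std.

    Assumption: both standard deviations are taken over raw prices and the
    window lengths are free parameters; the source does not specify either.
    """
    if t - long_window < 0 or t + short_window > len(prices):
        raise ValueError("windows do not fit inside the price series")
    previous = prices[t - long_window:t]    # long window ending just before t
    upcoming = prices[t:t + short_window]   # short window starting at t
    if len(previous) < 2 or len(upcoming) < 2:
        raise ValueError("each window needs at least two prices")
    long_std = statistics.pstdev(previous)
    if long_std == 0:
        raise ValueError("long-term standard deviation is zero")
    return statistics.pstdev(upcoming) / long_std


# Hypothetical series: a calm stretch followed by a sharp move.
example_prices = [1, 2, 3, 4, 5, 6, 10, 20]
v = volatility_change(example_prices, t=6, short_window=2, long_window=6)
```

A value of V above 1 means the upcoming short window is more volatile than the preceding long window; a value below 1 means it is calmer.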
For more detail, refer to the following link.
In this tournament, we set n to 1, with the round as the unit of time. Each round lasts 12 hours.
Keep in mind that, for this type of submission, you must provide three numerical values: the minimum price, the middle price (your anticipated price), and the maximum price. The interval between the minimum and maximum prices is your expected price range. Pbm and V are then calculated as follows:
Pbm is determined from past prices. Each time a round ends, Pbm is recalculated, so it is updated continuously from round to round.
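One way to picture a Brownian-motion estimate built from past prices is sketched below. The source does not give the exact definition of Pbm, so everything here is an assumption: the sketch models log prices as Brownian motion with drift, estimates the per-round drift and volatility from past log returns, and returns the probability that the price n rounds ahead falls inside a submitted range. The function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def brownian_range_probability(past_prices, low, high, n=1):
    """Probability that the price n rounds ahead lies in [low, high].

    Assumption: log prices follow Brownian motion with drift, with the
    per-round drift and volatility estimated from past log returns.
    """
    if len(past_prices) < 3:
        raise ValueError("need at least three past prices")
    if not 0 < low < high:
        raise ValueError("require 0 < low < high")
    log_returns = [math.log(b / a) for a, b in zip(past_prices, past_prices[1:])]
    mu = mean(log_returns)
    sigma = stdev(log_returns)
    if sigma == 0:
        raise ValueError("past prices show no variation")
    # Over n rounds the log return is normal with mean mu*n and std sigma*sqrt(n).
    horizon = NormalDist(mu * n, sigma * math.sqrt(n))
    last = past_prices[-1]
    return horizon.cdf(math.log(high / last)) - horizon.cdf(math.log(low / last))


# Hypothetical per-round closing prices and a submitted range of 99 to 103.
history = [100, 101, 100, 102, 101]
p_narrow = brownian_range_probability(history, 99, 103)
p_wide = brownian_range_probability(history, 95, 107)
```

Because the estimate depends only on `past_prices`, appending the newest price after each round and calling the function again reproduces the per-round recalculation described above.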
V is calculated with the following formula:
where the first factor is the standard deviation of the triangular distribution defined by the minimum, middle, and maximum prices, and the second factor is the reciprocal of the preceding long-term standard deviation. We chose the triangular distribution because it is a simple way to model price fluctuations centered on the middle price, which is the mode of the distribution and therefore has the highest probability.
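The two factors described here can be sketched as follows. The standard deviation of a triangular distribution with lower bound a, mode c, and upper bound b is the textbook expression sqrt((a² + b² + c² − ab − ac − bc) / 18). Mapping the minimum, middle, and maximum prices to a, c, and b is an assumption based on the description, and the function names and the `long_term_std` argument are hypothetical.

```python
import math


def triangular_std(minimum, middle, maximum):
    """Standard deviation of a triangular distribution with the given
    lower bound, mode, and upper bound."""
    if not (minimum <= middle <= maximum) or minimum == maximum:
        raise ValueError("require minimum <= middle <= maximum and minimum < maximum")
    a, c, b = minimum, middle, maximum
    variance = (a * a + b * b + c * c - a * b - a * c - b * c) / 18
    return math.sqrt(variance)


def submission_volatility(minimum, middle, maximum, long_term_std):
    """V for a submission: triangular std times the reciprocal of the
    preceding long-term standard deviation (assumed to be supplied)."""
    if long_term_std <= 0:
        raise ValueError("long_term_std must be positive")
    return triangular_std(minimum, middle, maximum) * (1 / long_term_std)


# Hypothetical submission: range 90 to 110 centered on 100, long-term std of 5.
v_submission = submission_volatility(90, 100, 110, long_term_std=5)
```

A narrower submitted range gives a smaller triangular standard deviation and therefore a smaller V for the same long-term standard deviation.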