Tips for Writing Good Features for Alpha
Try to picture a novel market situation that could help you write a signal.
Think through the different ways your signal could break.
Always be aware of every situation that can break the signal.
The spread can be used for normalization. When the difference between the best sell price and the second-best sell price is huge, dividing by the spread helps reduce that impact.
More predictive features that can be used
Differencing (the change in a value from one update to the next) can have strong predictive power.
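As a minimal sketch of differencing, assuming the input is a plain list of mid prices: the function name `difference`, the `lag` parameter, and the sample values are made up for illustration and are not from the original notes.

```python
def difference(values, lag=1):
    """Return values[t] - values[t - lag]; the first `lag` entries are None."""
    if lag < 1:
        raise ValueError("lag must be >= 1")
    # No earlier value exists for the first `lag` points.
    out = [None] * min(lag, len(values))
    for t in range(lag, len(values)):
        out.append(values[t] - values[t - lag])
    return out


# Hypothetical sample data, for illustration only.
mid_prices = [100.0, 100.5, 100.25, 101.0]
mid_diff = difference(mid_prices)
```

The differenced series describes how the price moved rather than where it sits, which is often closer to what a signal is trying to predict.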
Volume-weighted average price (VWAP) of trade_price from trade_message.
For the volume-weighted average, you can weight each price by the square root of the number of shares, or weight ask_price by the number of orders (the number of participants who placed orders at the same price).
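The weighting variants above could be sketched as follows. This assumes trades arrive as (trade_price, shares) pairs and ask levels as (ask_price, order_count) pairs. Those layouts, the helper name `weighted_average_price`, and the sample numbers are assumptions for illustration. Reading "ask_price * number of orders" as an order-count-weighted average is also an interpretation, not something the notes state outright.

```python
import math


def weighted_average_price(prices, weights):
    """Weighted average of prices; returns None when the total weight is zero."""
    total = sum(weights)
    if total == 0:
        return None
    return sum(p * w for p, w in zip(prices, weights)) / total


# Hypothetical trade messages: (trade_price, number of shares).
trades = [(10.0, 100), (11.0, 400), (12.0, 0)]
prices = [p for p, _ in trades]
shares = [s for _, s in trades]

# Plain VWAP: weight each trade price by its share count.
vwap = weighted_average_price(prices, shares)

# Square-root variant: dampens the influence of very large trades.
sqrt_vwap = weighted_average_price(prices, [math.sqrt(s) for s in shares])

# Order-count variant: hypothetical ask levels as (ask_price, number of orders).
ask_levels = [(10.5, 3), (10.6, 1)]
order_weighted_ask = weighted_average_price(
    [p for p, _ in ask_levels], [n for _, n in ask_levels]
)
```

Square-root weights compress the difference between a 100-share trade and a 10,000-share trade, so one large print moves the average less than it would under plain VWAP.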
Always remove features with a zero r score, because they reduce the predictive performance of the alpha.
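A minimal sketch of that filtering step follows. The notes do not define how the r score is computed, so this assumes a score per feature is already available from whatever evaluation you run. The function name, the tolerance, and the feature names are hypothetical.

```python
def drop_zero_score_features(scores, tol=1e-9):
    """Keep only features whose score is not (numerically) zero."""
    return {name: s for name, s in scores.items() if abs(s) > tol}


# Hypothetical per-feature scores, for illustration only.
scores = {"micro_minus_mid": 0.031, "stale_feature": 0.0, "spread_norm": -0.012}
kept = drop_zero_score_features(scores)
```

This keeps negative scores because they differ from zero. Whether a negative score is worth keeping depends on how your score is defined.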
Have different features that handle different market situations. For example, if the difference between the first and second best sell prices is large, a feature that depends on the spread will also produce a larger value (one that may not be mean-reverting). You should have another feature that is normalized by the spread so that its values are mean-reverting.
Try to cover different market scenarios, with a feature that captures each behavior. As mentioned above, you can use the spread to build several features that capture different behaviors:
Create a feature based on raw calculations.
Create a feature that is normalized by the spread to handle the larger values that may occur when the spread is wide. For example, (micro_price - mid_price) / spread can capture these behaviors.
Write another feature that is triggered only when the value is above a certain threshold. For example: if (micro_price - mid_price) > min_price_value_delta then (micro_price - mid_price) / spread, else 0.
Or write another feature that captures the behaviour when (micro_price - mid_price) == min_price_delta.
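The four features in this list can be sketched together. The notes do not define the microprice, so this assumes the common size-weighted form, where each side's price is weighted by the opposite side's size. The function names, the `tol` parameter used for the equality check, and the sample quote are all illustrative assumptions.

```python
def micro_price(bid_price, bid_size, ask_price, ask_size):
    """Size-weighted mid (assumed definition): a heavier bid pulls it toward the ask."""
    total = bid_size + ask_size
    if total == 0:
        return (bid_price + ask_price) / 2.0
    return (bid_price * ask_size + ask_price * bid_size) / total


def spread_features(bid_price, bid_size, ask_price, ask_size,
                    min_price_value_delta, min_price_delta, tol=1e-9):
    """Return the raw, spread-normalized, thresholded, and equality features."""
    mid_price = (bid_price + ask_price) / 2.0
    spread = ask_price - bid_price
    # 1. Raw calculation.
    raw = micro_price(bid_price, bid_size, ask_price, ask_size) - mid_price
    # 2. Normalized by spread; guard against a zero or crossed spread.
    normalized = raw / spread if spread > 0 else 0.0
    # 3. Fires only when the raw value exceeds the threshold.
    thresholded = normalized if raw > min_price_value_delta else 0.0
    # 4. Indicator for raw == min_price_delta, compared with a float tolerance.
    at_delta = 1.0 if abs(raw - min_price_delta) <= tol else 0.0
    return {"raw": raw, "normalized": normalized,
            "thresholded": thresholded, "at_delta": at_delta}


# Hypothetical quote: bid 100.0 x 300, ask 101.0 x 100.
feats = spread_features(100.0, 300, 101.0, 100,
                        min_price_value_delta=0.1, min_price_delta=0.25)
```

The equality feature uses a tolerance rather than `==` because floating-point prices rarely compare exactly equal. With integer tick prices an exact comparison would do.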
Whenever you write a new feature, first draw a picture of the scenario you want it to capture. That will help a lot when you implement the feature.
When you have written a new alpha, try creating variants that each capture a different behaviour of the original. For example, if the first alpha deals with all volumes (or just add messages), write another alpha that deals only with messages where volume > 50, and another that deals only with messages where volume < 50. When you run these alphas together, an improved r score means you have split on something important; if it decreases, you need to split on something else.
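One way to sketch the volume split, assuming each message carries an already-computed alpha value and a volume: the dictionary keys `value` and `volume`, the function name, and the sample messages are hypothetical. The threshold of 50 comes from the example in the text.

```python
def split_alpha_by_volume(messages, threshold=50):
    """Return (all, high, low) alpha columns; out-of-bucket messages contribute 0."""
    all_alpha, high_alpha, low_alpha = [], [], []
    for m in messages:
        value, volume = m["value"], m["volume"]
        all_alpha.append(value)
        # Strict inequalities follow the text: volume == threshold is in neither bucket.
        high_alpha.append(value if volume > threshold else 0.0)
        low_alpha.append(value if volume < threshold else 0.0)
    return all_alpha, high_alpha, low_alpha


# Hypothetical messages, for illustration only.
messages = [
    {"value": 0.2, "volume": 80},
    {"value": -0.1, "volume": 10},
    {"value": 0.3, "volume": 50},
]
all_alpha, high_alpha, low_alpha = split_alpha_by_volume(messages)
```

Fitting the three columns together and comparing the r score against the original alpha alone shows whether the volume split carries information.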