ID3: C++ Implementation, Forex (Foreign Exchange) Forecasting, First Experiments (Part 3)


OK, so now to the results of running ID3 on the Forex time series.

In the Forex world, many a trader dreams of the holy grail – an algorithm or some other statistically robust technique to discover the moments when the exchange rate will turn around. This would allow them to close/open transactions at just the right time, resulting in profit ($$$$). In addition to the exchange rate itself, there is also transactional volume. Forex people like to look at exchange rate charts made of so-called bars. A bar is like a box-and-whisker plot but simpler: for a group of ‘ticks’ (exchange rate time points) you take the beginning and ending values and the lowest and highest values (open, close, low, high respectively). You also add up the volume to get the volume for that interval. The length of the interval defines the timeframe of your chart. People like to group by 5, 10, 15 and 30 minutes, 1 hour and 4 hours.
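As a concrete illustration, here is a minimal sketch of how a group of ticks might be collapsed into one bar. The `Tick` and `Bar` structs are my own illustrative names, not the data structures from this series' source code:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical tick/bar layout -- illustrative only.
struct Tick { double rate; int volume; };
struct Bar  { double open, close, low, high; long volume; };

// Collapse one timeframe interval's worth of ticks into a single bar:
// open = first rate, close = last rate, low/high = extremes,
// volume = sum of tick volumes.
Bar makeBar(const std::vector<Tick>& ticks) {
    Bar b{ticks.front().rate, ticks.back().rate,
          ticks.front().rate, ticks.front().rate, 0};
    for (const Tick& t : ticks) {
        b.low  = std::min(b.low, t.rate);
        b.high = std::max(b.high, t.rate);
        b.volume += t.volume;
    }
    return b;
}
```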

For the purposes of this analysis, I downloaded the highest-granularity data (1 minute), and it seemed that every time there is an upcoming ‘trend’, i.e. some visually convincing change in the direction of the exchange rate, there is preceding ‘activity’ in the volume. So the idea is that, using the volume time series, you can somehow figure out whether you’re about to witness a ‘trend’. The natural question is: what is a ‘trend’? Traders know a trend when they see one on the exchange rate chart, but mathematically it’s a bit tricky to define. For example, I might say that if I take the exchange rate value at the current point in time and compare it with the value 10 time units ahead, I will get the trend value. But looking at the chart, it becomes apparent that that’s not necessarily the case. So I decided to look at the next 10 time points ahead of the current point and count how many of them are higher (or lower) than the current exchange rate value. If 80% of them are lower, there is a downward trend (-1); if 80% are higher, an upward trend (1); otherwise no trend (0). The point being that there is no real agreement on a mathematical definition of trend, so identifying it becomes somewhat of an art.
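The 80% rule above can be sketched like this. The function name, the default parameters and the handling of the not-enough-future-data case are my assumptions, not the post's actual code:

```cpp
#include <cstddef>
#include <vector>

// Label the trend at index i by looking `horizon` points ahead:
// -1 if at least `threshold` (80%) of them are below the current rate,
// +1 if at least that share are above, 0 otherwise (or near the end).
int trendLabel(const std::vector<double>& rate, std::size_t i,
               std::size_t horizon = 10, double threshold = 0.8) {
    if (i + horizon >= rate.size()) return 0;  // not enough future data
    int above = 0, below = 0;
    for (std::size_t k = 1; k <= horizon; ++k) {
        if (rate[i + k] > rate[i]) ++above;
        else if (rate[i + k] < rate[i]) ++below;
    }
    if (below >= threshold * horizon) return -1;  // downward trend
    if (above >= threshold * horizon) return  1;  // upward trend
    return 0;                                     // no trend
}
```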

My data contained about 65,536 rows (Alpari’s MetaTrader exported that many) for the Euro/US Dollar exchange rate (colloquially known as EURUSD) on a 1-minute interval. I only looked at the closing price and volume, which were in a text file “timeseries.txt” (see source code).

In Forex, traders often look at moving averages of exchange rates and use all kinds of ‘indicators’: essentially functions which process historical data and spit out clues about trend dynamics. Many of these indicators are based on moving averages of different periods, and the gist of them comes down to this: take moving average ma1 at the current period, take ma2, and compare them (each average has its own period). If ma1 is higher than ma2, the exchange rate will go up; if it’s lower, it will go down. Or they’ll use some other heuristic, like: if the value of a particular stochastic is less than -20, a downward trend is approaching; if it’s higher than 80, an upward one; otherwise no trend.
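The moving-average comparison these indicators boil down to can be sketched as follows. The function names and the warm-up handling (returning 0 when there isn't enough history) are my choices, not the original code's:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Simple moving average over the `period` points ending at index i
// (inclusive). Returns 0 during the warm-up region -- one of several
// reasonable conventions; the post doesn't specify one.
double sma(const std::vector<double>& x, std::size_t i, std::size_t period) {
    if (i + 1 < period) return 0.0;
    return std::accumulate(x.begin() + (i + 1 - period),
                           x.begin() + (i + 1), 0.0) / period;
}

// Crossover-style comparison of two moving averages at point i:
// +1 -> ma(p1) above ma(p2), -1 -> below, 0 -> equal.
int crossSignal(const std::vector<double>& x, std::size_t i,
                std::size_t p1, std::size_t p2) {
    double ma1 = sma(x, i, p1), ma2 = sma(x, i, p2);
    if (ma1 > ma2) return 1;
    if (ma1 < ma2) return -1;
    return 0;
}
```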

I took a set of period values (10, 20, 30) and created a moving average for each one. Then, at every time point, I compared these moving averages pairwise and made an attribute for each pair. I did the same for the volume time series. I then added another attribute for the lagging trend – if there was a non-zero trend value in the past 5 time points, that value is copied here; otherwise 0.
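Putting the pieces together, the feature vector described above might be built like this. This is a hypothetical sketch with names of my choosing, not the actual code from the series; the attribute ordering mirrors the 0_1/0_2/1_2 and v0_1/v0_2/v1_2 naming that appears in the output further down:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Moving average of the `p` points ending at index i (inclusive).
// Assumes i >= p - 1 (enough history for the largest period).
double sma(const std::vector<double>& x, std::size_t i, std::size_t p) {
    return std::accumulate(x.begin() + (i + 1 - p),
                           x.begin() + (i + 1), 0.0) / p;
}

int cmp(double a, double b) { return a > b ? 1 : (a < b ? -1 : 0); }

// One feature vector per time point: pairwise MA comparisons of the
// rate (attributes 0_1, 0_2, 1_2), the same for the volume
// (v0_1, v0_2, v1_2), then the lagging trend L.
std::vector<int> makeFeatures(const std::vector<double>& rate,
                              const std::vector<double>& volume,
                              const std::vector<int>& trend,  // per-point labels
                              std::size_t i) {
    const std::size_t periods[] = {10, 20, 30};
    std::vector<int> f;
    for (std::size_t a = 0; a < 3; ++a)
        for (std::size_t b = a + 1; b < 3; ++b)
            f.push_back(cmp(sma(rate, i, periods[a]), sma(rate, i, periods[b])));
    for (std::size_t a = 0; a < 3; ++a)
        for (std::size_t b = a + 1; b < 3; ++b)
            f.push_back(cmp(sma(volume, i, periods[a]), sma(volume, i, periods[b])));
    // Lagging trend L: most recent non-zero trend in the past 5 points, else 0.
    int lag = 0;
    for (std::size_t k = 1; k <= 5 && k <= i; ++k)
        if (trend[i - k] != 0) { lag = trend[i - k]; break; }
    f.push_back(lag);
    return f;
}
```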

Having thus created my feature vectors, I loaded them into the algorithm (as per the previous post in this rubric) and waited for my results. They came back in the form of a text-file printout of the attribute-value classification tree (the result of ID3), which showed the accuracy of trend prediction based on attribute values.

Needless to say, so far it doesn’t seem like I’m close to the holy grail.

But I’m seeing something.

The first attribute which ID3 decided should be the root of the classification tree is L (the lagging trend). This means that whether there was a trend in the past 5 time points is a good way to predict whether a particular point is the start of a new trend: roughly 58% accuracy on the 15,842 training examples with an L value of -1 (i.e. there was a downward trend in the past 5 time points). But then interesting things start to happen: the accuracy of predicting a negative trend improves past 60% with the conjunction of more criteria. The picture looks like this:

0.666 (66.6%) probability that we’re about to see a downward trend if, at the current data point:

– the lagging trend value is -1, and

– exchange rate moving average 10 is less than exchange rate moving average 20, and

– volume moving average 20 is less than volume moving average 30, and

– volume moving average 10 is higher than volume moving average 30, and

– exchange rate moving average 20 is higher than exchange rate moving average 30.

The same criteria yield a downward trend probability of 60.3% on the validation part of the dataset. These results are somewhat counterintuitive because some moving averages have to be higher than others (like volume moving average 10 being higher than volume moving average 30). Perhaps it has to do with market activity picking up right before the trend changes direction.

60% accuracy of trend prediction is nothing to brag about – the more important accomplishment is having created the necessary source code, object model and programmatic controls. These experiments can easily be expanded to include more moving averages (simply add another entry to the periods vector in the code) and other features. Like Thomas Edison, you can keep varying your hypotheses until you find the right one (so every failure is actually a success – an elimination point). ID3 doesn’t care about the meaning of attribute/value pairs – it simply looks for slices of the data which show high accuracy of classifier prediction.

I’m currently reading Ross Quinlan’s C4.5, but I’m not sure whether additions such as branch pruning will result in a substantive performance improvement (as far as prediction accuracy is concerned).

Below is a snippet from the algorithm’s output text file:

Attribute name: L
Value: -1 // i.e. when value in column L = -1 then
Training dataset   -1 0.580167 9191 15842 // 15842 is the count of records with L = -1, 9191 is the count of those records with L= -1 and classifier = -1
Validation dataset -1 0.585835 9347 15955 // same as above for the second part of the dataset
_____Attribute name: 0_2
_____Value: -1
_____Training dataset   -1 0.563896 4192 7434
_____Validation dataset -1 0.566937 4201 7410
__________Attribute name: v1_2 // v stands for volume, i.e. volume moving average with period 1 compared with volume moving average with period 2, where 1 and 2 are indices, not actual periods used to construct the moving average value
__________Value: -1
__________Training dataset   -1 0.545642 2122 3889
__________Validation dataset -1 0.566546 2188 3862
_______________Attribute name: v0_2
_______________Value: -1
_______________Training dataset   -1 0.541397 1589 2935
_______________Validation dataset -1 0.56896 1679 2951
____________________Attribute name: v0_1
____________________Value: -1
____________________Training dataset   -1 0.559839 973 1738
____________________Validation dataset -1 0.573504 987 1721
____________________Value: 0
____________________Training dataset   -1 0.8 12 15
____________________Validation dataset -1 0.47619 10 21
_________________________Attribute name: 0_1
_________________________Value: -1
_________________________Training dataset   -1 0.727273 8 11
_________________________Validation dataset -1 0.444444 8 18
____________________Value: 1
____________________Training dataset   -1 0.510998 604 1182
____________________Validation dataset -1 0.564103 682 1209
_________________________Attribute name: 1_2
_________________________Value: -1
_________________________Training dataset   -1 0.510682 502 983
_________________________Validation dataset -1 0.550562 539 979
______________________________Attribute name: 0_1
______________________________Value: -1
______________________________Training dataset   -1 0.508115 407 801
______________________________Validation dataset -1 0.548348 448 817
______________________________Value: 0
______________________________Training dataset   0 0.666667 2 3
______________________________Validation dataset  0 0 0
______________________________Value: 1
______________________________Training dataset   -1 0.52514 94 179
______________________________Validation dataset -1 0.561728 91 162
_________________________Value: 0
_________________________Training dataset    0 0 0
_________________________Validation dataset  0 0 0
_________________________Value: 1
_________________________Training dataset   -1 0.512563 102 199
_________________________Validation dataset -1 0.621739 143 230
_______________Value: 0
_______________Training dataset   -1 0.461538 6 13
_______________Validation dataset -1 0.75 9 12
_______________Value: 1
_______________Training dataset   -1 0.560043 527 941
_______________Validation dataset -1 0.556174 500 899
____________________Attribute name: 1_2
____________________Value: 0
____________________Training dataset    0 0 0
____________________Validation dataset  0 0 0
____________________Value: 1
____________________Training dataset   -1 0.666667 92 138
____________________Validation dataset -1 0.603306 73 121

And here is the full file: forex_id3_classifiction_tree_trend_predictor.txt

In Thomas Bass’s “The Predictors”, there is an allusion to a similar process used by Norman Packard. Attached is a publication directly relevant to this investigation, as well as some additional ideas.

Genetic Learning Algorithm for Analysis of Complex Data – Norman Packard


~ by Monsi.Terdex on March 21, 2013.

5 Responses to “ID3: C++ Implementation, Forex (Foreign Exchange) Forecasting, First Experiments (Part 3)”

  1. Hi! Thanks for the very informative series of articles about ID3! Would you be so kind as to provide timeseries.txt? I can’t find it anywhere in your articles.


  2. Never tried trees on forex; I’d be surprised if it works. There are often issues with “free” data where there is artificial autocorrelation in the averaging process.
    FYI, the original author of ID3 has something called C5, which is a lot more flexible (and which can use boosting), and you can download it in good old C here:
