Causation or Correlation?
The world generates a lot of data, and there could be a lot of analysis showing that the markets go up whenever a girl working in my office wears a pink dupatta.
Or that when the number of pigs increases, pig iron production goes up.
There are too many rules, especially in the USA, of the “sell in May and go away” kind, and they are completely useless. Yes, it is possible that the market went down in June...and you could have saved money! However, that is just using “data mining” to hunt for a correlation. Even if you do find some correlation, it does not mean that there is causation.
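To see how easily data mining throws up a "rule", here is a minimal sketch in Python. It is purely illustrative: the "market" and the 1,000 candidate "signals" are random numbers with no connection to each other, and the names `pearson` and `best_spurious_signal` are made up for this example. The point is that if you test enough unrelated series against 24 months of returns, one of them will look strongly correlated by pure chance.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_signal(n_months=24, n_signals=1000, seed=0):
    """Return the strongest |correlation| between random 'market returns'
    and many unrelated random 'signals' (pink dupattas, pigs, ...)."""
    rng = random.Random(seed)
    # Hypothetical monthly market returns: pure noise.
    market = [rng.gauss(0, 1) for _ in range(n_months)]
    best = 0.0
    for _ in range(n_signals):
        # Each candidate signal is also pure noise, unrelated to the market.
        signal = [rng.gauss(0, 1) for _ in range(n_months)]
        best = max(best, abs(pearson(market, signal)))
    return best


if __name__ == "__main__":
    print(f"Strongest correlation found by chance: {best_spurious_signal():.2f}")
```

None of the signals causes anything, yet the "best" one will look impressive on paper. That is all that most mined trading rules are.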
Almost every day somebody offers me a trading strategy that will make me 24-35% pa by trading equity. Of course these are “proprietary”. I do think most of them will not deliver, and those that do will deliver because of luck. So for the people calling me, please note that I do not need this kind of return. I have too much at stake to take such risks!!
People take past data and do what is now called “data mining”. DM is a great thing, but it is now in the hands of too many Marketing and Sales people, who use the term carelessly. Let us look at the past 10 years' data, from 2005 to 2014. People use this period and tell you that “If you invest Rs. 1 crore in a balanced fund with 65% in equity and 35% in debt you can get Rs. 1L per month for the REST OF YOUR LIFE”. Obviously this back-testing assumed that interest rates would be around 9% pa, that equity was growing, that multi-caps were doing well and that FII money was flowing in, and it did not include transaction costs, trading costs, or slack for keeping cash...and voilà...here was a nice ‘back-tested’ model for you. Well, the purists may have frowned, but this was good enough for the Marketing and Sales people. Good show.
This was a pathetically sold pension plan, and all the people reading this post know which fund house I am talking about. Amazing mis-selling. Then it was a life insurance ladder...again with a promised return of 6.30% pa over 35 years. You want to believe this? Welcome, but you will not make money. That is a problem!
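The “Rs. 1 crore gives Rs. 1L per month for life” pitch is easy to check with some arithmetic. Rs. 1L a month is Rs. 12L a year, i.e. 12% of the corpus. Below is a rough sketch, assuming a constant annual return credited monthly, and no taxes, loads or costs (a deliberately generous simplification, not how any actual fund behaves). The function name `months_of_payouts` is made up for this illustration.

```python
def months_of_payouts(corpus, monthly_payout, annual_return, max_months=1200):
    """Count how many full monthly payouts a corpus can fund.

    Assumes a constant annual return credited monthly (annual_return / 12)
    and ignores taxes, loads and costs. Returns None if the corpus is
    still paying out after max_months (100 years by default).
    """
    monthly_rate = annual_return / 12
    balance = corpus
    for month in range(1, max_months + 1):
        balance *= 1 + monthly_rate      # growth for the month
        if balance < monthly_payout:     # cannot fund a full payout
            return month - 1
        balance -= monthly_payout        # withdraw the payout
    return None


if __name__ == "__main__":
    corpus, payout = 10_000_000, 100_000   # Rs. 1 crore, Rs. 1L per month
    for rate in (0.06, 0.09, 0.12):
        print(f"{rate:.0%} pa:", months_of_payouts(corpus, payout, rate))
```

Under these assumptions the money lasts “for the rest of your life” only if the portfolio earns about 12% pa every single year, net of everything. At a steady 9% pa the corpus is exhausted in roughly 15 years, and that is before any bad year arrives early.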
Another problem with data mining is how the fund manager himself will behave when there is a panic. Look at credit risk funds. I do think this is a great time to be investing in one, but there is a huge panic and people are withdrawing in hordes. A fund manager will ALSO PANIC...and start keeping more cash. This will cause slack in the portfolio, will it not? How many shareholders of an AMC will be willing to pull out Rs. 500 crores to keep the unitholders happy?
If you are lucky enough to lay your hands on ‘Lies, Lies and Statistics’, you should read that book as well as Nerds on Wall Street. You will get many examples like the correlation between pigs and pig iron. Or lightning and the lightning bug. If you think these are humorous, please read Nerds..... You will surely enjoy it. It will also make you very skeptical of the word ‘back-testing’ and make you realize that it is useless, if not harmful. For example, we have all got ‘data’ showing that if you stay invested for 25 years in the Indian share market you will not get a negative return. Great. Remember again that this does not include entry load, exit load, taxation, and a ‘reasonable’ management fee of 2%. Fair enough, you say? Well, this is for the INDEX. When it comes to investing, the fund houses urge or nudge you to invest in “managed” funds: large cap, mid cap and small cap. How will you back-test this?
You said there is no entry load? Yes, correct, but 25 years ago we had exit loads. So how will you NOW back-test even funds like Franklin India Bluechip or HDFC Top 100?
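To get a feel for what leaving out that ‘reasonable’ 2% management fee does to a 25-year back-test, here is a small sketch. It assumes the fee is charged on the assets once a year and that returns are constant; the 12% gross return in the usage line is a hypothetical number picked only for illustration, and `final_value` is a made-up helper, not any fund house's formula.

```python
def final_value(initial, gross_return, annual_fee, years):
    """Value of a lump sum after `years`, with a fee charged on assets
    at the end of each year. Constant returns are assumed throughout."""
    value = initial
    for _ in range(years):
        value *= 1 + gross_return   # growth before costs
        value *= 1 - annual_fee     # fee taken as a fraction of assets
    return value


if __name__ == "__main__":
    # Hypothetical 12% pa gross return, used only for illustration.
    without_fee = final_value(100.0, 0.12, 0.00, 25)
    with_fee = final_value(100.0, 0.12, 0.02, 25)
    print(f"Index-style result, no fee: {without_fee:.0f}")
    print(f"Same return, 2% fee:        {with_fee:.0f}")
    print(f"Share of final corpus kept: {with_fee / without_fee:.0%}")
```

Under this simple model the fee alone takes away roughly 40% of the final corpus over 25 years, whatever the gross return is. Loads and taxes come on top of that, so a back-test on the index says very little about what you would actually have in hand.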
How do you protect yourself? Simply by understanding the words ‘causation’ and ‘correlation’. By understanding that just because the promoters of Indiabulls Housing, Edelweiss, the MD of Indiainfoline, the MD of ICICI Lombard, ADAG, ...are all marathoners, you cannot draw the conclusion that their PE should be the same. However, people are far-fetched in their ‘causation’ theories. They may believe it.
Be careful.
SS
Good one sir, Thank you. 🙂
SD
I love a sentence from the statistician George Box. He says “All models are wrong, but some are useful!”. It is important to focus on what is useful. For a marketing person, the given examples are useful. Not for an investor! So the point of focus has to be different.