Granger causality and Intrade data.

Granger causality is a technique for determining whether one time series can be used to forecast another; since the Intrade market provides time series data for political questions, we can ask whether the market’s view of one political outcome can be used to forecast its view of another.
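Concretely, the standard bivariate Granger test with lag $k$ regresses today’s value of one series on the past $k$ values of both series,

$$y_t = \alpha_0 + \sum_{i=1}^{k} \alpha_i y_{t-i} + \sum_{i=1}^{k} \beta_i x_{t-i} + \varepsilon_t,$$

and declares that $x$ Granger-causes $y$ when an $F$-test rejects the null hypothesis $\beta_1 = \cdots = \beta_k = 0$.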

There’s a library for the statistical package R to do the Granger test, and Intrade produces CSV market data. I fed the market data for various contracts since January 1, 2008 into R, and the output of that into GraphViz to make a nice-looking visualization; in particular, I connect $a$ to $b$ if $a$ Granger-causes $b$ with $p$-value less than 0.05. Darker arrows have smaller $p$-values. This is all an embarrassing misuse of statistics and $p$-values, but it is quick and easy to do, and the results are fun to see.
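For concreteness, here is a minimal sketch of that pipeline, using grangertest from the lmtest package (one R library that does this test); the contract names, file names, and the “date”/“price” column layout are hypothetical stand-ins for the actual Intrade exports, and a real run would need the usual data cleaning:

```r
# Sketch of the pipeline: pairwise Granger tests on Intrade price series,
# emitted as a GraphViz digraph.  Contract names, file names, and the
# "date"/"price" column layout are hypothetical.
library(lmtest)

contracts <- c("dem.pres", "rep.pres", "dem.house")  # hypothetical ids
series <- lapply(contracts, function(id) read.csv(paste0(id, ".csv")))

lag <- 1  # how many previous days of a to use when forecasting b
cat("digraph granger {\n")
for (i in seq_along(contracts)) {
  for (j in seq_along(contracts)) {
    if (i == j) next
    m <- merge(series[[i]], series[[j]], by = "date")      # align on dates
    gt <- grangertest(m$price.y ~ m$price.x, order = lag)  # does i forecast j?
    p <- gt[["Pr(>F)"]][2]
    if (!is.na(p) && p < 0.05) {
      shade <- floor(p / 0.05 * 60)  # smaller p-value => darker gray
      cat(sprintf("  \"%s\" -> \"%s\" [color=\"gray%d\"];\n",
                  contracts[i], contracts[j], shade))
    }
  }
}
cat("}\n")
```

Feeding the resulting file to something like dot -Tpng then renders the graphs below.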

Here is the graph for a lag of one day (i.e., does yesterday’s value of $a$ predict today’s value of $b$):

One-day-lagged Granger causality graph

Here is the graph for a lag of two days (i.e., can the two previous days of data for $a$ be used to forecast the next day of data for $b$):

Two-day-lagged Intrade Granger causality graph

And here is the graph for a lag of three days:

Three-day-lagged Intrade Granger causality graph

Don’t take this too seriously. And one word of warning: an arrow from $a$ to $b$ does not mean that if $a$ is more likely, then $b$ is more likely—rather, it ought to mean that past knowledge of $a$ can be used to forecast $b$. I suppose it would be interesting to add some color for the direction of the relationship, and maybe I’ll do that when I have another free hour.

Kolmogorov complexity.

Here are some very ill-thought-out ideas on Kolmogorov complexity.

We define a metric on the space of bitstrings $\Sigma^\star$. For a universal Turing machine $T$, let $d_T(x,y)$ be the “length” of the shortest program that outputs $y$ on input $x$, or outputs $x$ on input $y$. This should measure how difficult it is to “relate” $x$ and $y$.
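Written out, with $T(p, x)$ denoting the output of $T$ run on program $p$ with input $x$ (my notation, nothing standard), this is

$$d_T(x, y) = \min \{\, |p| : T(p, x) = y \ \text{or}\ T(p, y) = x \,\}.$$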

The ends of the metric space $(\Sigma^\star, d_T)$ should correspond to infinite random bitstrings, and because choosing a different universal Turing machine just replaces this metric space with one quasi-isometric to it, the ends should be preserved, so there will still be a lot of infinite random bitstrings.
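The quasi-isometry here is just the usual invariance argument, sketched: $T$ can simulate any other universal machine $T'$ after a fixed prefix, so any program witnessing $d_{T'}(x,y)$ turns into one witnessing $d_T(x,y)$ at the cost of a constant, giving

$$d_T(x, y) \le d_{T'}(x, y) + c \quad \text{and} \quad d_{T'}(x, y) \le d_T(x, y) + c'$$

for constants $c, c'$ depending only on $T$ and $T'$; the identity map is then a quasi-isometry $(\Sigma^\star, d_T) \to (\Sigma^\star, d_{T'})$.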

But obviously I haven’t thought about any of this very carefully: for instance, the triangle inequality probably only holds coarsely, because it depends on being able to concatenate programs.
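Concretely, the best one could hope for from concatenation is presumably something like

$$d_T(x, z) \le d_T(x, y) + d_T(y, z) + O\bigl(\log\bigl(d_T(x, y) + d_T(y, z)\bigr)\bigr),$$

with the logarithmic term paying to delimit where the first program ends; and even this needs care, since the “or” in the definition means the two witnessing programs may run in incompatible directions (say, one from $y$ to $x$ and the other from $y$ to $z$), leaving nothing to concatenate.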

Here’s a similar question. Usually, we start with a partial function $f : \Sigma^\star \to \Sigma^\star$ which tells how to translate descriptions into objects; Kolmogorov complexity is then defined as $C_f(x) = \min_{f(y) = x} |y|$. Any universal Turing machine gives a measure of complexity with the same asymptotics, i.e., $C_g(x)$ and $C_f(x)$ differ by at most a constant that depends only on $f$ and $g$. Suppose I have another function $h$ so that $C_h$ has the same asymptotics: what more can be said about $h$?
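In symbols, the invariance theorem gives a constant $c_{f,g}$ with

$$|C_f(x) - C_g(x)| \le c_{f,g} \quad \text{for all } x \in \Sigma^\star,$$

and the question asks which partial functions $h$ admit such a bound against a universal $f$.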

There’s a stupid rigidity for computable functions (a computable function is still computable if its value is changed at finitely many places), and maybe this sort of question could lead to a rigidity theorem for computability, a local Church-Turing thesis.

And having written this, I’m terrified at how similar I sound to Archimedes Plutonium. Now I’ll go learn more about localization of spaces in the algebraic topology proseminar.