Oddbean new post about | logout
 Imagine you have a unit vector that points in any direction.  In 3-dimensional space, it represents some point on the unit-sphere.  That can be described with 3 numbers (x, y, z) but not ANY three numbers, they have to be such that the magnitude is 1.

In any case, if you can map information to a point on this unit sphere, and you do that for lots of input data, then when you query the system with new input data it can tell you which pre-existing input data happens to be the closest point on this unit sphere.  Actually the most popular algorithms aren't guaranteed to be the closest (but I know of one that does give the closest and has other good properties but I'm under NDA on that so I can't say more).

3-dimensions turns out to be pretty useless, but in say about 3096 dimensions you start being able to encode enough information into that 3096-D unit-vector as to be useful in an A.I. sense.

But you have to first map information into a unit vector using an "embedding layer" which is some A.I. magic that I don't know very much about at all.