Word clustering using the Exchange Algorithm

Clustering words into classes is often useful, for example when building machine translation systems. mkcls is a common tool for this task, but it is sequential and runs on only a single machine.


This project will build a distributed version of mkcls (which uses the Exchange Algorithm). Fragments of such an approach already exist online, along with a description of the Exchange Algorithm itself.
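
For orientation, here is a minimal single-machine sketch of the exchange idea in Python. It is not the mkcls implementation: it assumes a class-based bigram model, a round-robin initial assignment, and a brute-force recomputation of the objective for every candidate move. The names class_bigram_objective and exchange_cluster are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter


def class_bigram_objective(bigrams, assignment):
    """Class-dependent part of the log-likelihood of a class bigram model.

    bigrams: Counter mapping (w1, w2) -> count
    assignment: dict mapping word -> class id
    """
    pair, left, right = Counter(), Counter(), Counter()
    for (w1, w2), n in bigrams.items():
        c1, c2 = assignment[w1], assignment[w2]
        pair[(c1, c2)] += n   # class-to-class transition count
        left[c1] += n         # class seen as predecessor
        right[c2] += n        # class seen as successor
    score = sum(n * math.log(n) for n in pair.values())
    score -= sum(n * math.log(n) for n in left.values())
    score -= sum(n * math.log(n) for n in right.values())
    return score


def exchange_cluster(sentences, num_classes, max_iters=10):
    """Greedy exchange: move each word to the class that most improves
    the objective, and stop once a full pass moves no word."""
    freq, bigrams = Counter(), Counter()
    for sent in sentences:
        freq.update(sent)
        bigrams.update(zip(sent, sent[1:]))

    # Assumption: frequency-ordered vocabulary, round-robin initial classes.
    vocab = [w for w, _ in freq.most_common()]
    assignment = {w: i % num_classes for i, w in enumerate(vocab)}
    best = class_bigram_objective(bigrams, assignment)

    for _ in range(max_iters):
        moved = False
        for w in vocab:
            home = assignment[w]
            best_class = home
            for c in range(num_classes):
                if c == home:
                    continue
                assignment[w] = c  # tentatively move w to class c
                score = class_bigram_objective(bigrams, assignment)
                if score > best + 1e-12:
                    best, best_class = score, c
            assignment[w] = best_class  # keep the best move, or stay put
            moved = moved or best_class != home
        if not moved:
            break
    return assignment, best


if __name__ == "__main__":
    corpus = [
        "the cat sat".split(),
        "the dog sat".split(),
        "a cat ran".split(),
        "a dog ran".split(),
    ]
    classes, objective = exchange_cluster(corpus, num_classes=3)
    for word, cls in sorted(classes.items()):
        print(cls, word)
```

Recomputing the full objective for every candidate move is far too slow for a real vocabulary; practical implementations update the class counts incrementally. The sketch is meant only to make the move-and-evaluate loop concrete. Since the candidate evaluations for a given word do not depend on each other, that loop is one plausible place to look for parallelism.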


Bilingual (parallel) data is available.