Say I have several hundred million documents, each very short (only a few words), and a vocabulary of several million terms. What is the fastest way to find the top-k semantically related terms for each term in the vocabulary?
By fastest, I mean under a week of computation time and as little human time as possible, so using existing implementations is encouraged.
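
To make the task concrete, here is a toy sketch of the input and output I have in mind. It assumes that "semantically related" means cosine similarity between document-level co-occurrence vectors, which is only one possible definition chosen for illustration. The function name `top_k_related` is made up, and this brute-force version compares every pair of terms, so it would not come close to the scale described above.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt
import heapq


def top_k_related(docs, k=2):
    """Toy baseline: top-k terms by cosine similarity of co-occurrence vectors.

    Assumption for illustration only: two terms are "related" when they
    co-occur with similar sets of other terms across documents.
    """
    # cooc[t][u] = number of documents in which t and u both appear.
    # Terms that never share a document with another term get no entry.
    cooc = defaultdict(Counter)
    for doc in docs:
        terms = sorted(set(doc.split()))
        for a, b in combinations(terms, 2):
            cooc[a][b] += 1
            cooc[b][a] += 1

    # Euclidean norm of each term's co-occurrence vector.
    norms = {t: sqrt(sum(c * c for c in vec.values())) for t, vec in cooc.items()}

    def cosine(a, b):
        va, vb = cooc[a], cooc[b]
        if len(va) > len(vb):
            va, vb = vb, va  # iterate over the shorter vector
        dot = sum(c * vb[t] for t, c in va.items())
        return dot / (norms[a] * norms[b])

    related = {}
    for t in cooc:
        # Brute force: score t against every other term (quadratic in vocabulary size).
        scores = ((cosine(t, u), u) for u in cooc if u != t)
        related[t] = [u for s, u in heapq.nlargest(k, scores) if s > 0]
    return related


if __name__ == "__main__":
    docs = ["cat dog", "cat mouse", "dog mouse", "sun moon"]
    for term, neighbours in sorted(top_k_related(docs, k=2).items()):
        print(term, neighbours)
```

The quadratic all-pairs loop is exactly the part I need to avoid at several million terms, which is why I am asking about existing implementations.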