Could you please suggest some relevant reading on learning similar vector representations in a shared space? Say I have two different domains and I have embedded all the members of each domain into a single common space. Now I would like to bring those vectors into the same region of a hypersphere, given that I already have some relationships (a mapping) between those entities. Is there a learning algorithm for bringing such data points (vector representations) into the same region?

asked Mar 16 '12 at 11:13

Kuri_kuri

I don't understand your question. What is a "domain" here? What do you mean by "bring those vectors into the same region"? Do you want to project them onto the same region, query for those which are in the same region as a query vector, or update the vectors so that those which share relationships end up closer together?

(Mar 16 '12 at 11:42) Alexandre Passos ♦

@Alexandre, my point is to update those vectors so that those which share relationships end up closer.

(Mar 17 '12 at 01:55) Kuri_kuri

2 Answers:

If you just update the vectors to bring those that share relationships closer, you can easily end up with all of them collapsed onto a single point. To make this actually work you need some repulsive force pushing them apart.

One simple-to-implement repulsive force is to define an objective function which is a sum, for each vector, of lambda times the distance/cosine between that vector and all its neighbors, minus gamma times the distance between that vector and the negated representation of each of its neighbors. If you minimize this sequentially for each vector, you get an update which looks like a weighted mean of its neighbors with weight lambda and of the negation of its non-neighbors with weight gamma, which you then normalize to keep everything on the unit sphere (this is important; otherwise putting everything at zero is a good solution). This constraint that everything should lie on the surface of an l2-ball, however, makes the problem non-convex, so you'll find it hard to solve exactly. In general, most formulations with repulsive as well as attractive forces are non-convex when the attractive forces don't outweigh the repulsive ones.
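A minimal sketch of that sequential update (illustrative code only; the function name attract_repel_update, the per-vector neighbour sets, and the choice of weighted means rather than sums are my own assumptions, not part of the answer above):

```python
import numpy as np

def attract_repel_update(X, neighbors, lam=1.0, gamma=0.1, n_iters=50):
    """Pull each vector toward its neighbours, push it away from its
    non-neighbours, and renormalise onto the unit sphere after each update.

    X         : (n, d) array of unit-norm vectors (updated in place).
    neighbors : list of sets; neighbors[i] holds the indices related to i.
    """
    n = X.shape[0]
    all_idx = set(range(n))
    for _ in range(n_iters):
        for i in range(n):
            nbrs = list(neighbors[i])
            others = list(all_idx - neighbors[i] - {i})
            update = np.zeros(X.shape[1])
            if nbrs:
                update += lam * X[nbrs].mean(axis=0)      # attractive term
            if others:
                update -= gamma * X[others].mean(axis=0)  # repulsive term
            norm = np.linalg.norm(update)
            if norm > 0:
                X[i] = update / norm                      # project back onto the unit sphere
    return X
```

Since there are typically far more non-neighbours than neighbours, using their mean (rather than their sum) and a small gamma keeps the repulsive term from dominating the attractive one.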

answered Mar 17 '12 at 08:40

Alexandre Passos ♦

edited Mar 17 '12 at 08:41

@Alexandre, using a repulsive force is a clever idea indeed, but I don't understand your suggestion. Perhaps you can clarify?

You suggest updating vectors with the sum of lambda times the mean of its neighbors and gamma times the mean of its "negated" neighbors. What do you mean by "negated" neighbors? And why would a function based only on the distances/cosines to a vector's neighbors contribute to pulling vectors apart? Wouldn't that just be a scaling of the vectors?

I would also be interested in any references you might have to formulations of repulsive forces in vector spaces.

(Apr 03 '12 at 04:07) muffin

Write down the following optimization problem: minimize sum_i ( lambda (sum_n ||x_i - x_n||^2) - gamma (sum_o ||x_i - x_o||^2) ) subject to ||x_i|| = 1, where n ranges over the neighbors of x_i and o over its non-neighbors. Now imagine all the relationships are symmetric, take the gradient w.r.t. x_i and set it to zero. You get x_i = alpha sum_n x_n - beta sum_o x_o. Hence it is the sum of its neighbors plus the sum of its negated non-neighbors (or minus the sum of its non-neighbors). I was suggesting that you initialize randomly and then do these updates followed by renormalizing.
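Spelling that step out (my notation: N_i and O_i are the neighbor and non-neighbor sets of x_i, the other vectors are held fixed, symmetry of the relationships only rescales the constants, and the Lagrange multiplier for the norm constraint only changes the scalar in front of x_i):

$$
\nabla_{x_i} = 2\lambda \sum_{n \in N_i} (x_i - x_n) \;-\; 2\gamma \sum_{o \in O_i} (x_i - x_o) = 0
\;\Longrightarrow\;
\bigl(\lambda |N_i| - \gamma |O_i|\bigr)\, x_i = \lambda \sum_{n \in N_i} x_n \;-\; \gamma \sum_{o \in O_i} x_o .
$$

Because the renormalization onto the unit sphere absorbs any scalar factor, the fixed-point update is x_i proportional to lambda sum_n x_n - gamma sum_o x_o, i.e. the alpha/beta form above.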

(Apr 03 '12 at 07:21) Alexandre Passos ♦

You could look into Neighbourhood Components Analysis (NCA), which tries to find a transformation that optimises K-nearest-neighbour (K-NN) performance in the resulting space. The authors propose a linear transformation, but technically you can use any transformation as long as it's differentiable.

The original formulation requires labeled data: the optimisation tries to bring members of the same class closer together, so that they are more likely to be 'chosen' when doing K-NN classification. However, for their second set of experiments, they also provide a formulation which works when you have pairs of examples that should be close together.

The 'repulsive force' Alexandre Passos talks about is also implicitly present here: by attempting to optimise K-NN performance, similar examples will be pulled together, but dissimilar examples will also be pushed further apart (as they would negatively affect performance).
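A rough sketch of such a pair-based, NCA-style objective for a linear map (my own naming and formulation, not code from the paper); you could maximise it with any gradient-based or black-box optimiser, e.g. by handing the negated objective to scipy.optimize.minimize:

```python
import numpy as np

def nca_objective(A_flat, X, pairs, d_out):
    """Soft-neighbour (NCA-style) objective: the expected number of points
    that pick one of their designated partners as stochastic nearest
    neighbour in the transformed space. Higher is better.

    A_flat : flattened (d_out * d_in) linear transformation.
    X      : (n, d_in) data matrix.
    pairs  : list of sets; pairs[i] are the indices that should stay close to i.
    """
    n, d_in = X.shape
    A = A_flat.reshape(d_out, d_in)
    Y = X @ A.T                                    # transformed points
    sq_dists = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(sq_dists, np.inf)             # a point never picks itself
    P = np.exp(-sq_dists)
    P /= P.sum(axis=1, keepdims=True)              # p_ij = softmax over negated distances
    return sum(P[i, list(pairs[i])].sum() for i in range(n) if pairs[i])
```

The attraction/repulsion trade-off is implicit here: raising p_ij for a designated partner j necessarily lowers it for every other point, because the probabilities in each row sum to one.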

answered Apr 03 '12 at 05:42

Sander Dieleman

I just stumbled upon this survey paper from 2006 about distance metric learning: Distance Metric Learning: A Comprehensive Survey

Sections 2 and 3 are the most relevant for this problem.

(Apr 07 '12 at 10:33) Sander Dieleman