Hi - I have a directed, unweighted graph with roughly 10M nodes and maybe 10x that many edges. Each node can have one of many tags, and I know that a node's "location" within the graph is an important predictor of its tag. I'd like to find a set of features that describes each node's "location", but it is unclear to me how to do this. I thought that, e.g., the minimal distance to each of a fixed set of well-distributed nodes might work. Is there a common set of features for problems of this nature?
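To make the landmark idea concrete, here is a minimal Python sketch of what I have in mind, not a tuned implementation. It assumes nodes are numbered 0..n-1 and edges arrive as (source, target) pairs; the names `bfs_distances` and `landmark_features` are made up for illustration. Distances are hop counts from each landmark along the edge direction, and unreachable nodes get -1.

```python
from collections import deque


def bfs_distances(adj, source, n):
    """Hop counts from `source` to every node along directed edges (-1 if unreachable)."""
    dist = [-1] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == -1:  # first visit is the shortest path in an unweighted graph
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def landmark_features(edges, n, landmarks):
    """Return one feature row per node: its distance from each landmark."""
    # Build forward adjacency lists (assumed input: iterable of (source, target) pairs).
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    # One BFS per landmark gives one feature column; transpose to rows per node.
    columns = [bfs_distances(adj, s, n) for s in landmarks]
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    toy_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
    for node, row in enumerate(landmark_features(toy_edges, 5, [0, 2])):
        print(node, row)
```

Because the graph is directed, distance from a landmark and distance to a landmark differ; running the same BFS on the reversed edge list would give the second set of columns. Each BFS is linear in nodes plus edges, so the cost grows with the number of landmarks, and at 10M nodes a compact array representation would probably be needed instead of Python lists.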
Are the tags assigned independently of one another, or are they related to each other? E.g., Tag1 and Tag2 might be related topics.
Perhaps you can describe the problem in more detail, and then I can propose some heuristics or shortcuts for getting what you want.
Also, are the tags and edges complementary, i.e., is neither one sufficient to model the other? The alternative would be that you can predict the tags from the edges, and vice versa.
This is an interesting problem, and I can propose a solution if you elaborate. It would also help to know whether you want a rigorous solution that might be difficult to implement, or an ad hoc one that is fast to implement.