|
The answer to this problem may be simpler than I think, but here goes. Suppose you have a data set with S students and Q questions for learning IRT parameters. Not every student answers every question (this is important). I need to partition the students into two groups such that the number of questions attempted by students in both groups is as large as possible. How do you do this? I am thinking this may be a graph optimization problem. For example, I can model it as a bipartite graph with Q and S as the two sets of nodes and an edge between a student and a question if the student attempted that question. Then S would be partitioned into S1 and S2 such that the flow from S1 to S2 through Q is maximum. Is that too complicated? Maybe a simpler way exists. Thoughts? |
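
To make the objective concrete, here is a minimal Python sketch. It rests on one assumption that the question does not state outright: a question counts as "attempted in both groups" if at least one student in S1 and at least one student in S2 attempted it. The names `attempts`, `shared_questions`, `best_split_bruteforce`, and `local_search_split` are hypothetical and only for illustration. The brute-force search tries every split and is only feasible for a handful of students; the local search is a simple hill-climbing heuristic with no optimality guarantee.

```python
import random
from itertools import combinations


def shared_questions(attempts, group1, group2):
    """Questions attempted by at least one student in each group (assumed objective)."""
    q1 = set().union(*(attempts[s] for s in group1))
    q2 = set().union(*(attempts[s] for s in group2))
    return q1 & q2


def best_split_bruteforce(attempts):
    """Try every split into two non-empty groups. Exponential in the number of students."""
    students = sorted(attempts)
    first, rest = students[0], students[1:]
    best = (-1, None, None)
    # Pin the first student to group 1 so mirrored splits are not counted twice.
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            g1 = {first, *extra}
            g2 = set(students) - g1
            if not g2:
                continue
            score = len(shared_questions(attempts, g1, g2))
            if score > best[0]:
                best = (score, g1, g2)
    return best


def local_search_split(attempts, seed=0, max_passes=100):
    """Hill climbing: start from a random split, move one student at a time if it helps."""
    rng = random.Random(seed)
    students = sorted(attempts)
    side = {s: rng.random() < 0.5 for s in students}

    def score():
        g1 = [s for s in students if side[s]]
        g2 = [s for s in students if not side[s]]
        return len(shared_questions(attempts, g1, g2))

    best = score()
    for _ in range(max_passes):
        improved = False
        for s in students:
            side[s] = not side[s]          # tentatively move student s
            new = score()
            if new > best:
                best, improved = new, True
            else:
                side[s] = not side[s]      # undo the move
        if not improved:
            break
    g1 = {s for s in students if side[s]}
    return best, g1, set(students) - g1


# Toy data (made up): student -> set of questions attempted.
attempts = {
    "s1": {"q1", "q2"},
    "s2": {"q2", "q3"},
    "s3": {"q1", "q3"},
    "s4": {"q4"},
}

score, g1, g2 = best_split_bruteforce(attempts)
h_score, h1, h2 = local_search_split(attempts)
print(score, sorted(g1), sorted(g2))
print(h_score, sorted(h1), sorted(h2))
```

In the toy data each of q1, q2, q3 is attempted by exactly two students, so splitting a question is the same as cutting an edge between those two students. Under the assumption above, that special case is the max-cut problem, which suggests the general problem is hard to solve exactly and that a heuristic such as the local search is a reasonable starting point for realistic S and Q.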