CS 224W Project Proposal
Notes and Thoughts:
* Modelling contagions as conditionally independent loses some information. Instead, why not keep two sets of contagions: one like a max heap holding the contagions that have had the most influence over you, and another modelling the recency effect, where contagions you have interacted with most recently carry a bigger weight.
* A highly infectious URL can help a less infectious but highly correlated URL go viral, while it can suppress a slightly less infectious but unrelated URL.
* Consider the probability of a user adopting a piece of content based on what they have seen before.
* Estimate the probability of a user being infected by a contagion given the sequence of contagions she has been exposed to so far.
* Can we improve the model by modelling the dependence between the contagions? We are saying X is conditioned on Y1 to Yk, but a time step ago Y1 was X. Precomputing or some sort of sampling could help us find the joint probability distribution rather than assuming independence. The independence assumption seems to be a standard bit of hand-waving.
* The contagion interaction function models the effect contagion uj has on our current contagion ui when the user was exposed to uj k exposures before ui.
* One way to model this is with k different W x W matrices, where rows represent the possible ui's and columns the uj's.
* So the best way to do this is to identify classes or clusters to which each contagion belongs and model the interactions between these classes.
* One way to set up these classes is via a membership matrix M of size W x T, where W is the number of contagions and T is the number of clusters (classes); M(i,j) is the probability that contagion ui is a member of cluster cj.
This new matrix gives rise to a new interaction function. Instead of modelling the interaction between two contagions, we now model the
interaction between two clusters, scaled by the membership probabilities.
* Therefore, to find the interaction between any two contagions ui and uj: for each cluster t, take the probability that ui belongs to t; for each cluster s, take the probability that uj belongs to s; then, inside a double summation over t and s, multiply these two probabilities by the interaction function for clusters s and t.
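The double summation over clusters can be sketched in a few lines of Python. This is only an illustration of the idea in these notes, not the paper's implementation: `membership` stands for the W x T membership matrix, `cluster_interaction` is an assumed T x T table of cluster-to-cluster effects for one fixed exposure offset k, the index order `[t][s]` is a guess, and every number is made up.

```python
def contagion_interaction(i, j, membership, cluster_interaction):
    """Interaction between contagions u_i and u_j, routed through clusters.

    Computes sum_t sum_s M[i][t] * M[j][s] * cluster_interaction[t][s].
    """
    num_clusters = len(cluster_interaction)
    total = 0.0
    for t in range(num_clusters):
        for s in range(num_clusters):
            total += membership[i][t] * membership[j][s] * cluster_interaction[t][s]
    return total


# Hypothetical example: W = 3 contagions, T = 2 clusters.
membership = [
    [1.0, 0.0],  # u_0 belongs entirely to cluster 0
    [0.0, 1.0],  # u_1 belongs entirely to cluster 1
    [0.5, 0.5],  # u_2 is split evenly between both clusters
]
cluster_interaction = [
    [0.2, -0.1],
    [-0.3, 0.4],
]

# Hard memberships reduce to a lookup in the cluster table.
print(contagion_interaction(0, 1, membership, cluster_interaction))
# Soft memberships blend all four cluster entries.
print(contagion_interaction(2, 2, membership, cluster_interaction))
```

The point of the design is the parameter count: the cluster table has T x T entries per offset instead of W x W, and the membership matrix adds only W x T.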
Didn’t understand how their clustering matrix is low rank.
95% for training and only 5% for testing? Isn’t that kinda bad? You could overfit the model, and since your test and cross-validation sets are so small, the overfitting will likely not show up.
i can belong to
A and B the number of overlapping nodes is
Adjacent cliques are ones which share at least k-1 nodes.
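A small sketch of the adjacency test, assuming both cliques are k-cliques given as collections of node ids (the notes imply this but do not state it). The function names are hypothetical.

```python
def overlap_size(clique_a, clique_b):
    """Number of nodes the two cliques have in common."""
    return len(set(clique_a) & set(clique_b))


def are_adjacent(clique_a, clique_b, k):
    """True if the two cliques share at least k - 1 nodes."""
    return overlap_size(clique_a, clique_b) >= k - 1


# Two triangles (k = 3) sharing the edge 2-3 are adjacent.
print(are_adjacent([1, 2, 3], [2, 3, 4], k=3))
# Two triangles sharing only node 3 are not.
print(are_adjacent([1, 2, 3], [3, 4, 5], k=3))
```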
Stickiness and Persistence: both involved in the adoption of a hashtag. Stickiness could be considered the prior probability of infection, while Persistence can be quantified by the marginal benefit that is achieved from repeated exposure to the same hashtag.
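One rough way to make the two quantities concrete, under my own assumptions rather than any definition from the reading: suppose we have the probability that a user has adopted a hashtag after 1, 2, 3, ... exposures. Reading stickiness as the first-exposure probability and persistence as the per-exposure gains is an interpretation of the note above, and the numbers are invented for illustration.

```python
def marginal_gains(adoption_by_exposures):
    """Extra adoption probability contributed by each additional exposure."""
    return [
        later - earlier
        for earlier, later in zip(adoption_by_exposures, adoption_by_exposures[1:])
    ]


# Made-up values: P(adopted) after 1, 2, 3 and 4 exposures.
adoption_by_exposures = [0.10, 0.15, 0.18, 0.19]

# Assumed reading: stickiness is the baseline chance of infection.
stickiness = adoption_by_exposures[0]
# Assumed reading: persistence is how much each repeat exposure still adds.
persistence = marginal_gains(adoption_by_exposures)

print(stickiness)
print(persistence)
```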
@twitter-username. This indicates X is paying attention to Y, something that is not obvious from the follows relationship.
For X, we call the set of nodes that X has an edge to (under the above specification) the neighbour set N(X) of X.
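A minimal sketch of building N(X) from @-mentions. The exact rule for when an edge exists did not survive in these notes, so `min_mentions` is a hypothetical threshold and the `(x, y)` pair format is an assumption.

```python
from collections import Counter, defaultdict


def neighbour_sets(mentions, min_mentions=1):
    """Map each user X to N(X), the set of users X has an edge to.

    mentions: iterable of (x, y) pairs, one per tweet in which x @-mentions y.
    min_mentions: hypothetical threshold for drawing an edge from x to y.
    """
    counts = Counter(mentions)
    neighbours = defaultdict(set)
    for (x, y), n in counts.items():
        if n >= min_mentions:
            neighbours[x].add(y)
    return dict(neighbours)


# Invented example data.
mentions = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "a")]
print(neighbour_sets(mentions))
print(neighbour_sets(mentions, min_mentions=2))
```

Edges are directed: "a" mentioning "b" puts "b" in N(a) but says nothing about N(b).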
CS 229 Project Readings