Data distribution-dependent
Data distribution-independent
Assumes only realizability
By assuming realizability, we only look at bounds on the generalization error rather than the generalization gap
Stronger assumptions/dependence on data distribution
Data-dependent
Data-independent
Data-dependent
Data-independent
Data-dependent
Data-independent
Bounds on generalization gap
Bounds on the generalization error
Bound on generalization gap
Bound on generalization error
Not possible to have non-trivial bounds due to No Free Lunch theorem
Realizable data-dependent bounds
Optimality of the bound depends on assumptions on the data distribution.
Realizable VC dimension bound is optimal
Sometimes vacuous
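For concreteness, the realizable VC dimension bound referred to here has the following standard form (a sketch in my own notation: $d$ is the VC dimension of the hypothesis class $\mathcal{H}$, $m$ the number of i.i.d. samples, $c$ a universal constant):

```latex
% With probability at least 1-\delta over the draw of the sample,
% every h \in \mathcal{H} consistent with the m training examples satisfies
\operatorname{err}(h) \;\le\; \frac{c}{m}\left(d \log\frac{2em}{d} + \log\frac{1}{\delta}\right).
```

Because realizability is assumed, this bounds the generalization error directly rather than the gap, and the rate is $\tilde{O}(d/m)$.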
Bounds on generalization gap
Bounds on generalization error
Bounds on generalization gap
Bounds on generalization error
Data-dependent Bound on generalization gap of function found by algorithm
Data-dependent uniform convergence generalization gap bound
Is the Rademacher complexity bound sometimes optimal?
For some data distributions
Do they fail for some data distributions?
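The standard data-dependent uniform convergence bound via the empirical Rademacher complexity $\hat{\mathfrak{R}}_S$ looks like this (sketched for a class $\mathcal{F}$ with losses in $[0,1]$; notation mine):

```latex
% With probability at least 1-\delta over the sample S of size m,
% simultaneously for all f \in \mathcal{F}:
\operatorname{err}(f) \;\le\; \widehat{\operatorname{err}}_S(f)
  + 2\,\hat{\mathfrak{R}}_S(\mathcal{F})
  + 3\sqrt{\frac{\log(2/\delta)}{2m}}.
```

It is data-dependent because $\hat{\mathfrak{R}}_S$ is computed on the observed sample, but it holds uniformly over the whole class, so it is blind to which function the algorithm actually returns.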
Data-dependent Non-uniform (SRM-like) generalization gap bound
For some data distributions
Do they fail for some data distributions?
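A sketch of the SRM-style non-uniform bound: fix a countable hierarchy $\mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \dots$ and weights $w_k > 0$ with $\sum_k w_k \le 1$; a union bound over $k$ then gives (notation mine):

```latex
% With probability at least 1-\delta, for every k and every f \in \mathcal{F}_k:
\operatorname{err}(f) \;\le\; \widehat{\operatorname{err}}_S(f)
  + 2\,\hat{\mathfrak{R}}_S(\mathcal{F}_k)
  + 3\sqrt{\frac{\log\!\big(2/(w_k\,\delta)\big)}{2m}}.
```

The bound is non-uniform because the penalty depends on the smallest class $\mathcal{F}_k$ containing $f$, so simpler functions pay a smaller complexity term.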
Assuming realizability (possible for fully expressive models, e.g. some non-parametric models)
Realizable data-dependent bounds
Agnostic VC dimension bound is optimal
Sometimes vacuous
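For comparison, the agnostic VC dimension bound on the generalization gap has the standard form below ($d$, $m$, $c$ as before; notation mine):

```latex
% With probability at least 1-\delta, simultaneously for all h \in \mathcal{H}:
\left|\operatorname{err}(h) - \widehat{\operatorname{err}}_S(h)\right|
  \;\le\; c\,\sqrt{\frac{d + \log(1/\delta)}{m}}.
```

Unlike the realizable case, the agnostic rate decays as $1/\sqrt{m}$ rather than $1/m$.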
Data-dependent Bound on generalization gap of function found by algorithm with assumptions on data distribution
Data-dependent uniform convergence generalization gap bound with assumptions on data distribution
Is the Rademacher complexity bound sometimes optimal?
For some data distributions
Do they fail for some data distributions?
Data-dependent Non-uniform (SRM-like) generalization gap bound with assumptions on data distribution
For some data distributions
Do they fail for some data distributions?
Assuming realizability
Realizable data-dependent bounds with extra assumptions on data distribution
Always possible in principle, but do particular approaches fail for some distributions?
Other assumptions
Bound on generalization gap of function found by algorithm
Uniform convergence generalization gap bound
Is the Rademacher complexity bound sometimes optimal?
For some data distributions
For some data distributions (Nagarajan et al. https://papers.nips.cc/paper/9336-uniform-convergence-may-be-unable-to-explain-generalization-in-deep-learning)
Non-uniform (SRM-like) generalization gap bound
For some data distributions
For some data distributions (Nagarajan et al. https://papers.nips.cc/paper/9336-uniform-convergence-may-be-unable-to-explain-generalization-in-deep-learning)
Assuming realizability
Realizable data-independent bounds with extra assumptions on data distribution
Always possible in principle, but do particular approaches fail for some distributions?
The only examples I can think of right now are some realizable margin bounds I’ve seen.
Other assumptions