In summary, nominal variables are used to “name,” or label, a series of values. Ordinal scales provide good information about the order of choices, such as in a customer satisfaction survey. Interval scales give us the order of values plus the ability to quantify the difference between each one. Finally, ratio scales give us the most information: order, interval values, and the ability to calculate ratios, since a “true zero” can be defined.
https://www.mymarketresearchmethods.com/types-of-data-nominal-ordinal-interval-ratio/
Now that we know about
People realized that the very successful Boosting method was in essence a very general meta-algorithm for optimizing the mapping function from input variables to output target variables.
This algorithm chooses multiple weak functions that are combined together, just as the decision trees in a Random Forest are combined into an ensemble.
This idea can then be generalized so that each new weak learner is explicitly treated as a function that points in the direction of the negative gradient of the loss, evaluated at the current combined function.
Given some tree-based ensemble model then, represented as a function
In practice we need to be satisfied with merely approaching this perfect update by following a functional gradient descent approach, where at each step we use an approximation of the true negative gradient of the loss function (for a squared-error loss this is simply the residual).
In our case this approximation is built from the wrong answers (i.e. the residuals) left over by the weak learner decision trees fitted so far.
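The functional gradient descent step described above can be written out as a short worked derivation. The notation is an assumption introduced here for illustration and does not come from the text: $F_{m-1}$ is the current combined function, $h_m$ the new weak learner, $L$ the loss, $\nu$ a step size, and $(x_i, y_i)$ the training examples.

\[
r_{i,m} = -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{m-1}},
\qquad
F_m(x) = F_{m-1}(x) + \nu \, h_m(x),
\]

where $h_m$ is fitted to the values $r_{i,m}$. For the squared-error loss $L(y, F) = \tfrac{1}{2}(y - F)^2$ the negative gradient reduces to

\[
r_{i,m} = y_i - F_{m-1}(x_i),
\]

which is exactly the ordinary residual of the current ensemble.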
Gradient Tree Boosting explicitly uses the gradient of the loss function at each step to fit a new tree, and then adds it to the ensemble.
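The loop of fitting a tree to the gradient and adding it to the ensemble can be sketched in a few lines. This is a minimal illustration, not the implementation used by any particular library: it assumes a squared-error loss (so the negative gradient is just the residual), one-dimensional inputs, and depth-1 trees (stumps). The names `fit_stump`, `gradient_boost`, `mse`, `n_trees` and `learning_rate` are hypothetical and chosen only for this example.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree (a stump) to the residuals by least squares."""
    best = None
    candidates = sorted(set(xs))
    # Try a split halfway between each pair of neighbouring distinct inputs.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    if best is None:
        # Only one distinct input: fall back to a constant prediction.
        constant = mean(residuals)
        return lambda x: constant
    _, threshold, left_value, right_value = best
    return lambda x: left_value if x <= threshold else right_value


def gradient_boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Build an additive ensemble of stumps, assuming squared-error loss."""
    base = mean(ys)  # initial constant model
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        # Negative gradient of 0.5 * (y - F)^2 with respect to F is the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Add the new weak learner to the ensemble, scaled by the step size.
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]

    def model(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return model


def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return mean([(y - model(x)) ** 2 for x, y in zip(xs, ys)])


# Toy data (made up for this sketch): a simple quadratic curve.
xs = [i / 10 for i in range(20)]
ys = [x ** 2 for x in xs]

for n in (0, 10, 50):
    boosted = gradient_boost(xs, ys, n_trees=n)
    print(n, "trees, training MSE:", mse(boosted, xs, ys))
```

The small `learning_rate` is a deliberate choice in this sketch: each stump only corrects a fraction of the remaining error, so many weak learners share the work instead of the first few dominating the ensemble.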
The weighting of each tree can also be optimized further, and various regularization methods can be applied.
The popular algorithm XGBoost\cite{xgboost} implements this approach.