(Random Forest)
Random forest is an ensemble learning method for classification that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes of the individual trees. In random forests, each tree in the ensemble is built from a sample drawn with replacement from the training set. In addition, when splitting a node during the construction of the tree, the chosen split is no longer the best split among all features; instead, it is the best split among a random subset of the features. As a result of this randomness, the bias of the forest usually increases slightly (with respect to the bias of a single non-random tree), but, due to averaging, its variance decreases, usually more than compensating for the increase in bias and hence yielding an overall better model. The preset model information and parameters are described below; a code sketch instantiating these presets follows the list.
• n_estimators: 600
The number of trees in the forest.
• criterion: gini
The function to measure the quality of a split. Supported criteria are “gini” for the Gini impurity and “entropy” for the information gain.
• max_depth: 10
The maximum depth of the tree.
• min_samples_split: 10
The minimum number of samples required to split an internal node. An integer value is taken as the minimum number of samples directly.
• min_samples_leaf: 1
The minimum number of samples required to be at a leaf node. A split point at any depth will only be considered if it leaves at least min_samples_leaf training samples in each of the left and right branches. This may have the effect of smoothing the model, especially in regression.
• min_weight_fraction_leaf: 0
The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node.
• max_features: auto
The number of features to consider when looking for the best split. With "auto", max_features=sqrt(n_features).
• oob_score: True
Whether to use out-of-bag samples to estimate the generalization accuracy.
• random_state: None
When set to None, the random number generator is the RandomState instance used by np.random.
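As a rough illustration, here is a minimal scikit-learn sketch of the preset above. The make_classification dataset is a stand-in for whatever data the tool actually receives, and max_features="sqrt" is used because recent scikit-learn versions spell the old "auto" option (sqrt(n_features) for classifiers) that way.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-in dataset for illustration only.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

clf = RandomForestClassifier(
    n_estimators=600,             # number of trees in the forest
    criterion="gini",             # split quality measured by Gini impurity
    max_depth=10,                 # maximum depth of each tree
    min_samples_split=10,         # minimum samples to split an internal node
    min_samples_leaf=1,           # minimum samples required at a leaf node
    min_weight_fraction_leaf=0.0,
    max_features="sqrt",          # "auto" == sqrt(n_features) for classifiers
    oob_score=True,               # estimate generalization accuracy out-of-bag
    random_state=None,            # fall back to the np.random RandomState
)
clf.fit(X, y)

# With oob_score=True, the out-of-bag accuracy estimate is available after fitting.
print(clf.oob_score_)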
(AdaBoost)
AdaBoost, short for Adaptive Boosting, is a machine learning algorithm formulated by Yoav Freund and Robert Schapire. It is a meta-algorithm and can be used in conjunction with many other learning algorithms to improve their performance. AdaBoost is adaptive in the sense that subsequent classifiers are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers; otherwise, it is less susceptible to the overfitting problem than most learning algorithms. The preset model information and parameters are described below; a code sketch instantiating these presets follows the list.
• n_estimators: 100
The maximum number of estimators at which boosting is terminated. In case of perfect fit, the learning procedure is stopped early.
• base_estimator: DecisionTreeClassifier(max_depth=1)
The base estimator from which the boosted ensemble is built.
• learning_rate: 1
Learning rate shrinks the contribution of each classifier by learning_rate.
• algorithm: SAMME.R
Use the SAMME.R real boosting algorithm, which relies on class probability estimates from the base estimator and typically converges faster than the discrete SAMME variant.
• random_state: None
When set to None, the random number generator is the RandomState instance used by np.random.
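As with the random forest preset, here is a minimal scikit-learn sketch of this configuration. The dataset is again a stand-in for illustration only; note that newer scikit-learn releases rename base_estimator to estimator and deprecate "SAMME.R" in favor of "SAMME".

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset for illustration only.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

clf = AdaBoostClassifier(
    base_estimator=DecisionTreeClassifier(max_depth=1),  # decision stumps
    n_estimators=100,       # boosting stops after at most 100 estimators
    learning_rate=1.0,      # no shrinkage of each classifier's contribution
    algorithm="SAMME.R",    # real boosting on class probability estimates
    random_state=None,      # fall back to the np.random RandomState
)
clf.fit(X, y)
print(clf.score(X, y))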