xgbse._meta.XGBSEBootstrapEstimator
Bootstrap meta-estimator for XGBSE models:
- allows confidence interval estimation for XGBSEDebiasedBCE and XGBSEStackedWeibull
- provides variance stabilization for all models, especially XGBSEKaplanTree
Performs a simple bootstrap with sample size equal to the training set size.
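The resampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: it uses NumPy's random generator for the draws, and `bootstrap_indices` and the toy column-mean "model" are hypothetical stand-ins for the actual resampling function and base estimator.

```python
import numpy as np


def bootstrap_indices(n_samples, n_estimators=10, random_state=42):
    """Yield one array of row indices per bootstrap replicate.

    Each replicate draws n_samples indices with replacement, so every
    resampled set has the same size as the training set.
    """
    rng = np.random.default_rng(random_state)
    for _ in range(n_estimators):
        yield rng.integers(0, n_samples, size=n_samples)


# Toy "training set": 8 rows, 2 features.
X = np.arange(16, dtype=float).reshape(8, 2)

# Stand-in for fitting a base estimator (assumption): each "model" is
# just the column means of its resampled data.
replicates = [X[idx].mean(axis=0) for idx in bootstrap_indices(len(X))]

# Averaging over replicates is what stabilizes the variance.
ensemble_mean = np.mean(replicates, axis=0)
```

Because rows are drawn with replacement, each replicate typically omits some training rows and repeats others, which is what makes the fitted estimators differ.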
__init__(self, base_estimator, n_estimators=10, random_state=42)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
base_estimator | XGBSEBaseEstimator | Base estimator for bootstrap procedure | required |
n_estimators | int | Number of estimators to fit in bootstrap procedure | 10 |
random_state | int | Random state for resampling function | 42 |
fit(self, X, y, **kwargs)
Fit several (base) estimators and store them.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | [pd.DataFrame, np.array] | Features to be used while fitting XGBoost model | required |
y | structured array(numpy.bool_, numpy.number) | Binary event indicator as first field, and time of event or time of censoring as second field. | required |
**kwargs | | Keyword arguments to be passed to the .fit() method of base_estimator | {} |
Returns:
Type | Description |
---|---|
XGBSEBootstrapEstimator | Trained instance of XGBSEBootstrapEstimator |
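The `y` argument described above is a NumPy structured array with the event indicator first and the time second. A minimal sketch of building one by hand is shown below; the toy data and the field names `event` and `time` are assumptions for illustration, since the parameter description only fixes the field order and dtypes.

```python
import numpy as np

# Hypothetical toy data: event flags and observed times for five subjects.
# False means the subject was censored at the given time.
events = [True, False, True, True, False]
times = [5.0, 12.0, 3.5, 8.0, 20.0]

# Field names are illustrative; only the order (event, then time) and
# the dtypes (boolean, numeric) come from the description above.
y = np.array(
    list(zip(events, times)),
    dtype=[("event", np.bool_), ("time", np.float64)],
)
```

Each record pairs one subject's event flag with its event or censoring time, so `y` has one entry per row of `X`.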
predict(self, X, return_ci=False, ci_width=0.683, return_interval_probs=False)
Predicts survival as given by the base estimator. A survival function, along with the upper and lower bounds of its confidence interval, can be returned for each sample of the dataframe X.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
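X | pd.DataFrame | Data frame with samples to generate predictions | required |
return_ci | Bool | Whether to include confidence intervals | False |
ci_width | Float | Width of confidence interval | 0.683 |
return_interval_probs | Bool | Whether to return interval probabilities instead of cumulative survival probabilities | False |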
Returns:
Type | Description |
---|---|
([(pd.DataFrame, np.array, np.array), pd.DataFrame]) | preds_df: A dataframe of survival probabilities for all times (columns), from a time_bins array, for all samples of X (rows). If return_interval_probs is True, the interval probabilities are returned instead of the cumulative survival probabilities. upper_ci: Upper bound of the confidence interval for the survival probability values. lower_ci: Lower bound of the confidence interval for the survival probability values. |
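One common way to turn bootstrap predictions into an interval of a given width is to take symmetric quantiles across the fitted estimators. The sketch below illustrates that idea only; whether the library computes its bounds exactly this way is an assumption, and `percentile_interval` and the random toy curves are hypothetical. For reference, a width of 0.683 corresponds to roughly one standard deviation on either side of the mean for normally distributed values.

```python
import numpy as np


def percentile_interval(stacked_preds, ci_width=0.683):
    """Return (mean, upper, lower) taken across the estimator axis.

    stacked_preds has shape (n_estimators, n_samples, n_time_bins).
    """
    low_q = 0.5 - ci_width / 2   # 0.1585 for the default width
    high_q = 0.5 + ci_width / 2  # 0.8415 for the default width
    mean = stacked_preds.mean(axis=0)
    lower = np.quantile(stacked_preds, low_q, axis=0)
    upper = np.quantile(stacked_preds, high_q, axis=0)
    return mean, upper, lower


# Hypothetical survival curves: 10 estimators, 3 samples, 4 time bins.
# Sorting in descending order along the time axis makes each curve
# non-increasing, as a survival function should be.
rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 1.0, size=(10, 3, 4))
stacked = np.sort(raw, axis=2)[:, :, ::-1]

preds, upper_ci, lower_ci = percentile_interval(stacked)
```

The three outputs share the shape (n_samples, n_time_bins), mirroring the preds_df, upper_ci and lower_ci ordering listed in the return description.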