Ensemble of anomaly scores

Time series data are diverse, and no single detector is guaranteed to generalize well across all of them. The ensemble functions help you combine anomaly scores from different models.

Anomaly scores from different detectors can fall in different ranges. Each detector's scores should therefore be passed through a calibrator, which normalizes them into [0, 1], before they reach the ensemble module.
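
To make the range problem concrete, here is a minimal, library-independent sketch. `RunningMinMax` is a hypothetical helper written only for this illustration; it is not StreamAD's calibrator and does not reproduce how `ZScoreCalibrator` works. It only shows two detectors with very different raw ranges being brought onto a common [0, 1] scale.

```python
class RunningMinMax:
    """Hypothetical helper: rescale scores into [0, 1] using the min and max seen so far."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def normalize(self, score):
        # Update the running bounds with the new score.
        self.lo = score if self.lo is None else min(self.lo, score)
        self.hi = score if self.hi is None else max(self.hi, score)
        if self.hi == self.lo:
            # Only one distinct value seen so far, so there is no range to scale by.
            return 0.0
        return (score - self.lo) / (self.hi - self.lo)


raw_a = [0.1, 0.2, 0.9]          # made-up detector A scores, roughly within [0, 1]
raw_b = [120.0, 150.0, 480.0]    # made-up detector B scores, in the hundreds

norm_a, norm_b = RunningMinMax(), RunningMinMax()
calibrated = [
    (norm_a.normalize(a), norm_b.normalize(b)) for a, b in zip(raw_a, raw_b)
]
```

Without a step like this, detector B's raw scores would dominate any combination of the two simply because they are numerically larger.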

Weight Ensemble

This ensemble method combines the anomaly scores as a weighted sum. It is initialized with a list of weights, one per model, and computes the weighted anomaly score from them. The scores passed to the ensemble must be in the same order as the weights.

from streamad.util import StreamGenerator, UnivariateDS, plot
from streamad.model import KNNDetector, SpotDetector
from streamad.process import ZScoreCalibrator, WeightEnsemble


ds = UnivariateDS()
stream = StreamGenerator(ds.data)
knn_detector = KNNDetector()
spot_detector = SpotDetector()
knn_calibrator = ZScoreCalibrator(sigma=3)
spot_calibrator = ZScoreCalibrator(sigma=3)
ensemble = WeightEnsemble(ensemble_weights=[0.6, 0.4])

scores = []

for x in stream.iter_item():
    # We first score the anomalies using different detectors
    knn_score = knn_detector.fit_score(x)
    spot_score = spot_detector.fit_score(x)

    # Then we calibrate the scores into the normalized range [0, 1]
    knn_normalized_score = knn_calibrator.normalize(knn_score)
    spot_normalized_score = spot_calibrator.normalize(spot_score)

    # Finally we ensemble the scores
    score = ensemble.ensemble([knn_normalized_score, spot_normalized_score])
    scores.append(score)

data, label, date, features = ds.data, ds.label, ds.date, ds.features
plot(data=data, scores=scores, date=date, features=features, label=label)
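
The weighting step used in the example above can be sketched in plain Python. This is an illustration of the idea under simple assumptions, not the library's implementation: `weighted_score` is a hypothetical function, and the scores are made-up values that are assumed to be calibrated into [0, 1] already.

```python
def weighted_score(scores, weights):
    """Hypothetical helper: weighted sum of calibrated scores, paired by position."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


weights = [0.6, 0.4]  # first weight for the first detector, second for the second

# Scores given in the same order as the weights: 0.6 * 0.9 + 0.4 * 0.2
combined = weighted_score([0.9, 0.2], weights)

# The same two scores in the wrong order: 0.6 * 0.2 + 0.4 * 0.9
swapped = weighted_score([0.2, 0.9], weights)
```

Here `combined` is roughly 0.62 and `swapped` roughly 0.48, which is why the order of the scores has to match the order of the weights.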

Vote Ensemble

This ensemble method is based on voting. It takes a threshold that decides, for each detector, whether that detector's score counts as an anomaly, and it reports an overall anomaly according to the majority of those votes.

from streamad.util import StreamGenerator, UnivariateDS, plot
from streamad.model import KNNDetector, SpotDetector
from streamad.process import ZScoreCalibrator, VoteEnsemble


ds = UnivariateDS()
stream = StreamGenerator(ds.data)
knn_detector = KNNDetector()
spot_detector = SpotDetector()
knn_calibrator = ZScoreCalibrator(sigma=2)
spot_calibrator = ZScoreCalibrator(sigma=2)
ensemble = VoteEnsemble(threshold=0.8)

scores = []

for x in stream.iter_item():
    # We first score the anomalies using different detectors
    knn_score = knn_detector.fit_score(x)
    spot_score = spot_detector.fit_score(x)

    # Then we calibrate the scores into the normalized range [0, 1]
    knn_normalized_score = knn_calibrator.normalize(knn_score)
    spot_normalized_score = spot_calibrator.normalize(spot_score)

    # Finally we ensemble the scores
    score = ensemble.ensemble([knn_normalized_score, spot_normalized_score])
    scores.append(score)

data, label, date, features = ds.data, ds.label, ds.date, ds.features
plot(data=data, scores=scores, date=date, features=features, label=label)
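
The voting rule described in this section can be sketched in plain Python. This is a rough illustration, not the library's implementation. `majority_vote` is a hypothetical function, and several details are assumptions made for the sketch: a detector votes "anomaly" when its score is strictly above the threshold, a strict majority is required, and the result is returned as 1.0 or 0.0.

```python
def majority_vote(scores, threshold):
    """Hypothetical helper: 1.0 if more than half of the scores exceed the threshold."""
    # Assumption: a score strictly greater than the threshold counts as one vote.
    votes = sum(1 for s in scores if s > threshold)
    # Assumption: a strict majority of detectors is needed to report an anomaly.
    return 1.0 if votes > len(scores) / 2 else 0.0


both_agree = majority_vote([0.9, 0.85], threshold=0.8)         # 2 of 2 votes
only_one = majority_vote([0.9, 0.1], threshold=0.8)            # 1 of 2 votes
two_of_three = majority_vote([0.9, 0.85, 0.1], threshold=0.8)  # 2 of 3 votes
```

Under these assumptions, two detectors must both vote for an anomaly before one is reported, because a single vote out of two is not a strict majority. An odd number of detectors avoids ties.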