
Conversation

@Nithurshen (Contributor) commented Nov 16, 2025

Reference Issues/PRs

Fixes #2930

What does this implement/fix? Explain your changes.

This PR implements the Multiview Enhanced Characteristics (Mecha) Classifier in aeon.classification.feature_based, based on the recently accepted paper "Mecha: Multiview Enhanced Characteristics via Series Shuffling for Time Series Classification and Its Application to Turntable Circuit".

Mecha is an upgraded, feature-based Time Series Classification (TSC) algorithm that enhances diversity and expressiveness through three main components:

  1. Diverse Feature Extractor: Enhances global and local patterns via series shuffling mapping (bidirectional dilation and interleaving) on both the raw series and a Tracking Differentiator (TD) transformed series. The TD filter factor is optimized via Grey Wolf Optimizer (GWO), using Silhouette Score as the objective function. The underlying feature sets are extracted using Catch22.
  2. Ensemble Feature Selector: Adaptively selects stable and diverse feature subsets using a mechanism based on the ratio of feature stability and diversity scores, across different feature views (e.g., Mutual Information, F-statistic, and their intersection).
  3. Heterogeneous Ensemble Classifier: Integrates multiple views (derived from the adaptive selection) and classifiers (Ridge Regression with cross-validation and Extremely Randomized Trees) using hard voting for final prediction.
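To make the selection and voting scheme concrete, here is a minimal scikit-learn sketch of the "feature views + heterogeneous classifiers + hard voting" idea (an illustration only, using a synthetic dataset and placeholder k values, not the actual Mecha implementation):

# Illustrative sketch of multiview selection + hard voting (not the Mecha code).
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.linear_model import RidgeClassifierCV
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two feature "views": mutual information and F-statistic (k is a placeholder).
views = [SelectKBest(mutual_info_classif, k=20), SelectKBest(f_classif, k=20)]
# Two heterogeneous classifiers, as described above.
classifiers = [RidgeClassifierCV(), ExtraTreesClassifier(random_state=0)]

predictions = []
for view in views:
    X_tr = view.fit_transform(X_train, y_train)
    X_te = view.transform(X_test)
    for clf in classifiers:
        predictions.append(clone(clf).fit(X_tr, y_train).predict(X_te))

# Hard voting: take the most frequent predicted label for each test case.
stacked = np.vstack(predictions)  # shape: (n_models, n_test_cases)
y_pred = np.array([np.bincount(col).argmax() for col in stacked.T])
print("Ensemble accuracy:", np.mean(y_pred == y_test))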

This PR includes:

  • MechaClassifier in aeon.classification.feature_based.
  • Associated utility functions (series_transform, dilated_fres_extract, interleaved_fres_extract, hard_voting, _adaptive_saving_features, _gwo, _objective_function) in aeon.transformations.collection.feature_based._mecha_feature_extractor (plus the necessary imports in __init__.py).
  • Unit tests for the new classifier and its core utilities.

Mecha is a feature-based ensemble that builds on and significantly extends the concepts introduced in TD-MVDC.
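The intended usage is roughly as follows (the basic_extractor parameter name follows the review discussion below and may still change before merge):

import numpy as np
from aeon.classification.feature_based import MechaClassifier
from aeon.datasets import load_arrow_head

X_train, y_train = load_arrow_head(split="TRAIN")
X_test, y_test = load_arrow_head(split="TEST")

clf = MechaClassifier(random_state=0, basic_extractor="Catch22")
clf.fit(X_train, y_train)
print("Accuracy:", np.mean(clf.predict(X_test) == y_test))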

Does your contribution introduce a new dependency? If yes, which one?

No

Any other comments?

Reference:
[1] Changchun He, Xin Huo, Baohan Mi, and Songlin Chen, "Mecha: Multiview Enhanced Characteristics via Series Shuffling for Time Series Classification and Its Application to Turntable Circuit," IEEE Transactions on Circuits and Systems I: Regular Papers, 2025.

PR checklist

For all contributions
  • [] I've added myself to the list of contributors. Alternatively, you can use the @all-contributors bot to do this for you after the PR has been merged.
  • The PR title starts with either [ENH], [MNT], [DOC], [BUG], [REF], [DEP] or [GOV] indicating whether the PR topic is related to enhancement, maintenance, documentation, bugs, refactoring, deprecation or governance.
For new estimators and functions
  • I've added the estimator/function to the online API documentation.
  • (OPTIONAL) I've added myself as a __maintainer__ at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.
For developers with write access
  • (OPTIONAL) I've updated aeon's CODEOWNERS to receive notifications about future changes to these files.

@aeon-actions-bot added the enhancement (New feature, improvement request or other non-bug code enhancement) and transformations (Transformations package) labels on Nov 16, 2025
@aeon-actions-bot (Contributor)

Thank you for contributing to aeon

I have added the following labels to this PR based on the title: [ enhancement ].
I have added the following labels to this PR based on the changes made: [ transformations ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Regenerate expected results for testing
  • Push an empty commit to re-run CI checks

@Nithurshen (Contributor, Author) commented Nov 17, 2025

The CI pipeline is currently showing a failure (PR pytest / pytest (windows-2022, 3.10, true) (pull_request)) that is unrelated to my changes:
FAILED aeon/clustering/averaging/tests/test_kasba.py::test_kasba_distance_params[distance7] - TypeError: bad argument type for built-in operation
This looks like a regression failure in the KASBA Barycenter Averaging tests (soft_dtw distance).

@CCHe64 commented Nov 17, 2025

I will run some experiments to check whether the accuracy of your aeon implementation is at the same level as the accuracy reported in the original paper.

@CCHe64 commented Nov 17, 2025

I recommend making the basic feature extractor a configurable parameter, so users can freely switch between Catch22 and TSFresh. Both Catch22 and TSFresh can be used directly through the transformers already in aeon.
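Both extractors are already available as collection transformers in aeon, so the choice could simply be a constructor argument. A rough sketch of what I mean (the make_base_extractor helper is only illustrative, not code from this PR):

import numpy as np
from aeon.transformations.collection.feature_based import Catch22, TSFresh

def make_base_extractor(name="Catch22"):
    if name == "Catch22":
        return Catch22()
    if name == "TSFresh":
        # aeon's TSFresh wrapper (requires the tsfresh soft dependency);
        # default_fc_parameters selects "minimal", "efficient" or "comprehensive".
        return TSFresh(default_fc_parameters="efficient")
    raise ValueError(f"Unknown basic extractor: {name}")

X = np.random.default_rng(0).normal(size=(10, 1, 50))  # toy collection
features = make_base_extractor("Catch22").fit_transform(X)
print(features.shape)  # (n_cases, n_features)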

@CCHe64 commented Nov 19, 2025

Please run the code and check whether the classification accuracy is around 85.7%, and please set the default mode of TSFresh to 'efficient'. After changing the 'minimal' mode of TSFresh in the existing code to 'efficient', the accuracy on ArrowHead is only 74.8%. I am currently checking the code. Please also report the accuracy on this dataset after switching to the 'efficient' mode.

@CCHe64 commented Nov 19, 2025

According to my inspection, your rewritten bidirect_interleaving_mapping(seriesX: np.ndarray, max_rate: int = 16) -> np.ndarray produces incorrect output: the output indices are almost the same at different shuffling rates. Please check the output of this function, and verify the two functions against each other through their inverse mapping relationship.
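For reference, this is roughly what the two mappings should produce for a length-20 series at one shuffling rate, together with the kind of inverse-mapping check I mean (a reconstruction for illustration, not your implementation):

import numpy as np

n = 20

# Bidirectional dilation at rate 2: even indices then odd, plus the reversed phase.
dilation = np.array([
    np.concatenate([np.arange(0, n, 2), np.arange(1, n, 2)]),
    np.concatenate([np.arange(1, n, 2), np.arange(0, n, 2)]),
])

# Bidirectional interleaving of the two halves: 0,10,1,11,... and 10,0,11,1,...
half = n // 2
interleaving = np.array([
    np.stack([np.arange(half), np.arange(half, n)], axis=1).ravel(),
    np.stack([np.arange(half, n), np.arange(half)], axis=1).ravel(),
])

# Inverse-mapping check: argsort of a permutation is its inverse, so applying
# it must recover the identity order at every shuffling rate.
for mapping in (*dilation, *interleaving):
    assert np.array_equal(mapping[np.argsort(mapping)], np.arange(n))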

@Nithurshen (Contributor, Author) commented Nov 19, 2025

@CCHe64, I have corrected the bidirect_interleaving_mapping function, changed TSFresh to "efficient", and run the following script:

import numpy as np
from aeon.classification.feature_based import MechaClassifier
from aeon.datasets import load_arrow_head
from aeon.transformations.collection.feature_based._mecha_feature_extractor import bidirect_dilation_mapping, bidirect_interleaving_mapping

trainSeriesX, trainY = load_arrow_head("TRAIN")
testSeriesX, testY = load_arrow_head("TEST")
trainY, testY = trainY.astype(int), testY.astype(int)

# 2. Classification
mecha = MechaClassifier(random_state=0, basic_extractor="TSFresh")
mecha.fit(trainSeriesX, trainY)
testPY = mecha.predict(testSeriesX)

# 3. Result
accV = np.sum(testPY==testY) / len(testY)
print("Accuracy :", accV)

seriesX = np.zeros((5, 1, 20))
indexList0 = bidirect_dilation_mapping(seriesX)
indexList1 = bidirect_interleaving_mapping(seriesX)

print('Bidirectional Dilation Mapping')
print(indexList0)
print('Bidirectional Interleaving Mapping')
print(indexList1)

With this I was only able to get an accuracy of 84%.

I have also included the mapping outputs below so you can verify that bidirect_interleaving_mapping works correctly.

Accuracy : 0.84
Bidirectional Dilation Mapping
[[ 0  2  4  6  8 10 12 14 16 18  1  3  5  7  9 11 13 15 17 19]
 [ 1  3  5  7  9 11 13 15 17 19  0  2  4  6  8 10 12 14 16 18]]
Bidirectional Interleaving Mapping
[[ 0 10  1 11  2 12  3 13  4 14  5 15  6 16  7 17  8 18  9 19]
 [10  0 11  1 12  2 13  3 14  4 15  5 16  6 17  7 18  8 19  9]]

@CCHe64 commented Nov 19, 2025

I will check the accuracy on other UCR datasets. In addition, I recommend using the same settings as in the original paper: default to TSFresh instead of Catch22.

@CCHe64 commented Nov 19, 2025

In the original paper, 85.7% on the ArrowHead dataset was obtained using TSFresh.

@Nithurshen (Contributor, Author)

I am sorry, the previous test was run with "Catch22". I have now rerun it with "TSFresh", but I can still only achieve 84% accuracy. Can you try it once?

@CCHe64 commented Nov 19, 2025

I am sorry, the previous test was run with "Catch22". I have now rerun it with "TSFresh", but I can still only achieve 84% accuracy. Can you try it once?

This 85.7% value is the average over 10 different random seeds. Please check the average over 10 runs.

@Nithurshen (Contributor, Author) commented Nov 19, 2025

I was able to achieve 84.57% when using TSFresh in 'comprehensive' mode.

Accuracy : 0.8457142857142858
Bidirectional Dilation Mapping
[[ 0  2  4  6  8 10 12 14 16 18  1  3  5  7  9 11 13 15 17 19]
 [ 1  3  5  7  9 11 13 15 17 19  0  2  4  6  8 10 12 14 16 18]]
Bidirectional Interleaving Mapping
[[ 0 10  1 11  2 12  3 13  4 14  5 15  6 16  7 17  8 18  9 19]
 [10  0 11  1 12  2 13  3 14  4 15  5 16  6 17  7 18  8 19  9]]

I will try running it in 'efficient' mode with 10 random seeds.

@Nithurshen (Contributor, Author)

@CCHe64, when run with 10 random seeds using the following code:

import numpy as np
from aeon.classification.feature_based import MechaClassifier
from aeon.datasets import load_arrow_head

trainSeriesX, trainY = load_arrow_head("TRAIN")
testSeriesX, testY = load_arrow_head("TEST")
trainY, testY = trainY.astype(int), testY.astype(int)

accuracy_scores = []
num_runs = 10

print(f"Starting classification for {num_runs} random seeds...")

for i in range(num_runs):
    seed = np.random.randint(0, 10000)
    mecha = MechaClassifier(random_state=seed, basic_extractor="TSFresh")
    mecha.fit(trainSeriesX, trainY)
    testPY = mecha.predict(testSeriesX)

    accV = np.sum(testPY == testY) / len(testY)
    accuracy_scores.append(accV)
    print(f"  Accuracy for seed {seed}: {accV:.4f}")

average_accuracy = np.mean(accuracy_scores)
std_dev_accuracy = np.std(accuracy_scores)

print("\n---")
print(f"Number of runs: **{num_runs}**")
print(f"Individual Accuracies: {np.array(accuracy_scores)}")
print(f"Average Accuracy: **{average_accuracy:.4f}**")
print(f"Standard Deviation: **{std_dev_accuracy:.4f}**")
print("---")

I was able to achieve 85.71% accuracy for one of the seeds:

Starting classification for 10 random seeds...
  Accuracy for seed 4586: 0.8457
  Accuracy for seed 2299: 0.8457
  Accuracy for seed 3211: 0.8400
  Accuracy for seed 2538: 0.8571
  Accuracy for seed 6398: 0.8400
  Accuracy for seed 4217: 0.8457
  Accuracy for seed 6466: 0.8400
  Accuracy for seed 5865: 0.8457
  Accuracy for seed 6305: 0.8457
  Accuracy for seed 9859: 0.8514

---
Number of runs: **10**
Individual Accuracies: [0.84571429 0.84571429 0.84       0.85714286 0.84       0.84571429
 0.84       0.84571429 0.84571429 0.85142857]
Average Accuracy: **0.8457**
Standard Deviation: **0.0051**
---

@CCHe64 commented Nov 19, 2025

Average Accuracy: **0.8457**
Standard Deviation: **0.0051**

This is within an acceptable range. I will run the current version 10 times on 112 datasets to check the accuracy. This is time-consuming and will take about one or two days.

@Nithurshen (Contributor, Author)

@CCHe64, make sure you specify basic_extractor="TSFresh" when running the tests, as Catch22 is still the default extractor. I haven't changed the default yet because TSFresh is a soft dependency, and I still have to work on that to pass the lint checks.

@CCHe64 commented Nov 20, 2025

@CCHe64, make sure you specify basic_extractor="TSFresh" when running the tests, as Catch22 is still the default extractor. I haven't changed the default yet because TSFresh is a soft dependency, and I still have to work on that to pass the lint checks.

ok
