A Python toolkit for evaluating machine learning models. Run evaluations locally on your own machine or scale them out on Kubernetes.
- 🔄 Multiple Evaluation Providers: Works with LM Evaluation Harness, RAGAS, and other evaluation frameworks
- ☸️ Works with Kubernetes: Scale up evaluations on a cluster with the TrustyAI Operator
- 🖥️ Run Anywhere: Evaluate models locally or distribute workloads across cluster nodes
- 🛡️ Team Ready: Built-in validators, monitoring, and OpenDataHub integration
- 🎯 Easy to Use: A command-line interface and a Python API for different workflows
Run a quick evaluation locally:

```bash
trustyai eval execute \
--provider lm-eval-harness \
--execution-mode local \
--model "microsoft/DialoGPT-medium" \
--tasks "hellaswag,arc_easy" \
--limit 10
```

Run the same evaluation at scale on a Kubernetes cluster:

```bash
trustyai eval execute \
--provider lm-eval-harness \
--execution-mode kubernetes \
--model "microsoft/DialoGPT-medium" \
--tasks "hellaswag,arc_easy" \
--namespace trustyai-eval \
--cpu 4 \
--memory 8Gi \
--limit 50
```
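If you want to drive these evaluations from a script or CI job, here is a minimal sketch using Python's standard subprocess module; the invocation mirrors the local example above, and nothing beyond the standard library is assumed:

```python
import subprocess

# Run the same local evaluation shown above via the CLI.
cmd = [
    "trustyai", "eval", "execute",
    "--provider", "lm-eval-harness",
    "--execution-mode", "local",
    "--model", "microsoft/DialoGPT-medium",
    "--tasks", "hellaswag,arc_easy",
    "--limit", "10",
]
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
if result.returncode != 0:
    print(result.stderr)
```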
Install the package with core functionality and CLI:
```bash
pip install .
```

After installation, you can use both the Python API and CLI:

```bash
trustyai --help
trustyai info
trustyai model list
trustyai eval list-providers
```

To install everything including evaluation support:

```bash
pip install .[all]
```

This includes all core, CLI, and evaluation dependencies.
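As a quick post-install sanity check, you can import the core model class from Python (the module path is the one used in the API example below):

```python
# Minimal post-install check: if this import succeeds, the core
# package is installed and importable.
from trustyai.core.model import TrustyModel

model = TrustyModel(name="sanity-check")
print(model)
```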
For model evaluation capabilities:
```bash
pip install .[eval]
```

For development, testing, and linting:
```bash
pip install .[dev]
```

A minimal example of the Python API:

```python
import numpy as np
from trustyai.core.model import TrustyModel
from trustyai.core.providers import ProviderRegistry
# Create a trusty model
model = TrustyModel(name="MyModel")
# Get explanations
X = np.random.rand(10, 5)
explanations = model.explain(X)
print(explanations)
```

The CLI is available by default after installation:

```bash
# Display help
trustyai --help
# Show version information
trustyai --version
# Show general information
trustyai info
# List available models
trustyai model list
# List evaluation providers
trustyai eval list-providers
# List available validators
trustyai validators list
# Run a validator
trustyai validators run python-version
```

Documentation:

- API Documentation - Comprehensive API reference including validators
Set up a development environment:

```bash
# Install development dependencies
pip install .[dev]
```
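Tests run via make test. As a sketch of what a new test could look like, assuming pytest is the runner (an assumption; the Makefile is not shown here) and reusing the TrustyModel example from above; the file name and assertion are illustrative:

```python
# tests/test_smoke.py (illustrative file name)
import numpy as np
from trustyai.core.model import TrustyModel

def test_explain_returns_a_result():
    # Build a model and ask it to explain a small random batch,
    # mirroring the Python API example in this README.
    model = TrustyModel(name="SmokeTestModel")
    X = np.random.rand(4, 5)
    explanations = model.explain(X)
    assert explanations is not None
```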
Common development tasks:

```bash
# Run tests
make test
# Run linting
make lint
# Format code
make format
```

This project is licensed under the Apache License 2.0.