Documentation - Gremlin by EvalOps

Documentation

Build evaluation‑native AI with Gremlin

Everything you need to integrate Gremlin into your AI workflows. From quickstart guides to advanced evaluation patterns, our documentation helps you build reliable, measurable AI systems with confidence.

Coming soon

Getting started

From zero to evaluation in minutes

Get up and running with Gremlin in just a few steps. Our quickstart guide walks you through installation, basic configuration, and your first evaluation to demonstrate the power of evaluation-native AI development.

01 ::

5‑minute quickstart

Install the SDK, configure your first evaluator, and run evaluations on sample data.

02 ::

Interactive tutorials

Step-by-step walkthroughs with executable code examples and real evaluation scenarios.

03 ::

Example projects

Complete reference implementations for common AI use cases and evaluation patterns.

04 ::

Migration guides

Move from existing evaluation frameworks by following detailed migration paths.

Core concepts

Understanding evaluation‑native AI

Master the fundamental concepts that make Gremlin different. Understanding these building blocks will help you design more effective evaluation strategies and build more reliable AI systems.

01 ::

Evaluation primitives

Learn about assertions, judges, golden sets, and how they compose into powerful evaluation workflows.

02 ::

Agent control loops

Understand how evaluations integrate directly into your agent's decision-making process.

03 ::

Metrics and scoring

Design meaningful metrics that capture accuracy, safety, cost, and performance dimensions.

04 ::

Guardrails and policies

Implement safety checks, fallback strategies, and policy enforcement at every step.
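
As a rough illustration of how the primitives above might compose, here is a minimal plain-Python sketch of a golden set scored by assertions. It does not use the Gremlin SDK, whose API is not shown on this page; every name in it (`GoldenExample`, `contains_expected`, `max_length`, `evaluate`, `toy_model`) is hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GoldenExample:
    """One entry of a golden set: an input and its reference answer."""
    prompt: str
    expected: str


# An assertion takes the model output and the golden example, returns pass/fail.
Assertion = Callable[[str, GoldenExample], bool]


def contains_expected(output: str, example: GoldenExample) -> bool:
    """Pass when the reference answer appears in the output (case-insensitive)."""
    return example.expected.lower() in output.lower()


def max_length(limit: int) -> Assertion:
    """Build an assertion that passes when the output has at most `limit` characters."""
    def check(output: str, example: GoldenExample) -> bool:
        return len(output) <= limit
    return check


def evaluate(
    model: Callable[[str], str],
    golden_set: List[GoldenExample],
    assertions: List[Assertion],
) -> float:
    """Return the fraction of golden examples for which every assertion passes."""
    if not golden_set:
        return 0.0
    passed = 0
    for example in golden_set:
        output = model(example.prompt)
        if all(assertion(output, example) for assertion in assertions):
            passed += 1
    return passed / len(golden_set)


def toy_model(prompt: str) -> str:
    """Stand-in for a real model call (hypothetical canned answers)."""
    answers = {
        "capital of France?": "The capital is Paris.",
        "2 + 2?": "5",  # deliberately wrong so one example fails
    }
    return answers.get(prompt, "")


golden_set = [
    GoldenExample("capital of France?", "Paris"),
    GoldenExample("2 + 2?", "4"),
]
pass_rate = evaluate(toy_model, golden_set, [contains_expected, max_length(200)])
print(pass_rate)  # 0.5: the first example passes, the second fails
```

A judge (for example, a second model that grades the output) could be plugged in as another `Assertion` in this sketch, since it only needs to return pass or fail for an output.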

API Reference

Comprehensive API documentation

Complete reference documentation for all Gremlin APIs, SDKs, and integration points. Explore endpoints, authentication, request/response formats, and error handling with interactive examples.

SDKs & Tools

Language‑specific guides

Use Gremlin in your preferred programming language. Our SDKs provide idiomatic interfaces that feel natural while maintaining consistency across different development environments.

01 ::

Python SDK

Native Python integration with async support, type hints, and Jupyter notebook compatibility.

02 ::

TypeScript/JavaScript

Full-featured Node.js and browser support with TypeScript definitions and React hooks.

03 ::

Go SDK

High-performance Go client with context support, structured logging, and middleware patterns.

04 ::

REST API

Language-agnostic HTTP API with OpenAPI specifications and comprehensive curl examples.

Integration patterns

Connect with your existing stack

Gremlin integrates seamlessly with popular AI frameworks and tools. Learn how to add evaluation-native capabilities to your existing workflows without major architectural changes.

01 ::

LangChain integration

Drop-in evaluators for LangChain agents with automatic chain instrumentation and callback handling.

02 ::

LlamaIndex support

Evaluate retrieval quality, response relevance, and query performance in RAG applications.

03 ::

MLOps platforms

Connect with MLflow, Weights & Biases, Neptune, and other experiment tracking platforms.

04 ::

CI/CD pipelines

Automated evaluation in GitHub Actions, Jenkins, and other CI systems with detailed reporting.

Advanced topics

Expert‑level evaluation techniques

Dive deep into advanced evaluation patterns for production AI systems. Learn how to handle edge cases, scale to enterprise workloads, and maintain evaluation quality as your system grows.

01 ::

Custom evaluators

Build domain-specific evaluators with custom metrics, scoring functions, and validation logic.

02 ::

Distributed evaluation

Scale evaluations across multiple workers with load balancing and fault tolerance.

03 ::

A/B testing frameworks

Statistical significance testing, experiment design, and gradual rollout strategies.

04 ::

Production monitoring

Real-time evaluation monitoring, alerting, and automated incident response workflows.
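
To make the statistical significance testing mentioned under A/B testing concrete, the sketch below compares the pass rates of two variants with a standard two-proportion z-test. It uses only the Python standard library and is not Gremlin's API; the function name `two_proportion_z_test` and the sample counts are assumptions made for illustration.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    pass_a: int, n_a: int, pass_b: int, n_b: int
) -> Tuple[float, float]:
    """Two-sided z-test for the difference between two pass rates.

    Returns (z, p_value), where z > 0 means variant B passed more often.
    """
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    # Pooled pass rate under the null hypothesis that both variants are equal.
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both variants passed everything or nothing: no evidence of a difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: variant A passes 80 of 100 evaluations, variant B 90 of 100.
z, p = two_proportion_z_test(80, 100, 90, 100)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The normal approximation used here is only reasonable when each variant has a fairly large number of evaluations; for small samples an exact test would be a safer choice.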

Use cases & examples

Real‑world applications

Explore detailed examples of how teams use Gremlin to solve real evaluation challenges. Each use case includes complete code examples, evaluation strategies, and lessons learned from production deployments.

Code examples

See Gremlin in action

Interactive code examples you can run locally or in our playground. Start with simple evaluations and progress to complex multi-step agent workflows with full observability.

Explore the documentation

Complete guides coming soon - join our waitlist for early access

Coming Soon

⚡

Quickstart Guide

Get up and running with your first evaluation in under 5 minutes.

Tutorial

Coming Soon

📖

API Reference

Complete API documentation with interactive examples and schemas.

Reference

Coming Soon

🐍

Python SDK

Native Python integration with async support and type safety.

SDK Guide

Coming Soon

📦

JavaScript SDK

Full-featured Node.js and browser support with TypeScript definitions.

SDK Guide

Coming Soon

🔗

LangChain Integration

Add evaluation to your LangChain agents with drop-in evaluators.

Integration

Coming Soon

🚀

Production Deployment

Best practices for deploying evaluations in production environments.

Guide

Community

Get help when you need it

Join our community of developers building evaluation-native AI. Get help with implementation, share best practices, and collaborate on the future of AI evaluation.

Discord community

Coming soon - Join developers discussing AI evaluation

GitHub discussions

Coming soon - Ask questions and share knowledge

Support

Email us for help and questions

Documentation stats

150+

Code examples

50+

Integration guides

25+

Use case studies

99%

Uptime
