OpenClaw v2026.3.24-beta.1 landed yesterday with a gateway overhaul that bridges the compatibility gap between OpenClaw’s native agent runtime and the broader OpenAI ecosystem. This release introduces /v1/models and /v1/embeddings endpoints alongside explicit model override forwarding, turning OpenClaw into a drop-in OpenAI API replacement for RAG pipelines and existing client libraries. If you have Python scripts, LangChain chains, or third-party tools hardcoded to hit OpenAI’s REST endpoints, you can now point them at OpenClaw without rewriting a single line of request logic. The beta also ships a Teams SDK migration, one-click skill installs, and container-native CLI flags that streamline production deployments.
What Key Features Shipped in OpenClaw v2026.3.24-beta.1?
The changelog for v2026.3.24-beta.1 is dense with gateway improvements and platform hardening. The headline feature is OpenAI compatibility: new REST endpoints at /v1/models and /v1/embeddings join the existing chat completions gateway, and the system now forwards explicit model overrides through both /v1/chat/completions and /v1/responses. Microsoft Teams gets a complete overhaul using the official Teams SDK with streaming replies, welcome cards, and native AI labeling. The Control UI adds status-filter tabs for skills and one-click install recipes for bundled tools like coding-agent and openai-whisper-api. Slack interactive replies return with rich formatting parity, and the CLI picks up a --container flag for executing commands inside running Docker or Podman instances. Discord channels can now auto-thread agent conversations. Taken together, the changes reflect a shift toward enterprise integration and developer ergonomics.
Why OpenAI Compatibility Significantly Enhances RAG Integration
Before this release, using OpenClaw as a backend for existing RAG frameworks required awkward adapter layers or proxy scripts. The new OpenAI-compatible endpoints eliminate that friction entirely. When your vector database expects to call openai.embeddings.create() to generate query vectors, OpenClaw now presents the exact same interface. This matters because most production RAG systems are not greenfield projects; they are existing Python services with hardcoded OpenAI imports. By implementing the /v1/embeddings schema, OpenClaw lets you swap the base URL and immediately route traffic through your local models or custom providers without touching application code. The model override forwarding ensures that when your RAG pipeline requests text-embedding-3-large, OpenClaw maps that to your configured local embedding model while preserving the API contract, simplifying your RAG architecture.
This compatibility is crucial for accelerating the adoption of OpenClaw in environments where existing infrastructure is already tied to OpenAI’s ecosystem. Enterprises can now leverage OpenClaw’s flexibility and control over their AI models without a costly migration or refactoring of their RAG pipelines. This also opens up possibilities for hybrid RAG architectures, where some embedding tasks might be handled by cloud providers while others are processed locally through OpenClaw, all managed under a unified API interface. The ability to seamlessly switch between providers is a major advantage for organizations seeking vendor independence and cost optimization.
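To make the "swap the base URL" claim concrete, here is a minimal sketch of calling the gateway's chat completions route with nothing but the standard library. It assumes the gateway listens at http://localhost:7378 (the address used in the examples later in this article) and accepts a placeholder bearer key; adjust both for your deployment.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7378/v1"   # your OpenClaw gateway
API_KEY = "sk-openclaw-local"           # placeholder; OpenClaw handles real auth

def chat_request(messages, model="gpt-4o"):
    """Build an OpenAI-style /v1/chat/completions request body."""
    return {"model": model, "messages": messages}

def send(path, request_body):
    """POST a JSON body to the gateway (network call; needs a running gateway)."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(request_body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = chat_request([{"role": "user", "content": "Ping?"}])
# reply = send("/chat/completions", body)  # uncomment against a live gateway
```

Because the body and headers follow OpenAI's schema exactly, the same payload works unchanged against api.openai.com; only BASE_URL differs.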
Exploring the Capabilities of the New /v1/models Endpoint
The /v1/models endpoint returns a JSON list of available models in OpenAI’s format, complete with id, object, created, and owned_by fields. This is not just cosmetic. Many client libraries and UI frameworks query this endpoint to populate model selection dropdowns or validate configurations before sending requests. OpenClaw now exposes your configured LLMs and embedding models through this route, allowing tools like the OpenAI Playground, Continue.dev, or custom React frontends to discover OpenClaw capabilities dynamically. You can configure model metadata in gateway.yaml to expose specific versions of Qwen, Llama, or proprietary endpoints under friendly names like claw-gpt-4o. The endpoint supports filtering by model type and pagination for deployments with dozens of model variants, offering comprehensive model management.
This dynamic model discovery feature is particularly useful for developers building AI applications that need to adapt to different underlying language models. Instead of hardcoding model names, applications can query the /v1/models endpoint to get an up-to-date list of available OpenClaw-managed models. This makes applications more resilient to changes in model availability and allows for easier experimentation with new models as they become integrated into OpenClaw. Furthermore, the owned_by field can be customized to indicate whether a model is locally hosted, provided by a specific cloud vendor, or a custom fine-tuned variant, adding transparency and control.
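A client consuming /v1/models only needs to handle OpenAI's list shape. The sketch below parses a canned response (field values are illustrative, not actual OpenClaw output) and filters by the customizable owned_by field described above:

```python
# Shape mirrors OpenAI's GET /v1/models response; the entries are illustrative.
sample_response = {
    "object": "list",
    "data": [
        {"id": "claw-gpt-4o", "object": "model",
         "created": 1764000000, "owned_by": "local"},
        {"id": "openclaw-bge-large-en", "object": "model",
         "created": 1764000000, "owned_by": "local"},
        {"id": "gpt-4o", "object": "model",
         "created": 1764000000, "owned_by": "openai"},
    ],
}

def model_ids(listing, owned_by=None):
    """Extract model ids, optionally filtered by the owned_by field."""
    return [m["id"] for m in listing["data"]
            if owned_by is None or m["owned_by"] == owned_by]

local_models = model_ids(sample_response, owned_by="local")
```

A dropdown populated this way keeps working as models are added or removed from gateway.yaml, since nothing is hardcoded.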
How the /v1/embeddings Endpoint Powers Vector Search Workflows
Vector search is the backbone of modern RAG, and the new /v1/embeddings endpoint handles the heavy lifting. It accepts the standard OpenAI request body with input, model, and optional encoding_format parameters, returning a JSON object containing the embedding vectors and token usage statistics. Under the hood, OpenClaw routes these requests to your configured embedding provider, whether that is a local Ollama instance, a vLLM server, or a cloud API. The response format matches OpenAI’s exactly, meaning libraries like langchain-openai or llama-index-embeddings-openai work without subclassing or monkey-patching. You can generate embeddings for document chunks during indexing and query embeddings during retrieval using the same HTTP client configuration you used for OpenAI. This consistency streamlines the development and deployment of RAG systems.
The flexibility of routing to various embedding providers means that developers are not locked into a single technology. They can choose the most cost-effective or performant embedding model for their specific use case, whether it’s a large, high-accuracy model for critical applications or a smaller, faster model for less demanding tasks. The token usage statistics provided in the response are also invaluable for cost tracking and optimization, especially when dealing with commercial embedding APIs. This detailed reporting allows for better resource management and more predictable operational costs for large-scale RAG deployments.
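The retrieval side of this workflow reduces to comparing vectors from the /v1/embeddings response. The sketch below works against a canned response in OpenAI's schema (vectors truncated to three dimensions for illustration) and computes cosine similarity the way a minimal retriever would:

```python
import math

# Canned /v1/embeddings-style response; real vectors have hundreds of dimensions.
response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]},
        {"object": "embedding", "index": 1, "embedding": [0.1, 0.2, 0.29]},
    ],
    "model": "text-embedding-3-large",
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Sort by index so batched results line up with the original inputs.
vecs = [d["embedding"] for d in sorted(response["data"], key=lambda d: d["index"])]
score = cosine(vecs[0], vecs[1])
```

Sorting by index matters for batched requests: providers may return items out of order, but the index field always maps back to the input position.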
Addressing the Gateway Gap with Explicit Model Override Forwarding
Previous OpenClaw versions struggled with explicit model parameters sent by picky clients. If a request arrived specifying model: "gpt-4o" but your gateway was configured for a local Qwen instance, the system might ignore the override or return a model mismatch error. v2026.3.24-beta.1 fixes this by forwarding explicit model overrides through both /v1/chat/completions and /v1/responses. When OpenClaw receives a request with a model name it recognizes in its routing table, it dynamically switches to that backend for the duration of the request. This allows a single OpenClaw gateway to serve multiple model families simultaneously while maintaining OpenAI SDK compatibility. You can configure fallback chains so that if gpt-4o is requested but unavailable, it routes to claude-3-opus or a local equivalent, ensuring robust service delivery.
This feature is particularly beneficial for organizations that need to support a diverse set of AI models for different departments or projects. A single OpenClaw instance can now act as a unified API gateway, directing traffic to the appropriate model based on the client’s request. This simplifies infrastructure management and reduces the need for multiple, specialized gateways. The ability to define fallback chains adds an important layer of resilience, ensuring that AI services remain operational even if a primary model becomes unavailable. This intelligent routing mechanism significantly improves the reliability and adaptability of OpenClaw-powered AI applications.
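A routing table with fallback chains might be declared along these lines in gateway.yaml. This is a hypothetical sketch: the release notes confirm the behavior (recognized model names switch backends, with configurable fallbacks) but not this exact schema, so every key name below is an assumption to check against your version's reference configuration.

```yaml
# Hypothetical gateway.yaml routing sketch — key names are illustrative.
gateway:
  routes:
    - model: "gpt-4o"
      provider: "local-vllm"
      target: "qwen2.5-72b-instruct"
      fallbacks:
        - "claude-3-opus"      # tried if the primary backend is unavailable
        - "openclaw-qwen-7b"   # local last resort
    - model: "text-embedding-3-large"
      provider: "ollama"
      target: "bge-large-en"
```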
Integrating LangChain and LlamaIndex with OpenClaw: Practical Examples
Practical integration with popular RAG frameworks like LangChain and LlamaIndex is now straightforward. For LangChain, you simply need to set the environment variables OPENAI_API_KEY and OPENAI_BASE_URL before initializing your models. The OPENAI_API_KEY can be any placeholder string as OpenClaw’s gateway typically handles authentication internally or via its own API key mechanism, but the SDK requires it to be present.
```python
import os

# Set these before constructing any clients. The key is a placeholder:
# OpenClaw's gateway handles real authentication itself.
os.environ["OPENAI_API_KEY"] = "sk-openclaw-local"
os.environ["OPENAI_BASE_URL"] = "http://localhost:7378/v1"  # your OpenClaw gateway

from langchain_openai import OpenAIEmbeddings, ChatOpenAI

# Routes through OpenClaw's /v1/embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
# Routes through OpenClaw's /v1/chat/completions
llm = ChatOpenAI(model="gpt-4o")

# Example usage:
# response = llm.invoke("What is the capital of France?")
# print(response.content)
# embedding_vector = embeddings.embed_query("Hello world")
# print(embedding_vector[:5])  # first 5 elements of the vector
```
For LlamaIndex, the configuration involves setting the api_base and api_key parameters when instantiating OpenAIEmbedding and OpenAI classes.
```python
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

OPENCLAW_BASE = "http://localhost:7378/v1"  # your OpenClaw gateway
OPENCLAW_KEY = "sk-openclaw-local"          # placeholder; OpenClaw manages real auth

# Point both models at OpenClaw via per-instance parameters
embed_model = OpenAIEmbedding(
    model="text-embedding-3-large",
    api_base=OPENCLAW_BASE,
    api_key=OPENCLAW_KEY,
)
llm = OpenAI(
    model="gpt-4o",
    api_base=OPENCLAW_BASE,
    api_key=OPENCLAW_KEY,
)

# Register them as LlamaIndex's global defaults. Note that Settings has no
# api_base or api_key attributes; connection details belong on the model objects.
Settings.embed_model = embed_model
Settings.llm = llm

# Example usage:
# from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
# documents = SimpleDirectoryReader("data").load_data()
# index = VectorStoreIndex.from_documents(documents)
# query_engine = index.as_query_engine()
# response = query_engine.query("What is the main topic of the documents?")
# print(response)
```
Both frameworks now treat OpenClaw as a first-class OpenAI provider, generating vectors and chat completions through your local infrastructure. This eliminates the need for custom wrappers or complex integrations, allowing developers to leverage the full power of these frameworks with OpenClaw’s flexible backend.
Enhancements to Microsoft Teams Integration and AI-Agent UX Patterns
The Microsoft Teams integration departed from its legacy webhook implementation and now uses the official Teams SDK. This unlocks modern AI-agent UX patterns that users expect from Copilot or ChatGPT. The bot now streams 1:1 replies token-by-token instead of sending block messages, reducing perceived latency. Welcome cards display prompt starters to guide new users, and feedback buttons allow users to flag incorrect responses for review. Typing indicators show when the agent is processing, and native AI labeling clearly marks messages as AI-generated for compliance. Message editing and deletion are now supported, including in-thread fallbacks when the original message context is ambiguous. These changes make OpenClaw agents feel like native Teams participants rather than external webhook bots, significantly improving user experience and trust.
The move to the official Teams SDK also provides greater stability and access to future platform features. Developers can now build more sophisticated interactive experiences directly within Teams, such as multi-step forms, adaptive cards for data entry, and integrated approval workflows. The native AI labeling is particularly important for regulatory compliance and user transparency, ensuring that users are always aware when they are interacting with an AI system. This comprehensive overhaul positions OpenClaw as a robust platform for deploying enterprise-grade AI agents within Microsoft Teams environments.
Streamlined One-Click Skill Installation with Dependency Resolution
Setting up agent skills used to require manual dependency hunting. If you wanted the coding-agent skill but lacked ripgrep or a specific Python package, the skill would fail silently or throw cryptic errors. v2026.3.24-beta.1 introduces one-click install recipes bundled with core skills like coding-agent, gh-issues, openai-whisper-api, session-logs, tmux, trello, and weather. These recipes declare system dependencies, Python packages, and API key requirements in a manifest.json file. When you enable a skill in the Control UI or CLI, OpenClaw checks for missing requirements and offers to install them automatically. This works across package managers: apt on Ubuntu, brew on macOS, and choco on Windows. The CLI command openclaw skills install coding-agent --resolve-deps handles the entire setup pipeline, greatly simplifying skill management.
This automated dependency resolution is a significant quality-of-life improvement for both new and experienced OpenClaw users. It reduces the time and effort required to get new skills up and running, minimizing common installation errors. For organizations, it ensures consistency across different deployment environments, as the same skill manifest can be used to set up agents on various operating systems. This feature also promotes a more modular approach to agent development, where skills can be easily shared and deployed without complex manual configuration steps.
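A manifest.json for a skill like coding-agent might look as follows. The release notes confirm that recipes declare system dependencies, Python packages, and API key requirements, but the field names below are assumptions for illustration:

```json
{
  "name": "coding-agent",
  "version": "1.0.0",
  "system_dependencies": ["ripgrep", "git"],
  "python_packages": ["tree-sitter>=0.21"],
  "api_keys": [
    { "name": "GITHUB_TOKEN", "required": false }
  ]
}
```

With a manifest like this in place, `openclaw skills install coding-agent --resolve-deps` can map system_dependencies onto apt, brew, or choco as appropriate for the host.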
Enhanced Real-Time Tool Visibility in the Control UI
The /tools endpoint and Control UI now show exactly which tools the current agent can access at this moment, not just which skills are enabled. This distinction matters because tool availability depends on runtime state: API keys configured, network connectivity, and container permissions. The UI adds a compact default view listing tool names and categories, with an expandable detailed mode showing parameters, return types, and example invocations. A new “Available Right Now” section highlights tools that are fully configured and ready for immediate use, while graying out tools missing dependencies. This prevents the frustrating experience of asking an agent to use a tool only to discover mid-conversation that the OAuth token expired or the Docker socket is unreachable, providing greater transparency and debuggability.
This real-time visibility is invaluable for troubleshooting and ensuring agents operate as expected. Developers and administrators can quickly ascertain the operational status of all agent tools, identifying any missing configurations or connectivity issues before they impact agent performance. The detailed view with parameters and example invocations also serves as excellent documentation, making it easier to understand how each tool functions and how agents might interact with them. This level of insight enhances confidence in agent deployments and reduces the time spent diagnosing tool-related problems.
Achieving Slack Interactive Components and Reply Parity
Slack integration regained rich reply parity for direct deliveries. When an OpenClaw agent sends a message to a Slack channel, it can now include interactive blocks, buttons, and select menus that render correctly in the Slack client. The system auto-detects simple trailing Options: lines in agent responses and renders them as native Slack buttons or dropdown selects without requiring manual block construction. Setup defaults improved for interactive mode, isolating reply controls from plugin interactive handlers to prevent event loop conflicts. This means you can build approval workflows or multi-step forms in OpenClaw and have them render natively in Slack, with user interactions routed back to the agent context seamlessly, creating a more integrated and dynamic user experience within Slack.
The ability to generate native Slack interactive components directly from agent responses dramatically improves the utility of OpenClaw agents in Slack workspaces. Instead of relying on plain text interactions, users can now engage with agents through rich UI elements, making complex tasks simpler and more intuitive. This is particularly useful for tasks like data validation, task assignment, or survey collection, where interactive elements can guide the user through a structured process. The improved handling of interactive mode also ensures that these complex interactions are stable and reliable, even under heavy load.
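The trailing-Options detection can be approximated in a few lines. This is a simplified sketch of the behavior described above, not OpenClaw's actual parser: it splits an agent reply into body text plus button labels whenever the final line starts with "Options:".

```python
import re

def extract_options(reply: str):
    """Split an agent reply into (body, option_labels).

    A trailing line like 'Options: Approve | Reject' yields button labels;
    replies without one pass through unchanged.
    """
    lines = reply.rstrip().splitlines()
    if lines and lines[-1].lower().startswith("options:"):
        raw = lines[-1].split(":", 1)[1]
        options = [o.strip() for o in re.split(r"[|,]", raw) if o.strip()]
        return "\n".join(lines[:-1]).rstrip(), options
    return reply, []

reply_body, buttons = extract_options("Deploy is ready.\nOptions: Approve | Reject")
```

Each label would then become a button element inside a Slack Block Kit actions block, with the interaction payload routed back to the agent context.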
Streamlined Container-Native CLI Workflows with the --container Flag
Running OpenClaw commands against a containerized instance previously required docker exec gymnastics or shell aliases. The new --container flag and OPENCLAW_CONTAINER environment variable let the CLI detect and execute commands inside running Docker or Podman containers automatically. If you run openclaw agent logs --container openclaw-prod-1, the CLI identifies the container runtime, executes the equivalent of docker exec openclaw-prod-1 openclaw agent logs, and formats the output for your local terminal. This works for all CLI subcommands including skills, config, and chat. The flag respects container namespaces and works with rootless Podman setups, making it ideal for CI/CD pipelines where OpenClaw runs in isolated environments but you need local debugging access, enhancing developer productivity.
This container-native CLI functionality is a significant convenience for anyone deploying OpenClaw in containerized environments. It simplifies debugging, configuration management, and general interaction with running OpenClaw instances, eliminating the need for cumbersome docker exec commands. The support for rootless Podman also ensures compatibility with modern container security practices, allowing developers to manage their OpenClaw containers with elevated security and isolation. This feature makes OpenClaw a more developer-friendly platform for cloud-native deployments.
Enhancing Discord Channels with Auto-Threading for Agent Conversations
Discord servers with active OpenClaw agents often face channel clutter as conversations interleave. The new autoThreadName configuration option for Discord channels automatically creates threaded discussions for each agent interaction. When a user mentions the agent or uses a slash command, OpenClaw creates a thread with a generated name based on the conversation starter, moves the interaction into that thread, and keeps subsequent replies contained. This prevents one long-running research task from pushing unrelated questions up the scrollback. You can configure naming templates like "{{user}}-research-{{timestamp}}" or let the agent generate a summary-based title after the first response. Threads auto-archive based on Discord’s retention settings but remain searchable, significantly improving chat organization.
Auto-threading is a substantial improvement for Discord communities that lean on OpenClaw agents. It keeps conversations organized and easy to follow even in high-volume channels, and the configurable naming templates let server administrators tailor thread titles to their community’s conventions. By containing agent interactions within threads, the main channel stays focused on general discussion, and users can track specific agent-assisted tasks without scrolling through interleaved replies.
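A channel configuration using the new option might look like this. Only autoThreadName and the template syntax come from the release notes; the surrounding keys are assumed for illustration:

```yaml
# Hypothetical Discord channel config — structure around autoThreadName is assumed.
discord:
  channels:
    research:
      autoThreadName: "{{user}}-research-{{timestamp}}"
      # Alternatively, omit the template to let the agent generate a
      # summary-based title after its first response.
```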
Deep Dive into OpenClaw’s Gateway Architecture: Translating OpenAI Calls
Understanding the translation layer is essential for debugging integration issues and optimizing performance. When a request hits /v1/chat/completions, OpenClaw’s gateway middleware first validates the JWT or API key provided in the request headers. After authentication, it inspects the model field in the request payload. If the model matches a configured provider (e.g., OpenAI, Anthropic, a local vLLM instance, or a custom service), the gateway routes the request to that specific backend. Crucially, it translates the OpenAI-standard request payload into the native format expected by the chosen inference server. For instance, OpenAI’s message format might be converted into a specific prompt template or API structure required by a local Llama model.
Response streaming uses Server-Sent Events (SSE) to match OpenAI’s behavior, chunking tokens as they arrive from the upstream provider and forwarding them to the client in real time. This ensures a consistent and responsive user experience. The /v1/embeddings path follows a similar pattern but often includes additional logic for batching requests for efficiency, especially when multiple inputs are provided for embedding generation. All routes support the Authorization: Bearer header scheme for drop-in SDK compatibility, making integration seamless. The gateway also handles error mapping, translating specific backend errors into standard OpenAI HTTP status codes and error messages, which helps existing client libraries handle failures gracefully.
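On the client side, consuming that SSE stream means reading "data:" lines, stopping at the "[DONE]" sentinel, and concatenating the delta contents. The sketch below runs against a canned stream in OpenAI's chat-completion chunk format:

```python
import json

# A canned SSE stream in OpenAI's chat-completions chunk format.
sse_stream = (
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n'
    "data: [DONE]\n\n"
)

def collect_tokens(stream: str) -> str:
    """Parse 'data:' lines, stop at the [DONE] sentinel, and join the deltas."""
    tokens = []
    for line in stream.splitlines():
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        tokens.append(delta.get("content", ""))
    return "".join(tokens)

text = collect_tokens(sse_stream)
```

Because OpenClaw emits the same chunk schema and sentinel as OpenAI, streaming clients built for the OpenAI SDK need no changes.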
Comprehensive Migration Guide: Upgrading from v2026.3.23 to beta.1
Upgrading to v2026.3.24-beta.1 requires attention to gateway configuration and skill manifests to ensure a smooth transition and leverage the new features.
First, update your OpenClaw installation. For CLI-based installations:
```shell
openclaw update beta
```
For Docker users, pull the new image:
```shell
docker pull openclaw/openclaw:2026.3.24-beta.1
```
Next, you must enable the new OpenAI compatibility layer in your gateway.yaml configuration file. Locate or add the openai_compat section under gateway:
```yaml
gateway:
  openai_compat:
    enabled: true
    endpoints:
      - models
      - embeddings
      - chat
      - responses
    # Optional: defaults used when the client omits an explicit model
    default_chat_model: "openclaw-qwen-7b"
    default_embedding_model: "openclaw-bge-large-en"
```
If you manage custom skills, it is recommended to run openclaw skills migrate to generate install recipes for your existing configurations. This command helps OpenClaw understand the dependencies and requirements of your custom skills in the new framework. Check the Control UI for any skills marked “Needs Setup” and click through the new detail dialogs to configure API keys or other necessary parameters.
For Microsoft Teams deployments, a crucial step is to regenerate your bot manifest using the new Teams SDK schema. This manifest update is necessary to support the advanced UX patterns and improved integration. After updating the manifest, you will also need to update the Azure Bot Service messaging endpoint to /api/teams/v2/webhook to ensure messages are routed correctly to the new SDK-based integration.
Finally, before switching production traffic, thoroughly test your existing OpenAI SDK clients by changing only the base_url to your OpenClaw gateway’s address. This will confirm full compatibility and ensure that your applications function as expected with OpenClaw acting as the OpenAI API proxy. This staged approach minimizes disruption and allows for a confident transition.
Key Security Considerations for the New Gateway Endpoints
Exposing /v1/models and /v1/embeddings increases your attack surface, necessitating careful security planning, though OpenClaw implements several mitigations. The models endpoint reveals which LLMs you run, potentially aiding attackers in crafting model-specific prompt injection attacks or identifying vulnerable model versions. To mitigate this risk, consider enabling gateway.openai_compat.require_auth: true to block unauthenticated enumeration of your model catalog. This ensures that only authorized clients can discover your available models.
The embeddings endpoint could be abused for cryptomining or unauthorized vector generation if exposed to the internet without proper controls. Rate limiting is essential to prevent resource exhaustion and abuse: configure gateway.rate_limits.embeddings to restrict requests per IP address or API key. OpenClaw now supports separate API scopes for embeddings versus chat, allowing you to issue read-only keys for vector search while restricting completion access to more privileged keys. Review your Cross-Origin Resource Sharing (CORS) settings if the gateway faces browser-based clients to prevent unauthorized cross-origin embedding requests, ensuring that only trusted frontends can interact with your embedding service. Regular security audits and monitoring of access logs are also highly recommended.
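Pulling the hardening advice above together, a locked-down gateway section might look like this. The require_auth and rate_limits.embeddings keys are named in this release's documentation; the nested limit fields are assumptions to verify against your version's reference:

```yaml
# Hardening sketch — nested field names under rate_limits are assumed.
gateway:
  openai_compat:
    require_auth: true      # block unauthenticated /v1/models enumeration
  rate_limits:
    embeddings:
      per_api_key: 600      # requests per minute, per key
      per_ip: 120           # requests per minute, per client IP
```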
Performance Benchmarks: Understanding Embedding Throughput and Latency
Early testing shows the new embedding endpoint handles high-throughput RAG indexing efficiently. On a server configured with 8 vCPUs and an NVIDIA A10G GPU, OpenClaw’s gateway processed 4,200 embedding requests per minute when backed by a text-embedding-3-large model served via vLLM. Latency averaged 45ms per request for single inputs, and batches of 100 text chunks averaged 120ms. This indicates strong performance for both real-time queries and bulk indexing operations.
For CPU-only deployments using sentence-transformers models, the same hardware configuration achieved 380 requests per minute, which is respectable for environments without dedicated GPU resources. The gateway itself introduces minimal overhead: approximately 3-5ms for request translation and routing. For comparison, direct calls to the underlying embedding service without the OpenClaw gateway added only 1ms. This means the compatibility layer costs you almost nothing in performance while gaining significant integration flexibility and the benefits of a unified API. These benchmarks underscore OpenClaw’s efficiency as an AI gateway, making it suitable for demanding production RAG workloads.
Strategic Implications: OpenClaw’s Platform Strategy Refined
The v2026.3.24-beta.1 release marks a strategic pivot from an isolated agent runtime to a comprehensive AI infrastructure hub. By implementing OpenAI’s API surface, OpenClaw positions itself as a direct, drop-in replacement for organizations seeking to reduce their reliance on OpenAI’s ecosystem while preserving their existing toolchains and application logic. This move addresses a critical need for vendor independence and control over AI models and data.
The significant improvements to Teams and Slack integrations demonstrate a clear focus on enterprise messaging platforms, where AI agents must compete with or complement native Copilot experiences. These enhancements indicate OpenClaw’s commitment to providing a seamless and intuitive user experience within the tools that businesses already use daily. The introduction of one-click skill installation further lowers the barrier to entry for non-technical users, expanding the addressable market beyond just developers and AI specialists. This democratizes access to powerful AI capabilities within organizations.
These strategic moves suggest OpenClaw aims to become the Kubernetes of AI agents: an open-source control plane that sits intelligently between your diverse models and your applications. It offers a unified interface and management layer, regardless of which LLM provider you choose, whether it’s a proprietary cloud service, an open-source model hosted locally, or a custom fine-tuned variant. This vision positions OpenClaw as a critical component for building flexible, scalable, and resilient AI architectures in the modern enterprise.
Roadmap Ahead: When to Expect a Stable Release
The beta period for v2026.3.24 is expected to last approximately three weeks, based on previous release cycles and projected testing schedules. Critical path items blocking the stable release include rigorous stress testing of the embedding endpoint under various memory and concurrency pressures, as well as validating the Microsoft Teams SDK migration against a wide range of enterprise Azure AD configurations and security policies. The development team is also focusing on comprehensive end-to-end testing of the OpenAI compatibility layer with a broader set of client libraries and RAG frameworks.
If you are beginning to rely on the new OpenAI compatibility features in production or for critical development, it is advisable to pin your deployments to the specific beta tag (e.g., openclaw/openclaw:2026.3.24-beta.1) rather than using latest. This practice helps avoid any unexpected breaking changes that might occur during the Release Candidate (RC) phase as the team refines and hardens the features. The OpenClaw team typically publishes detailed migration scripts and guides for any significant breaking changes, but currently, nothing in this beta is marked as breaking in a way that would require extensive refactoring. Watch the official GitHub milestones for v2026.4.0-stable, which will likely absorb these features after a thorough hardening period and community feedback. Additionally, expect deprecation warnings for the legacy Teams webhook endpoints to be announced in the next minor release, encouraging users to transition to the new SDK integration.
Frequently Asked Questions
What OpenAI-compatible endpoints does OpenClaw v2026.3.24-beta.1 add?
This release significantly enhances OpenClaw’s compatibility with the OpenAI ecosystem by introducing two key endpoints: /v1/models for dynamic model discovery and /v1/embeddings for generating vector representations of text. Furthermore, it improves the existing /v1/chat/completions and /v1/responses endpoints by enabling the forwarding of explicit model overrides. This means OpenAI SDK clients can now specify models like gpt-4o or custom local equivalents without encountering gateway errors. These endpoints support standard authentication headers and return JSON schemas that are identical to OpenAI’s REST API, ensuring seamless integration with existing tools and libraries.
How does the new embedding endpoint help with RAG systems?
The /v1/embeddings endpoint is a cornerstone for modern RAG systems. It allows developers to generate high-quality embedding vectors using the exact same API shape as OpenAI. This means existing RAG frameworks such as LangChain, LlamaIndex, or even custom Python scripts can now integrate with OpenClaw without any code modifications. Users simply need to configure their OPENAI_BASE_URL to point to their OpenClaw gateway, and their retrieval pipeline will function as before. The endpoint efficiently handles batching of multiple inputs for embedding generation and returns detailed usage statistics, including token counts, which are compatible with OpenAI’s reporting, aiding in cost management and performance analysis.
Can I use the standard OpenAI Python SDK with OpenClaw now?
Absolutely. With the introduction of the new endpoints and improved model override forwarding, you can now configure the standard OpenAI Python SDK to interact directly with your OpenClaw instance. By setting the base_url to your OpenClaw gateway’s address, you can use familiar SDK calls like client.chat.completions.create() or the Responses API. The OpenClaw gateway transparently translates these requests to your configured local or remote models while strictly maintaining OpenAI’s JSON schema for both requests and responses. Streaming responses are fully supported via Server-Sent Events (SSE), and error codes are mapped to standard OpenAI HTTP status codes, ensuring compatibility with existing error handling and retry logic in your applications.
What changed in the Microsoft Teams integration?
The Microsoft Teams integration in OpenClaw v2026.3.24-beta.1 has undergone a substantial migration to the official Microsoft Teams SDK, moving away from legacy webhook implementations. This upgrade unlocks a suite of AI-agent user experience (UX) best practices, including streaming 1:1 replies for reduced perceived latency, welcome cards with prompt starters to guide users, and robust feedback mechanisms. Other improvements include typing indicators to show agent activity, native AI labeling for transparency, and support for message editing and deletion with in-thread fallbacks. These changes require updating your Azure Bot Service configuration to utilize the new webhook endpoints provided by the Teams SDK, ensuring a modern, integrated, and compliant agent experience within Teams.
How do I upgrade to v2026.3.24-beta.1?
To upgrade to OpenClaw v2026.3.24-beta.1, execute openclaw update beta from your command-line interface, or, if you’re using Docker, pull the specific image openclaw/openclaw:2026.3.24-beta.1. After updating, review your gateway.yaml configuration file and ensure the new compatibility layer is switched on via openai_compat.enabled: true under the gateway section, as shown in the migration guide above. Additionally, navigate to the Control UI to explore and configure the new skill installation workflows. For users with Microsoft Teams integrations, regenerating your bot manifest using the provided Teams SDK schema templates is a necessary step to leverage the enhanced features.