Red Hat AI Inference on Amazon EKS: Exploring the Kubernetes resources
Look inside Red Hat AI Inference on Amazon EKS to understand its core architectural components and Kubernetes resources.
Discover how to use EvalHub and OCI persistence to make your AI evaluation results immutable, content-addressable, and fully auditable.
Explore the mechanics of gradient synchronization in PyTorch distributed training, focusing on MPI primitives like All-Reduce and core techniques like pipeline parallelism, tensor parallelism, and sharded data parallelism.
Learn how speculative decoding can improve the performance of large language models (LLMs) in production by using a small, fast model to generate tokens speculatively and a large model to verify them.
Learn how to use the EvalHub CLI to automate AI evaluations in your CI/CD pipelines. Install the SDK, configure profiles, and set up a production gate.
Learn how llm-d routes each inference request to the GPU that already has the relevant data cached, cutting time-to-first-token and doubling throughput without changing hardware. Discover how Red Hat's stack packages this into a single Kubernetes resource.
Learn how to onboard a custom evaluation framework into EvalHub using one class, one method, and a container image. This guide covers the contract, data structures, and a complete minimal adapter.
Headed to WeAreDevelopers World Congress Europe 2026? Visit the Red Hat Developer booth on-site to speak to our expert technologists.
Learn how to read an existing system collection, understand its threshold logic, and build your own collection that encodes your actual measurement strategy with thresholds that mean something.
Speculators v0.5.0 introduces DFlash support, enabling single-pass draft token generation with block diffusion for more efficient speculative decoding workflows. The release also adds unified online and offline training through vLLM’s native hidden states extraction system, improving training flexibility, version stability, and production readiness.
Red Hat and DeepLearning.AI have released a free hands-on course on the full LLM
Learn how to use Red Hat OpenShift AI's reusable components to build modular AI pipelines, speed up development, and focus on what differentiates your applications.
Learn how evaluation-driven development (EDD) turns AI optimization from an art into an engineering discipline with EvalHub.
Learn about LogAn, an open source tool designed to overcome the limitations of using LLMs to analyze massive volumes of production logs.
A backend that depends on Llama Stack, or on any rapidly evolving upstream project, faces a version-drift problem. Explore our no-cost solution that provides early warnings.
Learn how an expert red-teamed an infrastructure using Red Hat AI, OpenClaw, and abliterated models on Red Hat OpenShift on IBM Cloud.
Learn how to transform a simple chatbot into an enterprise RAG application by applying metadata filtering, hybrid search, and neural reranking using the OGX framework in Red Hat OpenShift AI.
Learn how to prevent GPU waste and financial loss by implementing just-in-time (JIT) checkpointing with Kubeflow Training SDK on OpenShift AI.
Learn about the five primary structural challenges in enterprise AI evaluation and how EvalHub addresses them with a unified foundation for AI evaluation.
Learn how our team implemented CI/CD pipelines for the it-self-service-agent AI quickstart and the benefits of using CI/CD for agentic systems.
Learn how Red Hat AI can help address the security challenges of AI agents in production, from semantic malware to container escapes.
Scale agentic AI with Red Hat’s trusted software factory. Use Policy as Code and SBOMs to strengthen your development pipeline and manage software provenance.
Learn how Red Hat AI 3.4 uses EvalHub to orchestrate AI evaluations on Kubernetes. Scale frameworks like Garak and LightEval with built-in MLflow tracking.
Learn how Kagenti ADK, an open source toolkit, handles the complexities of managing production AI agents. It aligns with the Linux Foundation's Agent2Agent (A2A) protocol and provides a set of runtime services for easier deployment and operation.
Learn about our team's experience implementing a defense-in-depth safety architecture for AI agents using Llama Stack shields.