red-hat-data-services/caikit-tgis-serving

Caikit-TGIS-Serving

Caikit-TGIS-Serving is a stack that allows data scientists to perform Large Language Model (LLM) inference.

The Caikit-TGIS-Serving stack consists of these components:

  • Caikit: An AI toolkit that enables users to manage models through a set of developer-friendly APIs.
  • Caikit-nlp: A Caikit module that handles Natural Language Processing (NLP) models and tasks.
  • Text Generation Inference Server (TGIS): The runtime that loads the models and provides the inference engine.
  • KServe: A Kubernetes Custom Resource Definition that orchestrates model serving for all types of models. It includes serving runtimes that implement the loading of given types of model servers. KServe handles the lifecycle of the deployment object, storage access, and networking setup.
  • Service Mesh (Istio): The service mesh networking layer that manages traffic flows and enforces access policies.
  • Serverless (Knative): A cloud-native development model that allows for serverless deployments of models.
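
To make the division of labor concrete, here is a minimal sketch of what a client request to the Caikit layer might look like once a model is deployed. The REST path, the payload field names, and the example host and model ID are assumptions for illustration and are not stated in this README; check the deployed runtime's API before relying on them. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: the Caikit runtime exposes a REST text-generation task at this
# path. The path is not given in this README; verify it for your deployment.
TEXT_GENERATION_PATH = "/api/v1/task/text-generation"


def build_text_generation_request(base_url: str, model_id: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a text-generation call."""
    # Assumption: the payload uses "model_id" and "inputs" as field names.
    payload = {"model_id": model_id, "inputs": prompt}
    return urllib.request.Request(
        url=base_url.rstrip("/") + TEXT_GENERATION_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host and model ID, used only to show the request shape.
    request = build_text_generation_request(
        "https://example-isvc.apps.example.com",
        "example-model",
        "At what temperature does water boil?",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

In this picture, the service mesh and Knative route the request to the KServe-managed pod, Caikit handles the API call, and TGIS performs the generation. Passing the request to `urllib.request.urlopen` would send it.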

Architecture of the stack

[Diagram: KServe+Knative+Istio+Caikit_TGIS]

Installation

The procedures for installing and deploying the Caikit-TGIS-Serving stack have been tested with Red Hat OpenShift Data Science self-managed on Red Hat OpenShift Service for AWS (ROSA) and OpenShift Dedicated clusters. They have not been tested with the OpenShift Data Science managed cloud service.

Prerequisites

  • To support inferencing, your cluster needs a node with 4 CPUs and 8 GB of memory. You can adjust these settings in the spec.resources.requests section of the Serving Runtime custom resource.
  • You need cluster administrator permissions for many of the procedures (such as installing operators, setting the service mesh configuration, and enabling HTTP/2).
  • You have installed the OpenShift CLI (oc).
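
The prerequisites above can be checked with a short script. The following is a minimal sketch, assuming you have already read a node's CPU and memory values (for example, from the `status.capacity` field of the Node object) as Kubernetes quantity strings such as `4`, `3500m`, or `16Gi`. The helper names are hypothetical, only a subset of quantity suffixes is handled, and "8 GB" is interpreted as 8 × 10^9 bytes.

```python
import re
import shutil

# Multipliers for a subset of Kubernetes memory quantity suffixes.
_MEMORY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}

REQUIRED_CPUS = 4.0
REQUIRED_MEMORY_BYTES = 8 * 10**9  # assumption: "8 GB" means decimal gigabytes


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '4' or '3500m' to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '8Gi' or '16307812Ki' to bytes."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", quantity)
    if match is None:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    number, suffix = match.groups()
    if suffix == "":
        return int(number)
    if suffix not in _MEMORY_SUFFIXES:
        raise ValueError(f"unsupported memory suffix: {suffix!r}")
    return int(number) * _MEMORY_SUFFIXES[suffix]


def node_meets_prerequisites(cpu: str, memory: str) -> bool:
    """Return True if the node has at least 4 CPUs and 8 GB of memory."""
    return (
        parse_cpu(cpu) >= REQUIRED_CPUS
        and parse_memory(memory) >= REQUIRED_MEMORY_BYTES
    )


def oc_is_installed() -> bool:
    """Return True if the OpenShift CLI (oc) is found on the PATH."""
    return shutil.which("oc") is not None


if __name__ == "__main__":
    print("oc installed:", oc_is_installed())
    print("node OK:", node_meets_prerequisites("4", "16307812Ki"))
```

Cluster administrator permissions are not checked here; confirm them separately before you start the procedures.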

Procedures

As of Red Hat OpenShift Data Science (RHODS) version 2.5.0, follow the official Red Hat OpenShift Data Science documentation for up-to-date installation instructions.

For RHODS versions earlier than 2.5.0 and for Open Data Hub (ODH), there are two ways to install the KServe/Caikit/TGIS stack:

Demos

After you install the KServe/Caikit/TGIS stack, you can try these demos:

About

Fork of caikit-tgis-serving for KServe
