AI inference overview

LayerRail AI inference provides each project with a small set of model endpoints, API keys for authenticating requests to them, and a playground for testing requests before you wire them into an application.

What it is for

Use AI inference when you want to:

Test text generation models.
Generate embeddings.
Build simple AI features without operating model servers.
Keep inference keys scoped to a LayerRail project.
Prototype prompts in the console.

Model catalog

The inference catalog shows available endpoints for the project. Each model card includes:

Field	Meaning
Model name	The model you pass in requests.
Provider	The system backing the endpoint.
Capability	Text generation or embeddings.
Context length	The supported input window.
Price	Input and output token pricing where configured.
URL	The endpoint URL to call from your application.
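
If an application needs to choose an endpoint programmatically, it can help to mirror a model card in code. The following is a minimal sketch rather than a LayerRail SDK type: the `ModelCard` class, its field names, the capability labels, and the helper functions are all hypothetical and simply follow the table above. Prices are optional because the catalog shows them only where configured, and treating them as per-million-token amounts is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelCard:
    """Hypothetical in-code mirror of one catalog model card."""

    model_name: str       # value passed as "model" in requests
    provider: str         # system backing the endpoint
    capability: str       # assumed labels: "text-generation" or "embeddings"
    context_length: int   # supported input window, assumed to be in tokens
    url: str              # endpoint URL to call from the application
    input_price: Optional[float] = None   # assumed: price per million input tokens
    output_price: Optional[float] = None  # assumed: price per million output tokens


def fits_context(card: ModelCard, input_tokens: int) -> bool:
    """Return True if the input fits within the card's context length."""
    return input_tokens <= card.context_length


def estimate_cost(card: ModelCard, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Estimate request cost, or return None when pricing is not configured."""
    if card.input_price is None or card.output_price is None:
        return None
    total = input_tokens * card.input_price + output_tokens * card.output_price
    return total / 1_000_000
```

Returning `None` instead of zero when pricing is missing keeps "not configured" distinct from "free".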

API keys

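Create an inference API key before sending requests, then pass it as a bearer token in the Authorization header. The following curl command sends a chat completion request: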

curl https://console.layerrail.com/ai/v1/chat/completions \
  -H "Authorization: Bearer $INFERENCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "@cf/meta/llama-3.1-8b-instruct",
    "messages": [
      { "role": "user", "content": "Write a short deployment checklist." }
    ]
  }'
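
The same request can be sent from application code. The sketch below mirrors the curl call using only the Python standard library. The endpoint URL and model name are copied from the curl example; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of any LayerRail SDK. The response is returned as parsed JSON without assuming anything about its fields.

```python
import json
import urllib.request

# Endpoint and model name are taken from the curl example above.
CHAT_URL = "https://console.layerrail.com/ai/v1/chat/completions"
MODEL = "@cf/meta/llama-3.1-8b-instruct"


def build_chat_request(api_key: str, prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, read the key from the environment (for example `os.environ["INFERENCE_API_KEY"]`) rather than hard-coding it, pass it to `build_chat_request` along with a prompt, and hand the result to `send_chat_request`.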

Next steps

Playground

API authentication