API Quickstart

Go from API key to your first request in minutes. Follow step-by-step guides to integrate Ninja's AI models into your product — with code examples, SDKs, and everything you need to start building.

Registration

Sign up at Super.MyNinja.ai to start using our APIs.

You can sign up for free or subscribe to one of Ninja's paid plans. The Ultra and Business plans give you higher-limit access to the playground, where you can experiment with our flagship, reasoning, and deep research LLMs.

When you’re ready to move from exploration to execution, purchase credits to start building AI products and experiences for coding, writing, and much more.

Purchase Credits

  • Sign up or log in at Super.MyNinja.ai

  • Go to Settings and click Enable in the On-Demand section

  • Choose your desired amount (minimum purchase is $50 USD)

  • Click Confirm and Pay

Your credits are added to your account instantly and are ready to use.

Generate an API key

  • After purchasing your credits, go to the “Manage your API keys” section on the API keys & credits page.

  • Click the “Create a new key” button.

  • In the “Name” field, enter a name for your key (e.g., Production Key) and then click “Create Key”.

  • Once your key is generated, copy it and save it somewhere secure, as anyone with access to it can use it. If needed, you can regenerate your key at any time.
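As a best practice, load your key from an environment variable rather than hardcoding it in source. A minimal sketch in Python (the NINJA_API_KEY variable name is just a convention chosen for this example, not something the API requires):

```python
import os

def load_api_key(env_var: str = "NINJA_API_KEY") -> str:
    """Read the API key from the environment so it never lands in source control."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key

def auth_header(key: str) -> dict:
    """Build the Authorization header the Ninja API expects."""
    return {"Authorization": f"Bearer {key}"}
```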

Using Ninja APIs

Overview

Base URL:
https://api.myninja.ai/v1

Authentication:
All requests must include your API key in the Authorization header, where <token-key> is your API key:
Authorization: Bearer <token-key>

Endpoint:
POST /chat/completions
This endpoint accepts chat requests and streams responses based on the specified model.

Models:
Currently available models include:
ninja-deep-research
ninja-super-agent:turbo
ninja-super-agent:apex
ninja-super-agent:reasoning

These models can be selected by providing the model parameter in your request body.

Request Format

The Ninja API follows the OpenAI API spec closely. The request JSON object contains:

model (string):
The model identifier to use (e.g., "ninja-deep-research").

messages (array):
An array of message objects. Each object follows the format:

JSON

{ "role": "user", "content": "Your query here." }
  

stream (boolean):
When set to true, the API responds with a streaming response.

stream_options (object):
Additional options for streaming, such as "include_usage": true to include token usage details.

extra_headers (object):
Any extra headers required by the Ninja API (none are required at present).

Request Example

JSON

{
  "model": "ninja-super-agent:turbo",
  "messages": [
    {
      "role": "user",
      "content": "What is the significance of birefringence in materials science? Can you provide a detailed explanation including its applications and how it is measured?"
    }
  ],
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
  

Response

The API returns a stream of ChatCompletionChunk objects. Each chunk may contain:

Content:
The text of the completion, which may include XML-wrapped reasoning steps.

Citations:
A list of URLs that support the reasoning provided.

Usage Information:
Token usage statistics if "include_usage": true was specified.

The client code can process these chunks to display a rich response that includes:

Reasoning Steps: Extracted and formatted from XML segments.

Citations: Displayed as a numbered list.

Usage Stats: Showing input and output tokens.

Sample Code Examples

Python

import time
import xml.etree.ElementTree as ET
from uuid import uuid4
from openai import OpenAI, Stream
from openai.types import CompletionUsage
from openai.types.chat import ChatCompletionChunk

# Initialize the OpenAI client with your API key and base URL.
client = OpenAI(api_key="PUT_YOUR_API_KEY_HERE", base_url="https://api.myninja.ai/v1")

def print_reasoning_step(reasoning_step: str):
    step = ET.fromstring(reasoning_step)
    print(f"[{step.find('header').text}]\n{step.find('content').text}\n")

def print_citations(citations: list[str]):
    width = len(str(len(citations)))
    print("[Citations]")
    for n, citation in enumerate(citations, 1):
        print(f"{n:{width}}. {citation}")
    print()

def print_content(content: str):
    print(content, end="", flush=True)

def print_usage(usage: CompletionUsage):
    print(
        f"\n\n| Input tokens: {usage.prompt_tokens:,} "
        f"| Output tokens: {usage.completion_tokens:,} ",
        end="",
        flush=True,
    )

def main():
    query = (
        "I am an advanced snowboarder and want a good all-mountain snowboard and new boots and "
        "I'd like to spend no more than $1000 all in. Also look at trends and reviews and give "
        "me options that are currently trendy and cool and have high reviews."
    )

    start_time = time.time()
    
    response: Stream[ChatCompletionChunk] = client.chat.completions.create(
        model="ninja-deep-research",
        messages=[
            {"role": "user", "content": query},
        ],
        stream=True,
        stream_options={
            "include_usage": True,
        },
    )

    for chunk in response:
        for choice in chunk.choices:
            if content := choice.delta.content:
                if content.startswith("<step>"):
                    print_reasoning_step(content)
                else:
                    print_content(content)

        if citations := chunk.model_extra.get("citations"):
            print_citations(citations)
        elif usage := chunk.usage:
            print_usage(usage)

    elapsed_time = time.time() - start_time
    print(f"| Elapsed: {elapsed_time:.1f} seconds |")

if __name__ == "__main__":
    main()
  
cURL

curl -N https://api.myninja.ai/v1/chat/completions \
  -H "Authorization: Bearer PUT_YOUR_API_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ninja-super-agent:turbo",
    "messages": [
      {
        "role": "user",
        "content": "What is the significance of birefringence in materials science? Can you provide a detailed explanation including its applications and how it is measured?"
      }
    ],
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'
Java

import kong.unirest.HttpResponse;
import kong.unirest.Unirest;

HttpResponse<String> response = Unirest.post("https://api.myninja.ai/v1/chat/completions")
  .header("Authorization", "Bearer PUT_YOUR_API_KEY_HERE")
  .header("Content-Type", "application/json")
  .body("{\"model\": \"ninja-super-agent:turbo\", \"messages\": [{\"role\": \"user\", \"content\": \"What is the significance of birefringence in materials science?\"}], \"stream\": true, \"stream_options\": {\"include_usage\": true}}")
  .asString();
  
JavaScript

const options = {
  method: 'POST',
  headers: {
    Authorization: 'Bearer PUT_YOUR_API_KEY_HERE',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: "ninja-super-agent:turbo",
    messages: [
      {
        role: "user",
        content: "What is the significance of birefringence in materials science? Can you provide a detailed explanation including its applications and how it is measured?"
      }
    ],
    stream: true,
    stream_options: {
      include_usage: true
    }
  })
};

// Note: with "stream": true the response is a stream of chunks, not a single JSON object.
fetch('https://api.myninja.ai/v1/chat/completions', options)
  .then(response => response.text())
  .then(data => console.log(data))
  .catch(err => console.error(err));
  
Go

package main

import (
  "bufio"
  "bytes"
  "fmt"
  "io"
  "log"
  "net/http"
)

func main() {
  url := "https://api.myninja.ai/v1/chat/completions"

  jsonStr := `{
    "model": "ninja-super-agent:turbo",
    "messages": [
      {
        "role": "user",
        "content": "What is the significance of birefringence in materials science? Can you provide a detailed explanation including its applications and how it is measured?"
      }
    ],
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }`

  req, err := http.NewRequest("POST", url, bytes.NewBuffer([]byte(jsonStr)))
  if err != nil {
    log.Fatalf("Error creating request: %v", err)
  }

  req.Header.Set("Authorization", "Bearer PUT_YOUR_API_KEY_HERE")
  req.Header.Set("Content-Type", "application/json")

  client := &http.Client{}
  resp, err := client.Do(req)
  if err != nil {
    log.Fatalf("Error sending request: %v", err)
  }
  defer resp.Body.Close()

  if resp.StatusCode != http.StatusOK {
    log.Fatalf("Request failed with status: %s", resp.Status)
  }

  reader := bufio.NewReader(resp.Body)
  for {
    line, err := reader.ReadBytes('\n')
    if err != nil {
      if err == io.EOF {
        break
      }
      log.Fatalf("Error reading response: %v", err)
    }
    fmt.Print(string(line))
  }
}
  
C#

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

namespace NinjaApiExample
{
  class Program
  {
    static async Task Main(string[] args)
    {
      var url = "https://api.myninja.ai/v1/chat/completions";
      var apiKey = "PUT_YOUR_API_KEY_HERE";

      using var client = new HttpClient();
      client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
      client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

      var payload = new
      {
        model = "ninja-super-agent:turbo",
        messages = new[]
        {
          new { role = "user", content = "What is the significance of birefringence in materials science? Can you provide a detailed explanation including its applications and how it is measured?" }
        },
        stream = true,
        stream_options = new { include_usage = true }
      };

      var jsonPayload = JsonSerializer.Serialize(payload);
      var content = new StringContent(jsonPayload, Encoding.UTF8, "application/json");

      try
      {
        var response = await client.PostAsync(url, content);
        response.EnsureSuccessStatusCode();

        var responseBody = await response.Content.ReadAsStringAsync();
        Console.WriteLine(responseBody);
      }
      catch (HttpRequestException e)
      {
        Console.WriteLine("Request error: " + e.Message);
      }
    }
  }
}
  

Pricing

Per-task pricing

Qwen 3 Coder 480B (Cerebras)
  Default for Ninja Cline AI Studio
  Price / task: $1.50

Standard mode
  Balance of quality & speed
  Default: GLM 4.6; Deep Coder: GLM 4.6
  Price / task: $1.00

Complex mode
  Highest-quality LLM for complex tasks
  Default: Sonnet 4.5; Deep Coder: Sonnet 4.5
  Price / task: $1.50

Fast mode
  Fastest general agent in the world
  Default: GLM 4.6*; Deep Research: Qwen3-235B*; Deep Coder: GLM 4.6 (*powered by Cerebras)
  Price / task: $1.50

Per-token pricing

Qwen 3 Coder 480B (Cerebras)
  Default for Ninja Cline AI Studio
  Input: $3.75 / M tokens; Output: $3.75 / M tokens

Standard mode
  Balance of quality & speed
  Default: GLM 4.6; Deep Coder: GLM 4.6
  Input: $1.50 / M tokens; Output: $1.50 / M tokens

Complex mode
  Highest-quality LLM for complex tasks
  Default: Sonnet 4.5; Deep Coder: Sonnet 4.5
  Input: $4.50 / M tokens; Output: $22.50 / M tokens

Fast mode
  Fastest general agent in the world
  Default: GLM 4.6*; Deep Research: Qwen3-235B*; Deep Coder: GLM 4.6 (*powered by Cerebras)
  Input: $3.75 / M tokens; Output: $3.75 / M tokens

Model               Input price / M tokens   Output price / M tokens
Turbo 1.0           $0.11                    $0.42
Apex 1.0            $0.88                    $7.00
Reasoning 2.0       $0.38                    $1.53
Deep Research 2.0   $1.40                    $5.60
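With per-token prices and the usage statistics returned when "include_usage": true is set, you can estimate what a request costs. A quick sketch of the arithmetic, assuming the model IDs map onto the table rows as shown (prices are taken from the table above):

```python
# Per-million-token (input, output) prices in USD, from the pricing table.
# The mapping of API model IDs to table rows is an assumption for this example.
PRICES = {
    "ninja-super-agent:turbo": (0.11, 0.42),
    "ninja-super-agent:apex": (0.88, 7.00),
    "ninja-super-agent:reasoning": (0.38, 1.53),
    "ninja-deep-research": (1.40, 5.60),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request in USD from its token counts."""
    input_price, output_price = PRICES[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000

# e.g. a Turbo request with 1,200 input and 850 output tokens
cost = estimate_cost("ninja-super-agent:turbo", 1200, 850)  # roughly $0.0005
```

The prompt_tokens and completion_tokens values come straight from the usage object in the final streamed chunk.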

Rate Limits

Ninja AI enforces per-model rate limits on inference requests so that all developers can experience fast inference.

Model               Requests per minute (RPM)
Turbo 1.0           50
Apex 1.0            20
Reasoning 2.0       30
Deep Research 2.0   5
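If you exceed a model's RPM limit, requests will be rejected, so it is worth retrying with backoff on the client side. A minimal sketch, assuming the API signals rate limiting with an HTTP 429 status code (the send_request callable is a placeholder for your own request function):

```python
import time

def with_backoff(send_request, max_retries: int = 5, base_delay: float = 1.0):
    """Call send_request(), retrying with exponential backoff on rate-limit errors.

    send_request should return a (status_code, body) tuple; a 429 status
    triggers a retry after a delay of base_delay, 2*base_delay, 4*base_delay, ...
    """
    for attempt in range(max_retries):
        status, body = send_request()
        if status != 429:
            return status, body
        # Back off before the next attempt.
        time.sleep(base_delay * (2 ** attempt))
    return status, body  # give up and return the last response
```

For sustained workloads, also consider spacing requests so you stay under the per-model RPM values in the table above.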

Build Your First App in Minutes

Describe the task. Ninja turns it into an app that runs step by step for you. No credit card required.

Ninja's SuperNinja interface showcasing the chat and tasks

FAQ

Frequently Asked Questions

Everything you need to know about Ninja API.


How can I increase my auto-pay thresholds and amount?

How can I cancel my auto-payment agreement?

How do I delete an API key?

How can I view the usage of each API request?