Overview

Welcome to the Fluxions API. Our hosted endpoints cover three product surfaces:

Transcription — akro-v1, our listening model: speech-to-text, speaker diarization, and non-speech events (breaths, laughter, hesitations) in one call. Production-ready today.
Text-to-Speech — hosted VUI for expressive, low-latency TTS over HTTP or WebSocket. Live today — see Speech.
Realtime Voice — OpenAI Realtime-compatible WebSocket for end-to-end streaming voice conversations. Coming soon.

This page covers the basics that apply across all surfaces: authentication, base URL, and a health check.

Authentication

Unless an endpoint states otherwise (for example, /health and the built-in Speech voices are public), API requests must be authenticated with an API key. Include your API key in the Authorization header:

curl "https://api.fluxions.ai/endpoint" \
  -H "Authorization: YOUR_API_KEY"

import requests

headers = {'Authorization': 'YOUR_API_KEY'}
response = requests.get('https://api.fluxions.ai/endpoint', headers=headers)
data = response.json()

const response = await fetch('https://api.fluxions.ai/endpoint', {
  headers: {'Authorization': 'YOUR_API_KEY'}
});
const data = await response.json();

Important: Do not add the "Bearer " prefix to an API key. Send the raw key as the value of the Authorization header.
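
The rule above can be enforced in client code before a request is sent. The following is a minimal sketch; the helper name api_key_headers is hypothetical and not part of any Fluxions SDK.

```python
def api_key_headers(api_key: str) -> dict[str, str]:
    """Build request headers that carry a raw API key.

    Hypothetical helper: rejects keys that were pasted with a "Bearer "
    prefix, since this page says API keys are sent without it.
    """
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    if key.lower().startswith("bearer "):
        raise ValueError('API keys are sent without the "Bearer " prefix')
    return {"Authorization": key}


headers = api_key_headers("YOUR_API_KEY")
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.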

Base URL

https://api.fluxions.ai

GET /health — Health Check

Check the API status and the gateway serving the request. No authentication required.

Request

curl "https://api.fluxions.ai/health"

import requests

response = requests.get('https://api.fluxions.ai/health')
data = response.json()
print(f"Status: {data['status']}, Gateway: {data['gateway']}")

const response = await fetch('https://api.fluxions.ai/health');
const data = await response.json();
console.log(`Status: ${data.status}, Gateway: ${data.gateway}`);

Response

{
  "status": "ok",
  "gateway": "api.fluxions.ai"
}

Transcription

Our akro-v1 model is a comprehensive listening model that performs:

Transcription — Convert speech to text with high accuracy
Speaker Diarization — Identify and separate different speakers ("who said what")
Non-Speech Detection — Capture breathing, laughter, hesitation, and other contextual sounds

This makes it ideal for transcribing meetings, interviews, podcasts, and any audio where understanding the full context matters.

All transcription endpoints require authentication — see Overview for API key setup.

Pricing: $0.20 per hour of audio processed, billed by the second. See pricing.
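
As a rough illustration of the pricing above, the sketch below estimates the charge for a clip. It assumes billing is strictly linear in seconds; how partial seconds are rounded is not stated here, so treat the result as an estimate. The function name is hypothetical.

```python
PRICE_PER_HOUR_USD = 0.20  # rate quoted in the pricing line above


def transcription_cost_usd(audio_seconds: float) -> float:
    """Estimate the charge for a clip, assuming linear per-second billing."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds * PRICE_PER_HOUR_USD / 3600


# A 30-minute recording (1800 seconds) is half an hour of audio.
print(f"{transcription_cost_usd(1800):.2f}")  # prints 0.10
```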

POST /submit — Submit Transcription

Submit audio for processing and receive a job ID immediately. Poll /transcriptions/{id} for results including transcription, speaker diarization, and non-speech events.

Parameters

Parameter	Type	Default	Description
`non_speech`	boolean	`false`	Include non-speech sounds
`filename`	string	`"audio"`	Name for the uploaded file
`cache`	boolean	`true`	Use cached results for identical files

Request

Body: raw audio file bytes.

curl -X POST "https://api.fluxions.ai/akro/submit" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: audio/mpeg" \
  --data-binary @audio.mp3

import requests

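with open('audio.mp3', 'rb') as f:
    response = requests.post(
        'https://api.fluxions.ai/akro/submit',
        headers={
            'Authorization': 'YOUR_API_KEY',
            'Content-Type': 'audio/mpeg',  # same content type as the curl example
        },
        data=f  # send the raw file bytes as the request body
    )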

job = response.json()
job_id = job['id']

// audioFile is a File or Blob, for example from an <input type="file">
const response = await fetch('https://api.fluxions.ai/akro/submit', {
  method: 'POST',
  headers: {
    'Authorization': 'YOUR_API_KEY',
    'Content-Type': 'audio/mpeg'
  },
  body: audioFile  // raw audio bytes, not multipart form data
});

const job = await response.json();
const jobId = job.id;

Response

{
  "id": 124,
  "status": "submitted",
  "created_at": "2025-10-24T10:35:00.000Z",
  "original_audio_url": "https://...",
  "query_urls": {
    "get": "https://api.fluxions.ai/transcriptions/124",
    "status": "https://api.fluxions.ai/transcriptions/124"
  },
  "cached": false
}

Workflow

Submit audio via /submit and receive a job ID
Poll /transcriptions/{id} to check the job's status
When status is "completed", retrieve the full results
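
The workflow above can be sketched as a small polling loop. This is an illustration, not an official client: the function names, the 2-second interval, and the 10-minute timeout are assumptions, and the fetch function is injected so the loop can be exercised without network access.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.fluxions.ai"


def fetch_job(job_id, api_key):
    """GET /transcriptions/{id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/transcriptions/{job_id}",
        headers={"Authorization": api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_transcription(job_id, fetch, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job is completed or failed.

    `fetch` is any callable that maps a job id to the job's JSON dict.
    `interval` and `timeout` are illustrative defaults, not documented limits.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        if job["status"] == "completed":
            return job
        if job["status"] == "failed":
            raise RuntimeError(job.get("error_message", "transcription failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']!r} after {timeout}s")
        sleep(interval)


# Usage (performs real requests, so it is left commented out):
# result = wait_for_transcription(124, lambda job_id: fetch_job(job_id, "YOUR_API_KEY"))
# print(result["text"])
```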

GET /transcriptions/{id} — Get Transcription Results

Retrieve the full results for a specific job: transcription, speaker diarization, and non-speech events.

Parameters

Parameter	Type	Default	Description
`word_level_timestamps`	boolean	`false`	Include word-level timestamps in segments

Request

curl "https://api.fluxions.ai/transcriptions/124" \
  -H "Authorization: YOUR_API_KEY"

import requests

response = requests.get(
    'https://api.fluxions.ai/transcriptions/124',
    headers={'Authorization': 'YOUR_API_KEY'}
)

result = response.json()
if result['status'] == 'completed':
    print(result['text'])

const response = await fetch(
  'https://api.fluxions.ai/transcriptions/124',
  {
    headers: {'Authorization': 'YOUR_API_KEY'}
  }
);

const result = await response.json();
if (result.status === 'completed') {
  console.log(result.text);
}

Response

{
  "id": 124,
  "status": "completed",
  "created_at": "2025-10-24T10:35:00.000Z",
  "updated_at": "2025-10-24T10:35:20.000Z",
  "filename": "interview.mp3",
  "audio_duration": 300.0,
  "audio_format": "opus",
  "processing_time": 245.5,
  "language": "en",
  "non_speech": false,
  "num_chunks": 11,
  "num_segments": 25,
  "num_speakers": 2,
  "text": "SPEAKER_0: Yeah, let's actually start off exactly, where we initially began.\nSPEAKER_1: Sounds perfect. That makes complete sense to me.\nSPEAKER_0: So I started thinking about what if this is just a construct?",
  "segments": [
    {
      "speaker": "0",
      "text": "Yeah, let's actually start off exactly, where we initially began.",
      "start": 0.86,
      "end": 6.42,
      "segment_idx": 0
    },
    {
      "speaker": "1",
      "text": "Sounds perfect",
      "start": 6.0,
      "end": 7.2,
      "segment_idx": 0
    },
    {
      "speaker": "1",
      "text": "That makes complete sense to me.",
      "start": 7.5,
      "end": 9.8,
      "segment_idx": 1
    }
  ],
  "audio_url": "https://...r2.cloudflarestorage.com/...",
  "cached": true
}

Status Values

submitted — Job has been submitted
processing — Transcription in progress
completed — Transcription finished successfully
failed — Transcription failed (check error_message)

GET /transcriptions — List Transcriptions

List all transcriptions for your account.

Parameters

Parameter	Type	Default	Description
`limit`	integer	`50`	Number of results per page (max: 100)
`offset`	integer	`0`	Pagination offset

Request

curl "https://api.fluxions.ai/transcriptions?limit=10&offset=0" \
  -H "Authorization: YOUR_API_KEY"

import requests

response = requests.get(
    'https://api.fluxions.ai/transcriptions',
    headers={'Authorization': 'YOUR_API_KEY'},
    params={'limit': 10, 'offset': 0}
)

data = response.json()
print(f"Total: {data['total']}, Found: {len(data['transcriptions'])} transcriptions")
for t in data['transcriptions']:
    print(f"  ID {t['id']}: {t['filename']} - {t['status']}")

const response = await fetch(
  'https://api.fluxions.ai/transcriptions?limit=10&offset=0',
  {
    headers: {'Authorization': 'YOUR_API_KEY'}
  }
);

const data = await response.json();
console.log(`Total: ${data.total}, Found: ${data.transcriptions.length} transcriptions`);
data.transcriptions.forEach(t => {
  console.log(`  ID ${t.id}: ${t.filename} - ${t.status}`);
});

Response

{
  "total": 150,
  "limit": 10,
  "offset": 0,
  "transcriptions": [
    {
      "id": 150,
      "status": "completed",
      "created_at": "2025-10-24T10:40:00.000Z",
      "filename": "interview.mp3",
      "audio_duration": 1800.0,
      "audio_format": "opus",
      "processing_time": 45.2,
      "num_speakers": 2,
      "num_segments": 142,
      "original_audio_url": "https://...",
      "language": "en"
    }
  ]
}

Response Format

Text Field

The text field contains the full transcription with speaker labels and optional non-speech events:

Speaker Labels: SPEAKER_0:, SPEAKER_1:, etc. prefix each speaker's utterances
Line Breaks: Newlines (\n) separate different speaker turns
Non-speech Events: When enabled, events like [breath], [pause] appear inline

Example:

SPEAKER_0: Yeah, let's start [breath] where we began.
SPEAKER_1: Sounds good. That makes sense.
SPEAKER_0: So I was thinking about [pause] what if this is a construct?
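
To work with the text field programmatically, the speaker-label format described above can be split into turns. The following is a minimal sketch; parse_turns is a hypothetical helper, and attaching unlabeled lines to the previous turn is an assumption, since this page only describes one line per turn.

```python
import re

# Matches lines such as "SPEAKER_0: Yeah, let's start where we began."
TURN_RE = re.compile(r"^SPEAKER_(\d+):\s*(.*)$")


def parse_turns(text: str) -> list[tuple[str, str]]:
    """Split a `text` field into (speaker_id, utterance) pairs."""
    turns: list[tuple[str, str]] = []
    for raw_line in text.split("\n"):
        line = raw_line.strip()
        match = TURN_RE.match(line)
        if match:
            turns.append((match.group(1), match.group(2)))
        elif turns and line:
            # Assumption: a line without a label continues the previous turn.
            speaker, said = turns[-1]
            turns[-1] = (speaker, f"{said} {line}")
    return turns


sample = (
    "SPEAKER_0: Yeah, let's start [breath] where we began.\n"
    "SPEAKER_1: Sounds good. That makes sense."
)
for speaker, said in parse_turns(sample):
    print(speaker, said)
```

Speaker IDs are returned as strings so they compare equal to the speaker values in the segments array.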

Segments Array

The segments array provides precise timing and speaker information for each utterance:

speaker: Speaker ID as a string ("0", "1", etc.)
text: The spoken text for this segment (without non-speech events)
start: Start time in seconds (decimal precision)
end: End time in seconds (decimal precision)
segment_idx: Sequential index for this segment
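
As an example of using these fields, the sketch below totals speaking time per speaker from a segments array. The helper name is hypothetical; overlapping segments are counted once per speaker, so the totals can add up to more than the audio duration.

```python
from collections import defaultdict


def talk_time_by_speaker(segments: list[dict]) -> dict[str, float]:
    """Sum (end - start) in seconds for each speaker ID."""
    totals: dict[str, float] = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


segments = [
    {"speaker": "0", "text": "Yeah, let's actually start off.", "start": 0.86, "end": 6.42},
    {"speaker": "1", "text": "Sounds perfect", "start": 6.0, "end": 7.2},
    {"speaker": "1", "text": "That makes complete sense to me.", "start": 7.5, "end": 9.8},
]
for speaker, seconds in talk_time_by_speaker(segments).items():
    print(f"speaker {speaker}: {seconds:.2f} s")
```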

Non-Speech Events

When non_speech=true, our listening model captures various non-speech sounds and events that provide additional context to the conversation.

Common Non-Speech Sounds

Event	Tag	Description	Example Usage
Breath	`[breath]`	Audible breathing sounds	`...end of sentence. [breath] Now this is important.`
Laugh	`[laugh]` or `hahaha`	Laughter - can be written as text or tagged for longer laughs	`Oh wow! hahaha [breath] that's hilarious.`
Hesitation	`[hesitation]` or `[hesitate]`	Unclear thinking noises or mouth sounds while pausing - not specific words	`Well [hesitation] um I'm not really sure.`
Pause	`[pause]`	Unnaturally long, noticeable pause (e.g., looking something up)	`Let me just uh... [pause] Let me look this up.`
Environment	`[env]`	Background noise or environmental sounds	`I was thinking [env] about what you said.`
Tut	`[tut]`	Tongue click or lip smack sound	`[tut] That's not quite right.`
Sigh	`[sigh]`	Expressive exhale sound	`[sigh] I suppose you're right.`
Sniff	`[sniff]`	Nasal inhale or sniffing sound	`[sniff] Something smells good in here.`
Cough	`[cough]`	Coughing sound	`Sorry, excuse me [cough] as I was saying...`

Usage Notes

Non-speech events are placed inline with the transcribed text
Events appear at their natural position in the conversation flow
Word elongation is marked with ellipsis: um... so... I think...
Emphasis on words uses asterisks: I *really* think so
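
If you need the spoken words and the non-speech events separately, the bracketed tags can be pulled out with a regular expression. This is a sketch under the assumption that the tag list in the table above is complete; split_events is a hypothetical helper, and laughter written as plain text (hahaha) is left in place because it is not tagged.

```python
import re

# Tags taken from the table above.
KNOWN_TAGS = {
    "breath", "laugh", "hesitation", "hesitate", "pause",
    "env", "tut", "sigh", "sniff", "cough",
}
TAG_RE = re.compile(r"\[([a-z]+)\]")


def split_events(text: str) -> tuple[str, list[str]]:
    """Return (text without known tags, list of tags in order of appearance)."""
    events = [m.group(1) for m in TAG_RE.finditer(text) if m.group(1) in KNOWN_TAGS]
    # Remove only known tags; unknown bracketed text is left untouched.
    spoken = TAG_RE.sub(lambda m: "" if m.group(1) in KNOWN_TAGS else m.group(0), text)
    # Collapse the double spaces left behind, keeping newlines between turns.
    spoken = re.sub(r"[ \t]{2,}", " ", spoken).strip()
    return spoken, events


spoken, events = split_events("Sorry, excuse me [cough] as I was saying...")
print(spoken)   # Sorry, excuse me as I was saying...
print(events)   # ['cough']
```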

Speech

Hosted VUI — expressive, low-latency text-to-speech. Send text, get back audio in a natural voice, with support for non-verbal cues like [sigh] and [laugh].

Two ways to render text:

HTTP (POST /v1/tts) — one request, one render. Simplest to integrate.
WebSocket (/v1/tts/ws) — keep a warm socket open across renders so each one skips the TLS/TCP handshake and reaches first audio sooner. Use this for interactive UIs.

Pricing: $10 per 1M characters (≈ $0.45 per hour of audio). See pricing.
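
The two figures in the pricing line are related by simple arithmetic, sketched below. The function names are hypothetical, and counting every character of input (including cue tags such as [sigh]) is an assumption, since this page does not say exactly which characters are billed.

```python
PRICE_PER_MILLION_CHARS_USD = 10.0       # from the pricing line above
APPROX_PRICE_PER_AUDIO_HOUR_USD = 0.45   # from the pricing line above


def tts_cost_usd(text: str) -> float:
    """Estimate render cost, assuming every character of `text` is billed."""
    return len(text) * PRICE_PER_MILLION_CHARS_USD / 1_000_000


def implied_chars_per_audio_hour() -> float:
    """Characters per hour of audio implied by the two quoted prices."""
    return APPROX_PRICE_PER_AUDIO_HOUR_USD / PRICE_PER_MILLION_CHARS_USD * 1_000_000


print(f"{tts_cost_usd('x' * 1000):.2f}")          # prints 0.01
print(round(implied_chars_per_audio_hour()))      # prints 45000
```

At roughly 45,000 characters per hour, the quoted rates correspond to about 12.5 characters per second of speech.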

Base URL

Speech is served through the unified Fluxions API gateway under the /vui namespace:

https://api.fluxions.ai/vui

Authentication

Built-in voices are public — no API key needed. A private voice you've cloned requires your credential in the Authorization header (Bearer <token>). See Voices below.

GET /voices — List Voices

List the built-in voices available to everyone. No authentication required.

Request

curl "https://api.fluxions.ai/vui/voices"

import requests

voices = requests.get('https://api.fluxions.ai/vui/voices').json()['voices']
for v in voices:
    print(v['voice_id'], '—', v['preview_text'][:50])

const { voices } = await fetch('https://api.fluxions.ai/vui/voices').then(r => r.json());
voices.forEach(v => console.log(v.voice_id, '—', v.preview_text.slice(0, 50)));

Response

{
  "voices": [
    { "voice_id": "maeve.h736bab09a", "preview_text": "I just, I want you to know how proud I am of you..." },
    { "voice_id": "abraham.h736bab09a", "preview_text": "I've finished analysing the document you uploaded..." },
    { "voice_id": "harry.h736bab09a", "preview_text": "Hello, this is Harry. I'm calling you..." }
  ]
}

Pass any voice_id as the voice field when rendering.

POST /v1/tts — Render (HTTP)

Synthesize speech from text. Returns a complete WAV by default, or streams audio chunk-by-chunk when stream=1.

Parameters

JSON body:

Parameter	Type	Default	Description
`voice`	string	(required)	A `voice_id` from `GET /voices`
`input`	string	(required)	Text to speak. Supports non-verbal cues (see below)
`temperature`	float	`0.9`	Sampling temperature — higher is more varied
`response_format`	string	`"wav"`	`"wav"` (complete file) or `"pcm"` (raw s16le @ 24 kHz)
`stream`	boolean	`false`	Stream audio as it's generated instead of buffering the whole file
`max_secs`	float	(auto)	Hard ceiling on output length. Auto-estimated from text length when omitted
`verify_chunks`	boolean	`true`	Re-checks each rendered chunk with a fast speech-to-text pass and re-renders any that misread the text. Improves reliability at the cost of latency. Set `false` for the lowest-latency stream (see Streaming)

Request

curl -X POST "https://api.fluxions.ai/vui/v1/tts" \
  -H "Content-Type: application/json" \
  -d '{"voice": "maeve.h736bab09a", "input": "[sigh] fine, I will say it one more time."}' \
  --output speech.wav

import requests

r = requests.post(
    'https://api.fluxions.ai/vui/v1/tts',
    json={'voice': 'maeve.h736bab09a', 'input': '[sigh] fine, I will say it one more time.'}
)
with open('speech.wav', 'wb') as f:
    f.write(r.content)

const r = await fetch('https://api.fluxions.ai/vui/v1/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ voice: 'maeve.h736bab09a', input: '[sigh] fine, I will say it one more time.' })
});
const wav = await r.blob();
const url = URL.createObjectURL(wav);
new Audio(url).play();

Response

200 OK with the audio bytes. Content-Type is audio/wav (or audio/L16 when response_format is "pcm").

Streaming

Add stream=1 (query param or body field) to receive audio as it's generated, delivered using chunked transfer encoding.

By default (verify_chunks: true) each chunk is checked — and re-rendered if it misreads the text — before it streams, so the first audio lands once the first chunk is rendered and verified (~1 s for a typical sentence). Set verify_chunks: false to stream each chunk the instant the model produces it, unverified: first bytes then land within ~80 ms.

curl -X POST "https://api.fluxions.ai/vui/v1/tts?stream=1" \
  -H "Content-Type: application/json" \
  -d '{"voice": "maeve.h736bab09a", "input": "Streaming starts playing almost immediately."}' \
  --output speech.wav

import requests

with requests.post(
    'https://api.fluxions.ai/vui/v1/tts?stream=1',
    json={'voice': 'maeve.h736bab09a', 'input': 'Streaming starts playing almost immediately.'},
    stream=True,
) as r, open('speech.wav', 'wb') as f:
    for chunk in r.iter_content(chunk_size=8192):
        f.write(chunk)

const r = await fetch('https://api.fluxions.ai/vui/v1/tts?stream=1', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ voice: 'maeve.h736bab09a', input: 'Streaming starts playing almost immediately.' }),
});
const reader = r.body.getReader();
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  // `value` is a Uint8Array chunk of the streaming WAV — append or play as it arrives
}

WebSocket /v1/tts/ws — Render (warm socket)

Identical render logic to POST /v1/tts, but the socket stays open between renders. Hold it open and the TLS/TCP/tunnel handshake is paid once — each subsequent speak goes straight to synthesis. Ideal for typing UIs or back-to-back lines.

Audio is delivered as binary frames of s16le PCM, mono, 24 kHz (no WAV header — assemble it yourself if you need a file).

Protocol

Client → server (text JSON):

{ "type": "speak", "voice": "<id>", "input": "<text>", "temperature": 0.9, "max_secs": 0, "verify_chunks": true, "token": "Bearer <jwt>" }
{ "type": "session.close" }

temperature, max_secs, and verify_chunks are optional. verify_chunks defaults to true; set it false for the lowest-latency stream (see Streaming).

Authentication. Built-in voices are public — omit token. A private cloned voice needs token set to the same value you'd put in the Authorization header: Bearer <clerk-jwt> for a signed-in session, or your raw API key. It rides in the speak message because browsers can't set headers on a WebSocket. The token is checked per speak, so you can mix public and private voices on one socket.

Server → client:

Message	Meaning
`{"type": "start"}`	The worker stream opened — audio frames follow
(binary frame)	A chunk of s16le PCM @ 24 kHz
`{"type": "done"}`	Current render finished — socket stays open for the next `speak`
`{"type": "error", "message": "..."}`	Render failed (socket stays open)

One render = one speak → start → binary PCM* → done. Send another speak on the same socket to render again.
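
The message sequence above can be handled with a small collector that is fed every server message in order. This is a sketch, not part of any official client: the RenderCollector class is hypothetical and works independently of the WebSocket library, taking each message as bytes (binary frame) or str (JSON text frame).

```python
import json


class RenderCollector:
    """Tracks one speak -> start -> binary PCM* -> done exchange."""

    def __init__(self) -> None:
        self.pcm = bytearray()   # s16le PCM @ 24 kHz, accumulated across frames
        self.started = False
        self.finished = False

    def feed(self, message) -> bool:
        """Consume one server message; return True once the render is done."""
        if isinstance(message, (bytes, bytearray)):
            self.pcm += message
            return False
        event = json.loads(message)
        kind = event.get("type")
        if kind == "start":
            self.started = True
        elif kind == "done":
            self.finished = True
        elif kind == "error":
            # The socket stays open after an error, so the caller may retry.
            raise RuntimeError(event.get("message", "render failed"))
        return self.finished


collector = RenderCollector()
for frame in ['{"type": "start"}', b"\x01\x00", b"\x02\x00", '{"type": "done"}']:
    if collector.feed(frame):
        break
print(len(collector.pcm))  # prints 4
```

Create a new collector for each speak so that audio from consecutive renders on the same socket is not mixed.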

Request

import asyncio, json, websockets

async def render(text, voice='maeve.h736bab09a'):
    pcm = bytearray()
    async with websockets.connect('wss://api.fluxions.ai/vui/v1/tts/ws') as ws:
        await ws.send(json.dumps({'type': 'speak', 'voice': voice, 'input': text}))
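        async for msg in ws:
            if isinstance(msg, bytes):
                pcm += msg                      # s16le PCM @ 24 kHz
                continue
            event = json.loads(msg)
            if event['type'] == 'done':
                break
            if event['type'] == 'error':        # render failed; without this the loop would wait forever
                raise RuntimeError(event['message'])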
    return bytes(pcm)

audio = asyncio.run(render('[sigh] so you want to force me to say things.'))

const ws = new WebSocket('wss://api.fluxions.ai/vui/v1/tts/ws');
ws.binaryType = 'arraybuffer';

const chunks = [];
ws.addEventListener('open', () => {
  ws.send(JSON.stringify({ type: 'speak', voice: 'maeve.h736bab09a', input: '[laugh] oh, you are serious?' }));
});
ws.addEventListener('message', (ev) => {
  if (typeof ev.data === 'string') {
    const m = JSON.parse(ev.data);
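    if (m.type === 'done') {
      // chunks now hold the full s16le PCM @ 24 kHz; feed to WebAudio or wrap in a WAV
      ws.send(JSON.stringify({ type: 'session.close' }));
    } else if (m.type === 'error') {
      console.error('Render failed:', m.message); // socket stays open for another speak
    }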
    return;
  }
  chunks.push(new Int16Array(ev.data)); // raw PCM frame
});

Non-Verbal Cues

Wrap a cue in square brackets inside input and the model renders it as an expressive sound rather than reading the word aloud:

Cue	Effect
`[sigh]`	Audible sigh
`[laugh]`	Laughter
`[gasp]`	Sharp intake of breath
`[sniff]`	Sniffle
`[cough]`	Cough
`[hesitate]`	Filler / thinking sound

Example: "[gasp] you did NOT just put pineapple on that pizza! [laugh] okay, okay."

Voices

Built-in voices (GET /voices) are public. You can also clone a custom voice from a short reference clip. Cloned voices are private to your account and require your credential on every render: send it in the Authorization header as Bearer <token> for HTTP, or in the token field of the speak message for the WebSocket.

POST /v1/voices — Clone a Voice

Upload a reference clip plus its transcript; the model encodes a private voice you can render with. Requires authentication. Sent as multipart/form-data.

Field	Type	Required	Description
`audio`	file	yes	Reference clip (`wav`/`opus`/etc.). A few clean seconds is enough. Max 25 MB
`text`	string	no	Exact transcript of the reference clip. Omit it and we transcribe the clip for you before cloning
`name`	string	no	Display label (defaults to the filename)

Leave text out and the server runs your clip through transcription automatically — so the simplest clone is just an audio file. Pass text yourself when you want exact control over the transcript.

curl -X POST "https://api.fluxions.ai/vui/v1/voices" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "audio=@reference.wav" \
  -F "text=This is exactly what the reference clip says." \
  -F "name=My Voice"

import requests

with open('reference.wav', 'rb') as clip:  # close the file once the upload finishes
    r = requests.post(
        'https://api.fluxions.ai/vui/v1/voices',
        headers={'Authorization': 'Bearer YOUR_TOKEN'},
        data={'text': 'This is exactly what the reference clip says.', 'name': 'My Voice'},
        files={'audio': ('reference.wav', clip, 'audio/wav')},
    )
voice_id = r.json()['voice_id']

const fd = new FormData();
fd.append('audio', fileInput.files[0]);
fd.append('text', 'This is exactly what the reference clip says.');
fd.append('name', 'My Voice');

const { voice_id } = await fetch('https://api.fluxions.ai/vui/v1/voices', {
  method: 'POST',
  headers: { Authorization: 'Bearer YOUR_TOKEN' },
  body: fd,
}).then(r => r.json());

Response: { "voice_id": "u-<user>-<hash>", "name": "My Voice", "frames": 173, "seconds": 13.8 }. Pass the returned voice_id as voice in any render call (with your token).

GET /v1/voices/mine — List Your Cloned Voices

curl "https://api.fluxions.ai/vui/v1/voices/mine" \
  -H "Authorization: Bearer YOUR_TOKEN"

import requests

voices = requests.get(
    'https://api.fluxions.ai/vui/v1/voices/mine',
    headers={'Authorization': 'Bearer YOUR_TOKEN'},
).json()['voices']

const { voices } = await fetch('https://api.fluxions.ai/vui/v1/voices/mine', {
  headers: { Authorization: 'Bearer YOUR_TOKEN' },
}).then(r => r.json());

Returns { "voices": [ { "voice_id": "u-...-ab12cd34", "name": "My Voice" } ] }.

POST /v1/voices/delete — Remove a Cloned Voice

curl -X POST "https://api.fluxions.ai/vui/v1/voices/delete" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"voice_id": "u-...-ab12cd34"}'

import requests

requests.post(
    'https://api.fluxions.ai/vui/v1/voices/delete',
    headers={'Authorization': 'Bearer YOUR_TOKEN'},
    json={'voice_id': 'u-...-ab12cd34'},
)

await fetch('https://api.fluxions.ai/vui/v1/voices/delete', {
  method: 'POST',
  headers: { Authorization: 'Bearer YOUR_TOKEN', 'Content-Type': 'application/json' },
  body: JSON.stringify({ voice_id: 'u-...-ab12cd34' }),
});

Note: cloned voices currently live in the running server's memory, not a database — they're tied to your account but are not guaranteed to survive a server restart. Re-upload if a voice_id stops resolving.

Output Format

Sample rate: 24,000 Hz
Channels: mono
Sample format: signed 16-bit little-endian PCM
HTTP wav: PCM wrapped in a standard WAV container
HTTP pcm / WebSocket binary frames: raw s16le PCM (no header)
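
Because raw PCM responses and WebSocket frames carry no header, you need to add one yourself to produce a playable file. The sketch below does this with Python's standard wave module, using the sample rate, channel count, and sample format listed above; the function names are hypothetical.

```python
import io
import wave

SAMPLE_RATE = 24_000   # Hz
CHANNELS = 1           # mono
SAMPLE_WIDTH = 2       # bytes per sample (signed 16-bit)


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw s16le PCM in a standard WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


def pcm_duration_secs(pcm: bytes) -> float:
    """Length of the audio in seconds, derived from the byte count."""
    return len(pcm) / (SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)


one_second_of_silence = b"\x00\x00" * SAMPLE_RATE
print(pcm_duration_secs(one_second_of_silence))  # prints 1.0
with open("speech.wav", "wb") as f:
    f.write(pcm_to_wav(one_second_of_silence))
```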

History

The History API is one read-only surface over everything you've done on the platform — transcriptions, TTS renders, and voice conversations — under a single host. Use it to list, page, filter, and search your activity, and to fetch download links for the underlying audio and transcripts.

All history endpoints require authentication — see Overview for API key setup.

Base URL: https://api.fluxions.ai

One shape for everything

Every list response uses the same envelope:

{
  "object": "list",
  "page": 1,
  "limit": 20,
  "total": 137,
  "has_more": true,
  "data": [ /* items */ ]
}

Every item carries an object field telling you its type ("transcription", "tts", or "conversation") plus its native id. To fetch one item's detail, call the detail endpoint of the matching collection: /history/transcriptions/{id}, /history/tts/{id}, or /history/conversations/{id} (e.g. /history/tts/123). Timestamps are ISO-8601 UTC; costs are in US dollars.
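
The mapping from an item's object value to its detail endpoint can be sketched as a lookup table, using the detail routes documented further down this page. The helper name detail_path is hypothetical.

```python
# object value -> collection segment of the detail route
DETAIL_COLLECTIONS = {
    "transcription": "transcriptions",
    "tts": "tts",
    "conversation": "conversations",
}


def detail_path(item: dict) -> str:
    """Return the detail endpoint path for a history list item."""
    collection = DETAIL_COLLECTIONS.get(item["object"])
    if collection is None:
        raise ValueError(f"no detail endpoint known for object type {item['object']!r}")
    return f"/history/{collection}/{item['id']}"


print(detail_path({"object": "tts", "id": 123}))                     # /history/tts/123
print(detail_path({"object": "conversation", "id": "sess_abc123"}))  # /history/conversations/sess_abc123
```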

Shared query parameters

These work on every collection (and the unified feed):

Parameter	Type	Default	Description
`page`	integer	`1`	Page number (1-based)
`limit`	integer	`20`	Results per page (max: 100)
`order`	string	`desc`	Sort by time: `asc` or `desc`
`since`	string	—	Only items at/after this time (ISO-8601 or epoch seconds)
`until`	string	—	Only items at/before this time (ISO-8601 or epoch seconds)

Collection-specific filters: voice (tts, conversations), status (transcriptions), type (the unified feed).
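
Paging through any collection follows from the envelope and the shared parameters: request page 1, then keep incrementing page while has_more is true. The following is a minimal sketch; iter_history and make_fetcher are hypothetical names, and the page fetcher is injected so the loop itself does not depend on a particular HTTP client.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

BASE_URL = "https://api.fluxions.ai"


def make_fetcher(path: str, api_key: str, **params) -> Callable[[int], dict]:
    """Build a function that fetches one page of `path` with the given filters."""
    def fetch_page(page: int) -> dict:
        query = urllib.parse.urlencode({**params, "page": page})
        req = urllib.request.Request(
            f"{BASE_URL}{path}?{query}",
            headers={"Authorization": api_key},
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)
    return fetch_page


def iter_history(fetch_page: Callable[[int], dict], start_page: int = 1) -> Iterator[dict]:
    """Yield every item across pages until the envelope reports has_more = false."""
    page = start_page
    while True:
        envelope = fetch_page(page)
        yield from envelope["data"]
        if not envelope.get("has_more"):
            break
        page += 1


# Usage (performs real requests, so it is left commented out):
# fetch = make_fetcher("/history/tts", "YOUR_API_KEY", limit=100, since="2026-06-01T00:00:00Z")
# for item in iter_history(fetch):
#     print(item["object"], item["id"], item["created_at"])
```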

GET /history — Unified Feed

A merged, reverse-chronological feed across all three types. Filter the streams with type (comma-separated).

Parameters

Parameter	Type	Default	Description
`type`	string	(all)	Restrict to `transcription`, `tts`, and/or `conversation` (csv)

(plus all shared parameters above)

Request

curl "https://api.fluxions.ai/history?limit=10&type=tts,conversation" \
  -H "Authorization: YOUR_API_KEY"

import requests

r = requests.get(
    'https://api.fluxions.ai/history',
    headers={'Authorization': 'YOUR_API_KEY'},
    params={'limit': 10, 'type': 'tts,conversation'},
)
for item in r.json()['data']:
    print(item['object'], item['id'], item['created_at'])

const res = await fetch('https://api.fluxions.ai/history?limit=10&type=tts,conversation', {
  headers: {'Authorization': 'YOUR_API_KEY'}
});
const { data } = await res.json();
data.forEach(i => console.log(i.object, i.id, i.created_at));

Response

{
  "object": "list",
  "page": 1, "limit": 10, "total": 84, "has_more": true,
  "data": [
    { "object": "conversation", "id": "sess_abc", "created_at": "2026-06-29T10:40:00Z",
      "cost_usd": null, "voice": "maeve.en-us", "duration_secs": 312.4, "turn_count": 18 },
    { "object": "tts", "id": 123, "created_at": "2026-06-29T10:32:00Z",
      "cost_usd": 0.0123, "voice": "maeve.en-us", "chars": 842, "audio_secs": 58.4 }
  ]
}

The feed is lightweight: it does not include presigned download_urls. Use the typed collection or detail endpoints to get them.

GET /history/transcriptions — Transcription History

List your transcriptions.

Parameters

Parameter	Type	Default	Description
`status`	string	—	Filter by status (e.g. `completed`)
`include_download_url`	boolean	`false`	Include a presigned audio URL per item

(plus all shared parameters)

Request

curl "https://api.fluxions.ai/history/transcriptions?status=completed&limit=5" \
  -H "Authorization: YOUR_API_KEY"

import requests

r = requests.get(
    'https://api.fluxions.ai/history/transcriptions',
    headers={'Authorization': 'YOUR_API_KEY'},
    params={'status': 'completed', 'limit': 5},
)
print(r.json()['total'], 'transcriptions')

const res = await fetch('https://api.fluxions.ai/history/transcriptions?status=completed&limit=5', {
  headers: {'Authorization': 'YOUR_API_KEY'}
});
console.log((await res.json()).total, 'transcriptions');

Response

{
  "object": "list",
  "page": 1, "limit": 5, "total": 42, "has_more": true,
  "data": [
    {
      "object": "transcription",
      "id": 456,
      "created_at": "2026-06-29T10:35:00Z",
      "cost_usd": 0.10,
      "status": "completed",
      "filename": "interview.mp3",
      "audio_duration_secs": 1800.0,
      "audio_format": "opus",
      "language": "en",
      "num_speakers": 2,
      "num_segments": 142
    }
  ]
}

GET /history/transcriptions/{id} — One Transcription

Returns the full record with presigned download_url (audio), text_url, and segments_url. 404 if it isn't yours.

curl "https://api.fluxions.ai/history/transcriptions/456" \
  -H "Authorization: YOUR_API_KEY"

GET /history/tts — TTS Render History

List your text-to-speech renders.

Parameters

Parameter	Type	Default	Description
`voice`	string	—	Filter by voice id
`include_download_url`	boolean	`true`	Include a presigned Opus URL per item

(plus all shared parameters)

Request

curl "https://api.fluxions.ai/history/tts?voice=maeve.en-us&limit=10" \
  -H "Authorization: YOUR_API_KEY"

import requests

r = requests.get(
    'https://api.fluxions.ai/history/tts',
    headers={'Authorization': 'YOUR_API_KEY'},
    params={'voice': 'maeve.en-us', 'limit': 10},
)
for render in r.json()['data']:
    print(render['id'], render['chars'], render['download_url'])

const res = await fetch('https://api.fluxions.ai/history/tts?voice=maeve.en-us&limit=10', {
  headers: {'Authorization': 'YOUR_API_KEY'}
});
const { data } = await res.json();
data.forEach(r => console.log(r.id, r.chars, r.download_url));

Response

{
  "object": "list",
  "page": 1, "limit": 10, "total": 60, "has_more": true,
  "data": [
    {
      "object": "tts",
      "id": 123,
      "created_at": "2026-06-29T10:32:00Z",
      "cost_usd": 0.0123,
      "voice": "maeve.en-us",
      "chars": 842,
      "audio_secs": 58.4,
      "download_url": "https://...r2.cloudflarestorage.com/...opus"
    }
  ]
}

GET /history/tts/{id} — One Render

Returns one render with a fresh signed download_url. 404 if it isn't yours.

curl "https://api.fluxions.ai/history/tts/123" \
  -H "Authorization: YOUR_API_KEY"

GET /history/conversations — Conversation History

List your voice conversations (agent calls).

Parameters

Parameter	Type	Default	Description
`voice`	string	—	Filter by voice id

(plus all shared parameters)

Request

curl "https://api.fluxions.ai/history/conversations?limit=10" \
  -H "Authorization: YOUR_API_KEY"

import requests

r = requests.get(
    'https://api.fluxions.ai/history/conversations',
    headers={'Authorization': 'YOUR_API_KEY'},
    params={'limit': 10},
)
for c in r.json()['data']:
    print(c['id'], c['turn_count'], c['duration_secs'])

const res = await fetch('https://api.fluxions.ai/history/conversations?limit=10', {
  headers: {'Authorization': 'YOUR_API_KEY'}
});
const { data } = await res.json();
data.forEach(c => console.log(c.id, c.turn_count, c.duration_secs));

Response

{
  "object": "list",
  "page": 1, "limit": 10, "total": 23, "has_more": true,
  "data": [
    {
      "object": "conversation",
      "id": "sess_abc123",
      "created_at": "2026-06-29T10:40:00Z",
      "cost_usd": null,
      "voice": "maeve.en-us",
      "started_at": "2026-06-29T10:40:00Z",
      "ended_at": "2026-06-29T10:45:12Z",
      "duration_secs": 312.4,
      "turn_count": 18
    }
  ]
}

GET /history/conversations/{id} — One Conversation

Returns the session plus its turn-by-turn transcript. 404 if it isn't yours.

Parameters

Parameter	Type	Default	Description
`include_turns`	boolean	`true`	Include the transcript turns
`include_tool_calls`	boolean	`false`	Include tool invocations (calendar, email, …)
`turns_limit`	integer	`500`	Max turns to return (max: 2000)

Request

curl "https://api.fluxions.ai/history/conversations/sess_abc123?include_tool_calls=true" \
  -H "Authorization: YOUR_API_KEY"

Response

{
  "object": "conversation",
  "id": "sess_abc123",
  "created_at": "2026-06-29T10:40:00Z",
  "voice": "maeve.en-us",
  "duration_secs": 312.4,
  "turn_count": 18,
  "turns": [
    { "object": "conversation_turn", "id": 9001, "session_id": "sess_abc123",
      "role": "user", "text": "What's on my calendar today?", "created_at": "2026-06-29T10:40:05Z" },
    { "object": "conversation_turn", "id": 9002, "session_id": "sess_abc123",
      "role": "assistant", "text": "You have two meetings...", "created_at": "2026-06-29T10:40:08Z" }
  ],
  "tool_calls": [
    { "object": "tool_call", "id": 51, "tool": "calendar",
      "args": {"range": "today"}, "result": "2 events", "created_at": "2026-06-29T10:40:07Z" }
  ]
}

GET /history/conversations/search — Search Turns

Full-text search across your conversation turns (Postgres websearch_to_tsquery).

Parameters

Parameter	Type	Default	Description
`q`	string	(required)	Search query
`limit`	integer	`20`	Max results (max: 100)

Request

curl "https://api.fluxions.ai/history/conversations/search?q=dentist+appointment" \
  -H "Authorization: YOUR_API_KEY"

Response

{
  "object": "list",
  "query": "dentist appointment",
  "data": [
    { "object": "conversation_turn", "id": 9100, "session_id": "sess_def456",
      "role": "user", "text": "remind me about the dentist appointment",
      "created_at": "2026-06-20T14:02:00Z" }
  ]
}

GET /history/search — Cross-Domain Search

Search across your whole history in one call. Conversation turns are matched by full text; transcriptions are matched by filename (their text lives in object storage, not the database). Results are type-tagged via object.

Parameters

Parameter	Type	Default	Description
`q`	string	(required)	Search query
`limit`	integer	`20`	Max results per domain (max: 100)

Request

curl "https://api.fluxions.ai/history/search?q=interview" \
  -H "Authorization: YOUR_API_KEY"

Response

{
  "object": "list",
  "query": "interview",
  "data": [
    { "object": "conversation_turn", "id": 9200, "session_id": "sess_ghi",
      "role": "assistant", "text": "...the interview went well...", "created_at": "2026-06-25T09:00:00Z" },
    { "object": "transcription", "id": 456, "created_at": "2026-06-29T10:35:00Z",
      "status": "completed", "filename": "interview.mp3", "audio_duration_secs": 1800.0 }
  ]
}