Glossary

AI chat glossary

Plain-language definitions of the terms used across FanChat and the AI character chat space.

BYOA (Bring Your Own API key)

BYOA is a pricing model where the platform is free and the user supplies their own LLM provider API key. Inference is billed by the provider (OpenAI, Anthropic, Google, etc.) directly to the user at cost, with no platform markup and no subscription. FanChat is a BYOA platform.

BYOK (Bring Your Own Key)

BYOK is a synonym of BYOA used interchangeably in the AI tooling space: the user brings their own API key instead of paying the platform for hosted inference.

AI character

An AI character is a reusable conversation template: a system prompt defining personality, example dialogues defining response style, an optional knowledge base, mood and voice settings. The character itself runs no inference; the user who chats with it activates it with their own model and tokens.

System prompt

A system prompt is the hidden instruction sent to the LLM before the conversation, defining who the character is, how it speaks, and what it must not do. On FanChat it is generated by the character builder and can be edited manually (up to 8000 characters).

Example dialogues

Example dialogues are 6-10 sample exchanges stored with a character that teach the model its unique response style: tone, length, quirks, vocabulary. They matter more than the description for how a character actually sounds.

Local LLM

A local LLM is a language model that runs on the user’s own hardware instead of a cloud API. In a browser context, the model file is downloaded once and inference happens on the device GPU, so no conversation data leaves the machine and no API key is needed.

WebGPU inference

WebGPU inference is running a neural network inside a web browser using the WebGPU graphics API to access the GPU. FanChat uses it to run Llama 3.2 1B (ONNX, q4f16 quantization) in a Web Worker, streaming tokens with zero network traffic.

Quantization (q4f16)

Quantization compresses a model’s weights to lower precision so it fits in less memory and runs faster. q4f16 stores weights in 4-bit precision with 16-bit float activations, shrinking a 1B-parameter model to roughly 1 GB so it can run on consumer GPUs in the browser.

WebMCP

WebMCP is a draft browser API (navigator.modelContext) that lets a website register tools that local AI agents can call. FanChat exposes 11 tools (navigation, reading, writing) so a browser agent can search characters, send messages, or export conversations.

Function calling (tools)

Function calling lets an LLM invoke external tools mid-conversation: web search, URL reading, a calculator, code execution, image generation, or domain tools like flight search. The model decides when to call a tool and incorporates the result into its reply.

Platform LLM

On a BYOA platform, the platform LLM is the model the platform itself pays for to power internal features (such as guiding character creation or generating descriptions), as opposed to the chat LLM that runs user conversations on the user’s own key.

Neural TTS

Neural text-to-speech synthesizes natural-sounding speech from text using neural networks. FanChat offers 27 neural voices across 13 languages so any character message can be read aloud, with voices filtered by the character’s language.

See it in practice: FAQ, Local LLM, WebMCP.