🔊 OpenAI Text-to-Speech Engram

Streaming Engram that converts text into PCM audio using OpenAI's realtime / text-to-speech APIs.

🌟 Highlights

- Supports both chunked audio streaming and final synthesis summaries.
- Keeps a simple runtime contract: text in, `speech.audio.*` events out.
- Uses the same OpenAI secret shape as the chat and speech-to-text engrams.
- Produces audio ready for downstream playback bridges such as LiveKit.

🚀 Quick Start

make lint
go test ./...
make docker-build

Apply `Engram.yaml`, mount an `openai` secret with `API_KEY`, and reference the template from your Story step.

⚙️ Configuration (`Engram.spec.with`)

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| `model` | string | OpenAI TTS model (e.g., `gpt-4o-mini-tts`). | `gpt-4o-mini-tts` |
| `voice` | string | Voice preset. | `alloy` |
| `format` | string | Output audio format (`pcm`, `opus`, `mp3`, `wav`, `flac`, `aac`). | `pcm` |
| `streamFormat` | string | OpenAI streaming mode (`audio` or `sse`). | `sse` |
| `sampleRate` | int | Output sample rate (Hz). | `48000` |
| `targetSampleRate` | int | Optional resampling target for PCM outputs (`0` keeps the native rate). | `0` |
| `channels` | int | Number of audio channels. | `1` |
| `speed` | number | Playback speed multiplier. | `1.0` |
| `instructions` | string | Optional style instructions for supported TTS models. | unset |
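As a rough illustration of how the table's fields and defaults fit together, the sketch below models them as a Go struct with a small validator. The `Config` type, `DefaultConfig`, and `Validate` are hypothetical names introduced here, and the positivity checks are an assumption; the engram's actual types may differ.

```go
package main

import "fmt"

// Config mirrors the Engram.spec.with table. The struct and its JSON tags
// are illustrative assumptions, not the engram's actual type.
type Config struct {
	Model            string  `json:"model"`
	Voice            string  `json:"voice"`
	Format           string  `json:"format"`
	StreamFormat     string  `json:"streamFormat"`
	SampleRate       int     `json:"sampleRate"`
	TargetSampleRate int     `json:"targetSampleRate"`
	Channels         int     `json:"channels"`
	Speed            float64 `json:"speed"`
	Instructions     string  `json:"instructions,omitempty"`
}

// DefaultConfig returns the defaults listed in the configuration table.
func DefaultConfig() Config {
	return Config{
		Model:            "gpt-4o-mini-tts",
		Voice:            "alloy",
		Format:           "pcm",
		StreamFormat:     "sse",
		SampleRate:       48000,
		TargetSampleRate: 0, // 0 keeps the native rate
		Channels:         1,
		Speed:            1.0,
	}
}

// Validate checks the enumerated fields against the values documented in the
// table. The numeric range checks are an assumption made for this sketch.
func (c Config) Validate() error {
	formats := map[string]bool{
		"pcm": true, "opus": true, "mp3": true, "wav": true, "flac": true, "aac": true,
	}
	if !formats[c.Format] {
		return fmt.Errorf("unsupported format %q", c.Format)
	}
	if c.StreamFormat != "audio" && c.StreamFormat != "sse" {
		return fmt.Errorf("unsupported streamFormat %q", c.StreamFormat)
	}
	if c.SampleRate <= 0 || c.Channels <= 0 || c.Speed <= 0 {
		return fmt.Errorf("sampleRate, channels, and speed must be positive")
	}
	if c.TargetSampleRate < 0 {
		return fmt.Errorf("targetSampleRate must be 0 or positive")
	}
	return nil
}

func main() {
	cfg := DefaultConfig()
	fmt.Println("defaults valid:", cfg.Validate() == nil)

	cfg.Format = "ogg" // not in the documented list
	fmt.Println("error:", cfg.Validate())
}
```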

🔐 Secrets

Secret `openai` must map to a Kubernetes secret containing `API_KEY`, with optional `BASE_URL`, `ORG_ID`, and `PROJECT_ID` keys.
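The required and optional keys can be checked up front. The sketch below is a minimal validator over a key/value view of the secret; `OpenAISecret` and `ParseSecret` are hypothetical names, and how the values actually reach the process (environment variables, mounted files) is not specified in this README.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// OpenAISecret holds the keys listed for the openai secret.
// The type and field names are hypothetical.
type OpenAISecret struct {
	APIKey    string // API_KEY (required)
	BaseURL   string // BASE_URL (optional)
	OrgID     string // ORG_ID (optional)
	ProjectID string // PROJECT_ID (optional)
}

// ParseSecret validates a key/value view of the secret and trims whitespace
// such as the trailing newline often left in secret values.
func ParseSecret(data map[string]string) (OpenAISecret, error) {
	key := strings.TrimSpace(data["API_KEY"])
	if key == "" {
		return OpenAISecret{}, errors.New("openai secret: API_KEY is required")
	}
	return OpenAISecret{
		APIKey:    key,
		BaseURL:   strings.TrimSpace(data["BASE_URL"]),
		OrgID:     strings.TrimSpace(data["ORG_ID"]),
		ProjectID: strings.TrimSpace(data["PROJECT_ID"]),
	}, nil
}

func main() {
	// Placeholder values only; never hard-code real credentials.
	s, err := ParseSecret(map[string]string{"API_KEY": "sk-example", "ORG_ID": "org-example"})
	fmt.Println(s.OrgID, err)

	_, err = ParseSecret(map[string]string{"BASE_URL": "https://example.invalid/v1"})
	fmt.Println(err)
}
```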

📥 Inputs

{
  "text": "Hello world",
  "voice": "alloy",
  "model": "gpt-4o-mini-tts",
  "format": "pcm",
  "speed": 1.1,
  "instructions": "Warm, concise, and conversational."
}

In addition to the fields shown above, per-request overrides accept `streamFormat`, `sampleRate`, and `channels`.
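One way to picture per-request overrides is to decode the input into pointer fields and apply only those present on top of the step-level defaults. This is a sketch under assumptions: `Settings`, `Request`, and `Resolve` are hypothetical names, only a subset of fields is modelled, and treating an empty `text` as an error is an assumption rather than documented behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Settings is a hypothetical subset of the step-level configuration.
type Settings struct {
	Model, Voice, Format, Instructions string
	Speed                              float64
}

// Request mirrors the input JSON. Pointer fields distinguish "not sent" from
// a zero value, so only fields present in the payload override the defaults.
type Request struct {
	Text         string   `json:"text"`
	Model        *string  `json:"model"`
	Voice        *string  `json:"voice"`
	Format       *string  `json:"format"`
	Speed        *float64 `json:"speed"`
	Instructions *string  `json:"instructions"`
}

// Resolve decodes one input payload and returns the text plus the effective
// settings after applying overrides to base.
func Resolve(base Settings, raw []byte) (string, Settings, error) {
	var req Request
	if err := json.Unmarshal(raw, &req); err != nil {
		return "", base, fmt.Errorf("decode input: %w", err)
	}
	if req.Text == "" {
		return "", base, errors.New("input: text is required")
	}
	if req.Model != nil {
		base.Model = *req.Model
	}
	if req.Voice != nil {
		base.Voice = *req.Voice
	}
	if req.Format != nil {
		base.Format = *req.Format
	}
	if req.Speed != nil {
		base.Speed = *req.Speed
	}
	if req.Instructions != nil {
		base.Instructions = *req.Instructions
	}
	return req.Text, base, nil
}

func main() {
	base := Settings{Model: "gpt-4o-mini-tts", Voice: "alloy", Format: "pcm", Speed: 1.0}
	text, eff, err := Resolve(base, []byte(`{"text":"Hello world","speed":1.1}`))
	fmt.Println(text, eff.Voice, eff.Speed, err)
}
```

Fields omitted from the payload keep their configured values, so a request carrying only `text` runs with the step defaults.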

📤 Outputs

{
  "type": "speech.audio.v1",
  "audio": {
    "encoding": "pcm",
    "sampleRate": 48000,
    "channels": 1,
    "data": "<base64-encoded audio>"
  }
}
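A downstream consumer might decode this event roughly as follows. The sketch assumes standard (not URL-safe) base64 and, for the duration estimate, 16-bit PCM samples; neither is stated in this README, and `AudioEvent`, `DecodeAudio`, and `PCMDuration` are hypothetical names.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"time"
)

// Audio and AudioEvent mirror the output JSON shown above.
type Audio struct {
	Encoding   string `json:"encoding"`
	SampleRate int    `json:"sampleRate"`
	Channels   int    `json:"channels"`
	Data       string `json:"data"`
}

type AudioEvent struct {
	Type  string `json:"type"`
	Audio Audio  `json:"audio"`
}

// DecodeAudio parses one speech.audio.v1 event and returns the raw audio bytes.
// Assumption: data uses standard base64 with padding.
func DecodeAudio(raw []byte) (AudioEvent, []byte, error) {
	var ev AudioEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return ev, nil, fmt.Errorf("decode event: %w", err)
	}
	if ev.Type != "speech.audio.v1" {
		return ev, nil, fmt.Errorf("unexpected event type %q", ev.Type)
	}
	pcm, err := base64.StdEncoding.DecodeString(ev.Audio.Data)
	if err != nil {
		return ev, nil, fmt.Errorf("decode audio data: %w", err)
	}
	return ev, pcm, nil
}

// PCMDuration estimates playback time for raw PCM.
// Assumption: 16-bit samples (2 bytes per sample per channel).
func PCMDuration(numBytes, sampleRate, channels int) time.Duration {
	if sampleRate <= 0 || channels <= 0 {
		return 0
	}
	frames := numBytes / (2 * channels)
	return time.Duration(frames) * time.Second / time.Duration(sampleRate)
}

func main() {
	// "AAAAAA==" is four zero bytes, standing in for real audio.
	raw := []byte(`{"type":"speech.audio.v1","audio":{"encoding":"pcm","sampleRate":48000,"channels":1,"data":"AAAAAA=="}}`)
	ev, pcm, err := DecodeAudio(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(pcm), "bytes,", PCMDuration(len(pcm), ev.Audio.SampleRate, ev.Audio.Channels))
}
```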

🔄 Streaming Mode

In streaming mode the engram emits:

| Stream type | Description |
| --- | --- |
| `speech.audio.delta` | Base64-encoded PCM chunks suitable for immediate playback. |
| `speech.audio.done.v1` | Final summary containing stream metadata. The current template/test surface still uses this event name even though Tractatus canonicalizes the shared summary type as `speech.audio.done`. |

The generated audio can be passed directly to a playback bridge such as `livekit-bridge`.
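A consumer of the stream could accumulate delta chunks until the summary arrives, along the lines of the sketch below. It accepts both `speech.audio.done.v1` and the canonical `speech.audio.done` name. The delta payload shape (an `audio.data` base64 field, borrowed from the Outputs example) is an assumption, and `StreamEvent`, `AudioChunk`, and `Collector` are hypothetical names.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// AudioChunk and StreamEvent model a stream message.
// Assumption: deltas carry base64 audio under audio.data.
type AudioChunk struct {
	Data string `json:"data"`
}

type StreamEvent struct {
	Type  string     `json:"type"`
	Audio AudioChunk `json:"audio"`
}

// Collector buffers decoded audio until the done event is seen.
type Collector struct {
	PCM    []byte
	Chunks int
}

// Handle processes one event and reports whether the stream has finished.
func (c *Collector) Handle(ev StreamEvent) (bool, error) {
	switch ev.Type {
	case "speech.audio.delta":
		chunk, err := base64.StdEncoding.DecodeString(ev.Audio.Data)
		if err != nil {
			return false, fmt.Errorf("decode delta %d: %w", c.Chunks, err)
		}
		c.PCM = append(c.PCM, chunk...)
		c.Chunks++
		return false, nil
	case "speech.audio.done.v1", "speech.audio.done":
		// Accept both the current and the canonical summary names.
		return true, nil
	default:
		// Ignore event types this sketch does not know about.
		return false, nil
	}
}

func main() {
	events := []StreamEvent{
		{Type: "speech.audio.delta", Audio: AudioChunk{Data: "AAEC"}},
		{Type: "speech.audio.delta", Audio: AudioChunk{Data: "AwQF"}},
		{Type: "speech.audio.done.v1"},
	}
	c := &Collector{}
	for _, ev := range events {
		done, err := c.Handle(ev)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		if done {
			break
		}
	}
	fmt.Println(c.Chunks, "chunks,", len(c.PCM), "bytes")
}
```

For low-latency playback, each decoded chunk would be forwarded as it arrives rather than buffered; the buffer here only keeps the example small.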

🧪 Local Development

- `make lint` – Run the shared lint and static-analysis checks.
- `go test ./...` – Run the synthesis unit/integration tests.
- `make docker-build` – Build the engram image for local clusters.

🤝 Community & Support

📄 License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
