Srisin Transcriber
Transcribe Audio v3
Edit Transcription
Transcribe Audio
— Select Folder —
|
Select MP3...
Select SRT...
Load Files
🌙
● Local Server Connected
📁 File Browser
☰
⬆ Upload
🗑 Delete
⬇ Download
media
Uploading…
Drop files here to upload
Transcribe Audio
v3
📂 Stage 1: Input Preparation
MP3 File:
— Choose MP3 from media/ —
Upload via Google File API
?
ℹ️ Auto-Processing:
Audio duration, bitrate, sample rate auto-detected on file selection
Audio format validated before upload (MP3, size, integrity)
📐 Transcribe Time Range
Start:
to
End:
✂️ Divide Into Sections
Sections:
2
3
4
5
6
8
Overlap:
sec
Splits audio into equal-length segments, transcribes each sequentially, then concatenates SRT files.
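A minimal sketch of how the section boundaries described above could be computed. The function name `split_range` and the way the overlap is applied (each section after the first starts `overlap` seconds early) are assumptions for illustration; the page does not say how the app applies the overlap.

```python
def split_range(start, end, sections, overlap=0.0):
    """Return (start, end) pairs, in seconds, for equal-length sections.

    Assumption: every section after the first is extended backwards by
    `overlap` seconds so that speech cut at a boundary is heard twice.
    """
    if sections < 1:
        raise ValueError("sections must be at least 1")
    if end <= start:
        raise ValueError("end must be greater than start")
    if overlap < 0:
        raise ValueError("overlap must not be negative")

    length = (end - start) / sections
    bounds = []
    for i in range(sections):
        seg_start = start + i * length
        # Pin the final boundary to `end` to avoid floating-point drift.
        seg_end = end if i == sections - 1 else start + (i + 1) * length
        if i > 0:
            seg_start = max(start, seg_start - overlap)
        bounds.append((seg_start, seg_end))
    return bounds


if __name__ == "__main__":
    # A 10-minute range split into 3 sections with a 5-second overlap.
    for seg in split_range(0, 600, 3, overlap=5):
        print(seg)
```

When the per-section SRT files are concatenated, cues that fall inside the overlap would appear twice, so a merge step would need to drop or reconcile them.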
🧠 Stage 2: Model & Prompt
Model:
?
Gemini 3 Flash
Gemini 3 Pro
Gemini 3.1 Pro
Prompt Template:
?
— Loading templates… —
Use System Instructions
?
Transcription Prompt:
?
Max Output Tokens:
?
Thinking Mode:
?
Off
Low (1K)
Medium (8K)
High (32K)
Custom…
Custom Budget (tokens):
Generation Parameters
Temperature:
0.2
?
Top-K:
40
?
Top-P:
0.95
?
⚡ Stage 3: Execution
Request Mode:
?
📡 Streaming (Live Preview)
⏳ Synchronous (Standard)
⏱ Sync + Extended Timeout
📦 Batch (Inline / Sync)
Retry on Failure
?
Max retries:
Pre-Call Token Count
?
API Key:
checking…
ℹ️ Auto-Processing:
Defensive JSON parsing on all Gemini responses (markdown fences, trailing commas handled)
SSE keep-alive heartbeat every 15s (prevents proxy timeout)
File API upload state verification (polls until ACTIVE, retries on FAILED)
Structured payload factory normalizes all API parameters
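As an illustration of the defensive JSON parsing item above, here is a sketch that strips a surrounding markdown fence and retries after removing trailing commas. The helper name `parse_model_json` is hypothetical and this is not the app's actual parser. The trailing-comma regex is deliberately simple and would also alter a comma followed by a closing bracket inside a string value.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built here to keep this listing readable

# Matches a response wrapped in one fenced block, with an optional language tag.
_FENCED = re.compile(
    r"^\s*" + FENCE + r"[A-Za-z0-9_-]*\s*\n(.*?)\n?\s*" + FENCE + r"\s*$",
    re.DOTALL,
)
# Matches a comma that is followed only by whitespace and a closing bracket.
_TRAILING_COMMA = re.compile(r",\s*([}\]])")


def parse_model_json(text):
    """Parse JSON from a model response, tolerating fences and trailing commas."""
    match = _FENCED.match(text)
    if match:
        text = match.group(1)
    text = text.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Second attempt: drop trailing commas, then let any remaining error propagate.
        return json.loads(_TRAILING_COMMA.sub(r"\1", text))


if __name__ == "__main__":
    raw = FENCE + 'json\n{"segments": [1, 2,],}\n' + FENCE
    print(parse_model_json(raw))
```

Trying strict parsing first means well-formed responses are never touched by the regex.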
✅ Stage 4: Verification
Completeness Check
?
Auto-Continue if Incomplete
?
Max passes:
Timing Validation
?
Thinking Token Warning
?
ℹ️ Auto-Processing:
Thinking token analysis — warns if thinking tokens exceed output tokens
SRT file saved with cost calculation and output statistics
API communication log (request/response pairs) saved to request folder
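The thinking token check listed above reduces to a single comparison. A sketch, assuming the two token counts are already available from the response's usage data; the function name and message wording are hypothetical.

```python
def thinking_token_warning(thinking_tokens, output_tokens):
    """Return a warning string if thinking tokens exceed output tokens, else None."""
    if thinking_tokens > output_tokens:
        return (
            f"Warning: thinking tokens ({thinking_tokens}) "
            f"exceed output tokens ({output_tokens})"
        )
    return None


if __name__ == "__main__":
    print(thinking_token_warning(9000, 4000))
    print(thinking_token_warning(1000, 4000))
```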
▶ Start Transcription
Estimated remaining: calculating...
📋 Batch Queue
Clear
▶ Start Batch
📊 Status Dashboard
🔄 Continuation Progress
📡 Live Stream
📊 Transcription Report
Transcribe Audio
1. Select MP3 File:
— Choose MP3 from media/ —
2. Transcription Instruction Prompt:
You are an expert Thai linguist and transcriptionist. Please transcribe the attached Thai audio file perfectly verbatim. Write down every single spoken word exactly as it is said, including filler words, casual speech, and local terms. Do not summarize or skip any parts of the audio. Transcribe this Thai audio file into SRT format. Separate each speaker into individual labels such as Speaker
CRITICAL RULE: As you transcribe, self-evaluate your accuracy. If there is any word, name, or phrase where you are unsure or your confidence is less than 80%, you MUST enclose that specific word or phrase in double quotation marks (e.g., ฉันคิดว่าเราควรไปที่ "พารากอน" ในวันพรุ่งนี้)
3. Choose Google AI Language Model:
Gemini 3 Pro (gemini-3-pro-preview)
Gemini 3 Flash (gemini-3-flash-preview)
Gemini 3.1 Pro (gemini-3.1-pro-preview)
4. Request Mode:
⏳ Synchronous (Standard)
📡 Streaming (Live Preview)
⏱ Sync + Extended Timeout
📦 Batch (Inline / Sync)
Upload via Google File API (recommended for reliability)
▶ Start Transcription
📊 Status Dashboard
📡 Live Stream
📊 Transcription Report
Original:
None
⬇
🔍
🚩
<
>
0/0
↑
↓
Corrections
✕
Replace
Replace All
Corrections:
None
⬇
⬆ Merge
📄 Text
📝 DOCX
📊 CSV
<
>
00:00:00,000
Speakers
✕
-- Saved Maps --
Apply
Save
🗑
👤 Speakers
⌨
✕
⌨ Keyboard Shortcuts