Where do you use AI / ASR, and where don't you?

Engagement letter discloses stage-by-stage. ASR permitted as first-pass working draft on non-confidential general content. ASR skipped entirely for confidential, privileged, or regulated content: counsel-engaged matters, regulatory interviews, board recordings, restricted ESG content. Named bench reviews and signs off every deliverable before final delivery. Raw AI output is never final product.

How does confidentiality work for live CART?

NDA from first email applies. For confidential content, ASR skipped entirely. For live CART, platform-side recording controls confirmed pre-engagement. Live CART text stream delivered to platform; not retained by Othello post-event unless client requests as deliverable.

Do you certify transcripts for court use?

Yes — certified transcripts available for court and arbitration submission. Signed declaration by named bench transcriber attesting to ≥99.5% accuracy floor with 2-pass bench review under chain-of-custody log. Suitable for Thai courts, TAI, THAC, ICC, SIAC tribunal submission. Thailand has no sworn-transcriber registry; certification on engagement-letter basis.

How does this integrate with broader Othello engagement?

Sits in interpretation column alongside five interpretation sub-modes (Simultaneous, Consecutive, Court & Legal, Remote, Whispered). Same NDA discipline, named bench, engagement-letter framework. Termbase continuity operationally built in across interpretation, transcription, and translation engagements. Recurring multi-column engagement reduces marginal cost and improves terminological consistency.


Interpretation · Transcription & Captioning ISO/IEC 20071-23 · WCAG 2.2 · ISO 17100

Bangkok · GMT+7 Named bench reviews · not raw AI

Transcription · Captioning · CART · 5 deliverable types

Transcripts and captions: institutional tier.

Audio-and-video to text deliverables — Thai ↔ English — across the five substantive transcript/caption forms: verbatim transcription for court, arbitration, and regulatory record; cleaned transcripts for board, AGM, audit committee, investor day; closed captions for pre-recorded video and conference recordings under WCAG 2.2 accessibility; translated subtitles for bilingual video deliverables; and live captioning / CART for real-time event accessibility. All operate under named in-house bench review against ISO/IEC 20071-23 and WCAG 2.2 accessibility floors, with AI-assisted first-pass disclosure baked into the engagement letter — Othello does not deliver raw machine transcripts as final product.

5 deliverable forms

≥99.5% verbatim accuracy floor

WCAG 2.2 caption accessibility

Human reviewed, not raw AI


Engagement snapshot · Named bench

Pair: Thai ↔ English
Forms: Verbatim · Cleaned · CC · Subs · CART
Standards: ISO/IEC 20071-23 · WCAG 2.2
Translated: ISO 17100 lockstep
AI policy: Disclosed · human-reviewed
Formats: DOCX · SRT · WebVTT · TTML
NDA: From first email

The bench that transcribes is the same bench that interprets and translates — termbase continuity across the engagement column. No anonymous-pool transcription, no raw-AI dump as final product, no undisclosed offshore vendor chain.

Section 01 · What it is

Five forms. One operating discipline.

Transcription and captioning cover the substantive production of text deliverables from audio and video source — five distinct deliverable forms, each with its own accuracy floor, time-coding discipline, and downstream use. The forms differ; the operating discipline does not. Named in-house bench review, NDA from first email, engagement-letter accountability, AI-disclosure transparency, termbase continuity with the interpretation and translation columns.

The five forms answer five distinct procurement questions. Verbatim transcription serves court, arbitration, deposition, and regulatory record purposes where every word — including “uh,” “um,” false starts, and overlapping speech — is preserved on the record. Cleaned transcripts serve board minutes, AGM record, audit committee documentation, and investor-day archives where the readable speaker intent matters more than the verbatim utterance. Closed captions deliver pre-recorded video accessibility under WCAG 2.2 and ISO/IEC 20071-23 — different formatting, timing, and reading-speed discipline from prose transcripts.

Translated subtitles bridge the transcription and translation columns: Thai-language source video with English subtitles, or English-language source with Thai subtitles, under ISO 17100 translation discipline combined with subtitling readability standards. Live captioning / CART (Communication Access Realtime Translation) delivers real-time captions during live events — hybrid AGM accessibility, conference accessibility, training programme accessibility — operating at simultaneous-grade cognitive load on the captioner.

Across all five forms, the substantive procurement question is accuracy floor, formatting discipline, and disclosure of AI assistance. The Othello operating discipline answers all three: named in-house bench reviews every deliverable before final delivery, AI-assisted first pass is disclosed in the engagement letter, accuracy floors are documented by deliverable type, and termbase continuity with interpretation and translation engagements is operationally built-in. The bench that transcribes a board meeting is often the bench that translated the prior quarter’s 56-1 One Report — and works from the same termbase.

What separates institutional-tier from commodity transcription

Where the procurement decision actually lives

Named in-house bench review — every deliverable reviewed by named bench · no anonymous transcription-pool intake · accountability through to founder
AI-assistance disclosure — engagement letter discloses where ASR first-pass is used and where human-only · no raw machine dump as final product
Termbase continuity — same bench, same termbase as interpretation and translation engagements · 56-1 One Report terms, IFRS S2 terms, regulatory terms all carry forward
Accuracy floor by form — verbatim ≥99.5% for legal · WCAG-floor for accessibility · documented in engagement letter
Format discipline — DOCX · SRT · WebVTT · TTML · JSON · client-format-of-record · delivery to platform integration
NDA-from-first-email — engagement-letter privilege regime · platform-side recording controls confirmed · no consumer-AI transcription tools

AI disclosure policy · operative discipline

Othello uses ASR (automatic speech recognition) as a first-pass tool for non-confidential general-content transcription where it accelerates throughput. The engagement letter discloses specifically where ASR is used and where the workflow is human-only. For confidential, privileged, or sensitive content — counsel-engaged matters, regulatory interviews, board recordings, M&A working sessions — the workflow operates human-only with no third-party AI tool ingestion. For content where ASR first-pass is used, named in-house bench reviews, corrects, and signs off on every deliverable before final delivery. Raw AI output is never the final product.

Section 02 · Standards stack

Accessibility and translation stacks. Independently verifiable.

Transcription and captioning sit at the intersection of two standards stacks: the accessibility stack (ISO/IEC 20071-23 for caption presentation, WCAG 2.2 Time-based Media for web video, and broadcasting captioning conventions where applicable) and the translation stack (ISO 17100 for translated subtitles and bilingual transcripts). For live captioning at events, AIIC professional practice overlays the simultaneous-grade cognitive discipline. The full stack is independently verifiable through the ISO Online Browsing Platform, W3C, and aiic.org.

ISO/IEC 20071-23 · ISO/IEC · International Organization for Standardization · IEC

Visual presentation of audio information (including captions and subtitles)

The substantive ISO/IEC standard governing visual presentation of caption content — font, size, positioning, contrast, reading speed, line length, and synchronisation with source audio. Originally framed for broadcast television but operationally the reference for any pre-recorded video captioning where accessibility floor matters. The procurement-grade caption presentation reference for institutional video deliverables.

Caption presentation Reading speed Line length Synchronisation

WCAG 2.2 · W3C · World Wide Web Consortium

Web Content Accessibility — time-based media

W3C Web Content Accessibility Guidelines 2.2, Guideline 1.2 Time-based Media, governs captioning of pre-recorded and live video on the web. Three success criteria are most relevant: 1.2.1 Audio-only and Video-only (Prerecorded), Level A, which requires an alternative such as a transcript for audio-only content; 1.2.2 Captions (Prerecorded), Level A; and 1.2.4 Captions (Live), Level AA, covering live audio in synchronized media. The substantive accessibility procurement anchor for web-delivered video: SET-listed issuer investor relations video, multinational global town halls, capacity-building programme materials.

1.2 Time-based media Pre-recorded captions Live captions Web accessibility

ISO 17100:2015 · ISO · International Organization for Standardization

Translation services — requirements

The ISO standard for translation services, applied to translated subtitle and bilingual transcript deliverables. Covers translator qualifications, revision discipline (independent second-translator review), terminological consistency, and quality management. For Othello, the same ISO 17100 lockstep that anchors the Technical Translation column applies to translated subtitles and bilingual transcripts — bench continuity built in, not bolted on.

Translation services Revision discipline Translated subtitles Bilingual transcripts

AIIC + CART · AIIC · Communication Access Realtime Translation tradition

AIIC discipline · CART live-captioning practice

AIIC professional practice and the CART (Communication Access Realtime Translation) tradition govern live real-time captioning at events. The cognitive form is simultaneous-grade — the captioner listens, parses, and renders to written form continuously at the source language’s natural cadence. AIIC working-time limits apply; paired delivery is the standard for sessions of any length. Live captioning at AGMs and conferences operates at the same cognitive-load and working-time discipline as booth simultaneous.

Live CART Simultaneous-grade Paired delivery AIIC working times

Verification chain

ISO/IEC 20071-23 and ISO 17100 verifiable through the ISO Online Browsing Platform, WCAG 2.2 through w3.org/WAI/standards-guidelines/wcag/, and AIIC discipline through aiic.org. Caption file format standards (WebVTT, SRT, TTML, EBU-TT) are independently published by W3C and the European Broadcasting Union. Procurement evaluation panels can cross-reference every layer of Othello’s compliance claim against the published standards without going through the vendor. Independent verification at every standards layer.

Section 03 · Five deliverable forms

Pick the form. Lock the discipline.

The five deliverable forms below answer five substantively different procurement questions. Choosing the right form is the first scoping question — different accuracy floors, different time-coding discipline, different downstream platforms, different engagement-letter terms. The form drives everything else in the engagement.

The substantive distinctions are not merely formatting. Verbatim transcription demands preservation of every utterance including hesitations and false starts, because the on-record value is fidelity to what was actually said. Cleaned transcripts remove the hesitations to deliver readable speaker intent — substantively useful for board minutes and AGM record but operationally wrong for court use. Closed captions add time-coding discipline (reading speed, line length, on-screen positioning) absent from prose transcripts. Translated subtitles compound the timing discipline with cross-language rendering at ISO 17100 lockstep. Live captioning compounds all of the above with simultaneous-grade cognitive delivery.

For procurement: the engagement letter specifies the form at scoping, and the form determines the bench skill set, accuracy floor, file format deliverables, and pricing tier. Form mismatches, such as asking for “a transcript” of a hearing when the use case is appellate record review, produce expensive corrective re-engagement.

Form 01

Verbatim transcription

Court · arbitration · regulatory · deposition record

True verbatim transcript — every utterance preserved including hesitations, false starts, interjections, overlapping speech, and speaker attributions. The substantive form for court hearings, arbitration tribunals, regulatory interviews, deposition records, and any setting where the on-record value is fidelity to what was actually said. Accuracy floor ≥99.5%; speaker attribution mandatory; time-stamping at speaker turns; certified transcript option available for court-record use.

Accuracy floor: ≥99.5%

Bench review: Mandatory · 2-pass

Format: DOCX · PDF certified

Use cases: Court · arbitration · regulatory

Form 02

Cleaned transcripts

Board · AGM · audit committee · investor day · minutes

Readable speaker-intent transcript — hesitations and false starts removed, speaker attributions preserved, content fidelity maintained. The substantive form for board minutes preparation, AGM record archives, audit committee documentation, investor-day archives, and corporate-governance record-keeping where readability matters more than verbatim utterance. Accuracy floor ≥98% on substantive content; speaker attribution preserved; section markers at agenda transitions.

Accuracy floor: ≥98% substantive

Bench review: Mandatory · 1-pass

Format: DOCX · with agenda markers

Use cases: Board · AGM · audit cttee · IR

Form 03

Closed captions (CC)

Pre-recorded video · WCAG 2.2 accessibility

Time-coded captions for pre-recorded video deliverables under WCAG 2.2 Time-based Media accessibility. The substantive form for SET-listed investor relations video, multinational global town halls, capacity-building programme materials, training and education video, podcast-and-interview video deliverables. ISO/IEC 20071-23 caption presentation discipline — reading speed ≤170 words per minute, max 32 characters per line, max 2 lines on-screen, synchronised to ≤500ms.

Reading speed: ≤170 wpm

Line length: ≤32 char · ≤2 lines

Format: SRT · WebVTT · TTML · EBU-TT

Use cases: IR video · corporate · training

Form 04

Translated subtitles

Cross-language video · ISO 17100 lockstep

Translated subtitles for cross-language video deliverables — Thai-language source with English subtitles, English source with Thai subtitles, or multilingual subtitle tracks for broader distribution. Substantively a compound deliverable combining captioning timing discipline with ISO 17100 translation lockstep. Substantive form for SET-listed issuer investor videos for global audience, multinational training material localisation, government bilateral communications, capacity-building video materials.

Translation: ISO 17100 lockstep

Timing: ≤170 wpm target lang

Format: SRT · WebVTT · TTML · multi-track

Use cases: Cross-language video · bilateral

Form 05

Live captioning · CART

Real-time event accessibility · Communication Access Realtime Translation · simultaneous-grade cognitive load

Real-time captioning during live events — hybrid AGM accessibility, conference live-captioning, training programme accessibility, multilateral hearing real-time text. The substantive form for events where accessibility floor requires synchronised text captions delivered live, or where deaf and hard-of-hearing participants are present. Cognitive form is simultaneous-grade — captioner listens, parses, and renders to written form at the source language’s natural cadence. AIIC working-time limits apply; paired delivery is the standard with rotation at agreed transitions. CART captioner is in many engagements working alongside the simultaneous interpreter, with shared briefing and termbase, in a single coordinated bench.

Cognitive form: Simultaneous-grade

Working time: Paired · ≤30 min rotation

Output: Live text stream · platform-delivered

Use cases: Hybrid AGM · conference · WCAG 1.2.4

Section 04 · Production pipeline

Six stages. Named bench at every one.

The substantive production pipeline runs from audio/video intake through final deliverable across six discrete stages. Named in-house bench operates at every stage — there is no point in the pipeline where the work is left to an anonymous platform or unsupervised AI. Stage 02 (ASR first-pass) is the optional acceleration layer for non-confidential content; for confidential content the workflow runs human-only throughout. The engagement letter discloses where stage 02 is used and where it is skipped.

Why pipeline transparency matters at procurement tier

Institutional clients procuring transcription and captioning are not buying a finished file — they are buying a chain-of-custody-traceable production process. The substantive procurement question is: what touched the audio, in what sequence, with what review, by whom, under what confidentiality regime. Vendors that obscure the production pipeline force the client to either trust on faith or run an audit-style due diligence. Othello discloses the pipeline upfront so neither is necessary.

The disclosure goes in the engagement letter. Stage-by-stage chain-of-custody, where ASR first-pass is applied and where it is not, who reviews at each stage (named bench identifiers), and what client-side audit access is available. This is operational discipline aligned with Big Four audit, Big Law privilege regime, and SEC/SET regulated content handling.
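As an illustration only, the stage-by-stage disclosure described above could be held as a structured log. The sketch below is not Othello's actual system: `CustodyLog`, `CustodyEntry`, the `engagement_id` and `reviewer_id` fields, and the choice to raise an error on a stage 02 entry are all hypothetical. It simply shows how a human-only engagement could refuse an ASR first-pass entry by construction.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Stage names follow the six-stage pipeline described in this section.
STAGES = {
    1: "Audio/video intake + format check",
    2: "ASR first-pass",
    3: "Named-bench review + correction",
    4: "Time-coding, segmentation, formatting",
    5: "QC against accuracy floor",
    6: "Delivery",
}


@dataclass
class CustodyEntry:
    stage: int
    reviewer_id: str  # hypothetical named-bench identifier
    note: str
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class CustodyLog:
    engagement_id: str  # hypothetical identifier
    human_only: bool    # True for confidential or privileged content
    entries: list = field(default_factory=list)

    def record(self, stage, reviewer_id, note=""):
        """Append one entry; reject unknown stages and disallowed ASR use."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage {stage}")
        if stage == 2 and self.human_only:
            raise PermissionError("ASR first-pass is skipped for human-only engagements")
        entry = CustodyEntry(stage, reviewer_id, note)
        self.entries.append(entry)
        return entry

    def stages_touched(self):
        """Stage numbers in the order they were recorded."""
        return [entry.stage for entry in self.entries]
```

A log built this way answers the procurement question directly: `stages_touched()` lists what touched the audio and in what sequence, and each entry names who reviewed it.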

How the pipeline scales across deliverable forms

All five deliverable forms share the same six-stage pipeline structure — but the substantive work at each stage varies by form. Verbatim transcription emphasises stage 03 (named-bench review at 2-pass discipline) and stage 05 (QC against ≥99.5% floor). Closed captions emphasise stage 04 (time-coding and synchronisation discipline) and the reading-speed floor. Translated subtitles add an ISO 17100 lockstep revision layer between stages 03 and 04. Live CART compresses stages 01-05 into real-time delivery — the captioner operates all stages concurrently under simultaneous-grade cognitive load.

For procurement: the pipeline is form-agnostic; the engagement letter specifies the deliverable form, the form determines which stages get emphasised, the standards stack applies at the stage where it is operationally engaged.

Six-stage production flow — audio in, deliverable out

Each stage runs under named in-house bench. The chain-of-custody log is retained under engagement-letter discipline.

Stage 01 · Audio/video intake + format check

Source media received under NDA · format compatibility verified (WAV / MP3 / MP4 / MOV / M4A / OGG / FLAC) · audio quality assessed (signal-to-noise, intelligibility, background noise) · speaker count enumerated · chain-of-custody log opened.

All forms

Stage 02 · ASR first-pass · optional · disclosed

Where the engagement letter permits, ASR (automatic speech recognition) generates a first-pass machine transcript as a working draft for bench review, never as final deliverable. Skipped entirely for confidential/privileged content · counsel-engaged matters · regulatory interviews · board recordings.

Disclosed · optional

Stage 03 · Named-bench review + correction

Human bench reviews entire transcript against source audio · corrects errors, hesitation handling, false starts, speaker attribution · resolves overlapping speech and unclear utterances · 2-pass review for verbatim transcription (court / arbitration / regulatory) · 1-pass for cleaned transcripts.

Mandatory · named

Stage 04 · Time-coding · segmentation · formatting

For captions and subtitles: time-coding to ≤500ms synchronisation, line-break discipline (≤32 char, ≤2 lines, ≤170 wpm), positioning markers. For transcripts: agenda-section markers, speaker-turn timestamps, certified-transcript formatting. ISO/IEC 20071-23 reference for caption presentation.

Form-specific
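The caption limits quoted for this stage (≤32 characters per line, ≤2 lines, ≤170 wpm) lend themselves to an automated pre-check before bench QC. The sketch below is a minimal illustration, not the bench's actual tooling: the `Cue` type and `check_cue` function are hypothetical names. Two limitations are assumed. Word counts rely on whitespace splitting, which works for English captions but not for Thai, where words are not space-delimited. The ≤500ms synchronisation check is omitted because it needs a reference alignment against the source audio.

```python
from dataclasses import dataclass

# Limits quoted in this section for captions and subtitles.
MAX_CHARS_PER_LINE = 32
MAX_LINES = 2
MAX_WPM = 170


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    lines: list  # list of str, one entry per on-screen line


def check_cue(cue):
    """Return a list of human-readable violations for one caption cue."""
    if cue.end_ms <= cue.start_ms:
        return ["end time must be after start time"]
    problems = []
    if len(cue.lines) > MAX_LINES:
        problems.append(f"{len(cue.lines)} lines (max {MAX_LINES})")
    for line in cue.lines:
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line has {len(line)} chars (max {MAX_CHARS_PER_LINE})")
    # Whitespace word count: valid for English text only (see lead-in).
    words = sum(len(line.split()) for line in cue.lines)
    minutes = (cue.end_ms - cue.start_ms) / 60000
    wpm = words / minutes
    if wpm > MAX_WPM:
        problems.append(f"{wpm:.0f} wpm (max {MAX_WPM})")
    return problems
```

An empty list means the cue passes these three checks; anything else is a prompt for the captioner to re-segment or re-time the cue, not an automatic rejection.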

Stage 05 · QC against accuracy floor

QC pass against documented accuracy floor (verbatim ≥99.5%, cleaned ≥98% substantive, captions WCAG 2.2 conformance, subtitles ISO 17100 revision discipline) · final spell-check, punctuation, formatting consistency · QC log entry retained in chain-of-custody.

All forms

Stage 06 · Delivery · client format-of-record

Final deliverable in client-specified format — DOCX for transcripts (with optional PDF certified), SRT / WebVTT / TTML for captions, EBU-TT for broadcast, JSON for platform-integrated deployments. Termbase carry-forward committed to engagement record for continuity with subsequent engagement.

Client format

Format reference · what gets delivered

Client format-of-record at delivery stage

DOCX · PDF

Verbatim and cleaned transcripts. Speaker attribution, timestamps, agenda markers, certified-transcript PDF option.

SRT

SubRip subtitle. Universal subtitle format. Most video editing software and platforms accept SRT natively.

WebVTT

Web Video Text Tracks. W3C-published format for HTML5 video. WCAG 2.2 web-accessibility default.

TTML

Timed Text Markup Language. W3C-published format with advanced positioning and styling; SMPTE-TT is a SMPTE profile of it. Broadcast and OTT delivery.

EBU-TT

EBU Timed Text. European Broadcasting Union format for broadcast captioning. Used in European TV and broadcast workflows.

JSON

Platform-integrated. Structured time-coded captions for direct platform integration (LMS, CMS, video platform APIs).

SCC · STL

Broadcast legacy. SCC (US broadcast) and STL (EBU subtitling) for legacy broadcast pipelines where required.

CART live

Real-time text stream. Platform-delivered live text via StreamText, 1CapApp, Zoom captioning, or client-platform integration.

Section 05 · Accuracy + AI discipline

Documented floors. Disclosed AI use.

The substantive procurement discipline for transcription and captioning is (1) documented accuracy floors by deliverable form, and (2) transparent disclosure of where AI assistance is used in the pipeline. Both go in the engagement letter at scoping. The accuracy floor determines bench review depth and QC effort; the AI disclosure determines what content moves through ASR first-pass and what does not. Vendors that decline to specify either are not operating at institutional tier.

Why floors are documented, not assumed

The phrase “high-accuracy transcription” without a documented floor is operationally meaningless. What does it mean: 95%? 99%? 99.9%? Each level corresponds to substantively different bench effort, review discipline, and downstream usability. At word level, a 95% transcript averages one error every 20 words; a 99.5% transcript averages one error every 200 words. For court-record purposes the difference is determinative; for podcast captioning a lower floor may be suitable; and for AGM minutes the floor matters less than the cleaned-presentation work.

Othello documents the accuracy floor in the engagement letter by deliverable form. Floors are not aspirational targets — they are contractual commitments verifiable by sampling QC at delivery. Client-side sampling QC at handover is welcome under engagement-letter terms.
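The arithmetic behind these floors, and one way a client might run sampling QC at handover, can be sketched as follows. The page does not say how accuracy is measured, so treating accuracy as 1 minus word error rate (word-level edit distance divided by reference length) is an assumption made here for illustration, and all function names are hypothetical. Tokenisation is by whitespace, which suits English samples but not unsegmented Thai text.

```python
def words_per_error(accuracy):
    """Average number of words between errors at a given word-level accuracy."""
    if not 0 <= accuracy < 1:
        raise ValueError("accuracy must be in [0, 1)")
    return 1 / (1 - accuracy)


def word_errors(reference, hypothesis):
    """Word-level edit distance: substitutions + insertions + deletions."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def sample_accuracy(reference, hypothesis):
    """Assumed metric: 1 - WER, measured against the reference word count."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("empty reference")
    return 1 - word_errors(reference, hypothesis) / ref_len


def meets_floor(reference, hypothesis, floor):
    """True when the sampled passage reaches the documented floor."""
    return sample_accuracy(reference, hypothesis) >= floor
```

Here `reference` would be a passage re-checked against the source audio by the client's own reviewer and `hypothesis` the delivered text for the same passage. A short sample gives only a coarse estimate, so the sample size belongs in the engagement-letter terms alongside the floor.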

Accuracy floor by deliverable form

Verbatim transcription: ≥99.5% (court · arbitration · regulatory · 2-pass bench review)
Cleaned transcripts: ≥98% (board · AGM · audit cttee · IR · substantive content)
Closed captions: WCAG 2.2 (accessibility conformance · presentation discipline · reading speed)
Translated subtitles: ISO 17100 (translation lockstep · revision · terminological consistency)
Live CART: ≥96% (real-time floor · paired delivery · NCRA-aligned discipline)

AI assistance · where it operates, where it does not

The engagement letter discloses ASR first-pass use stage by stage. The two operative principles follow.

✓ ASR first-pass permitted

Non-confidential general content

Publicly-recorded events: Public conferences, published interviews, broadcast media transcripts
Marketing & communications: Podcast captioning for marketing channels, public webinar transcripts
Training & education: Already-public capacity-building programme materials, MOOCs, published academic lectures
Client-permitted general content: Where the client engagement-letter explicitly permits ASR first-pass for the specific content tier

✗ Human-only · ASR skipped

Confidential · privileged · regulated

Counsel-engaged matters: Arbitration recordings, deposition tapes, witness preparation, M&A working sessions under privilege
Regulatory content: SEC investigation interviews, SET disciplinary hearings, BoT regulatory bilateral, MFA Saranrom diplomatic
Board & audit recordings: Board meeting recordings, audit committee deliberations, executive compensation discussions
Restricted ESG content: Pre-disclosure 56-1 One Report drafts, IFRS S2 pre-release sustainability content, internal climate-risk deliberations
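The two lists above amount to a routing rule, which can be sketched as a small decision function. This is an illustration under stated assumptions rather than a description of Othello's internal tooling: the `ContentTier` enum, the `HUMAN_ONLY` set, and `asr_permitted` are hypothetical names, and the rule that a human-only tier overrides any engagement-letter permission is an interpretation of the policy as written.

```python
from enum import Enum


class ContentTier(Enum):
    # Tiers where ASR first-pass may be permitted.
    PUBLIC_EVENT = "publicly-recorded event"
    MARKETING = "marketing and communications"
    TRAINING = "training and education"
    CLIENT_PERMITTED = "client-permitted general content"
    # Tiers where the workflow is human-only.
    COUNSEL_ENGAGED = "counsel-engaged matter"
    REGULATORY = "regulatory content"
    BOARD_AUDIT = "board and audit recording"
    RESTRICTED_ESG = "restricted ESG content"


HUMAN_ONLY = frozenset({
    ContentTier.COUNSEL_ENGAGED,
    ContentTier.REGULATORY,
    ContentTier.BOARD_AUDIT,
    ContentTier.RESTRICTED_ESG,
})


def asr_permitted(tier, letter_permits_asr):
    """Decide whether stage 02 (ASR first-pass) may run for this content.

    Human-only tiers always return False. For every other tier the
    engagement letter must explicitly permit ASR, so the default is
    to skip it.
    """
    if tier in HUMAN_ONLY:
        return False
    return bool(letter_permits_asr)
```

The function fails closed: content is routed through ASR only when the tier allows it and the engagement letter says so.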

Verbatim discipline · what gets preserved, what gets cleaned

The verbatim-vs-cleaned distinction determines the bench discipline.

Verbatim preserves

Hesitations and filler — “uh,” “um,” “well,” “you know” all preserved on the record
False starts and self-corrections — “I think we should — no, actually we need to” preserved as said
Overlapping speech — speakers talking simultaneously, marked with attribution and overlap notation
Interjections and side comments — “[crosstalk],” “[inaudible],” “[unintelligible]” markers where preserved
Non-verbal annotations — “[pause],” “[laughter],” “[paper shuffle]” where contextually material
Repetitions — “the the company” preserved if that’s what was said
Speaker attribution — every utterance labelled with speaker identifier

Cleaned removes

Hesitations and filler — “uh,” “um,” “well” removed for readable flow
False starts — completed thought rendered, abandoned starts removed
Minor self-corrections — corrected version rendered, original removed
Non-substantive repetitions — “the the company” rendered as “the company”
Side comments — non-substantive interjections removed unless agenda-relevant
Paralinguistic noise — “[paper shuffle],” “[cough]” removed unless material
Preserved: substantive content, speaker attribution, agenda flow — readable speaker intent
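The mechanical part of the verbatim-to-cleaned difference can be shown with a toy filter. The sketch below is illustrative only and is no substitute for bench judgement. It strips just the unambiguous fillers “uh” and “um” and collapses immediate word repetitions. It deliberately leaves “well” and “you know” alone, because those can carry meaning, and it would wrongly collapse legitimate doubles such as “had had”. Punctuation and capitalisation repair are left to the reviewer. The function and pattern names are hypothetical.

```python
import re

# Unambiguous fillers only; an optional trailing comma or full stop goes too.
FILLER = re.compile(r"\b(?:uh|um)\b[,.]?\s*", re.IGNORECASE)
# A word immediately repeated one or more times, e.g. "the the".
REPEAT = re.compile(r"\b(\w+)(?:\s+\1\b)+", re.IGNORECASE)


def clean_utterance(text):
    """Rough cleaned-transcript pass over a single verbatim utterance."""
    text = FILLER.sub("", text)
    text = REPEAT.sub(r"\1", text)
    # Collapse any doubled spaces left behind by the removals.
    return re.sub(r"\s{2,}", " ", text).strip()
```

Because the filter cannot tell a stutter from an intentional repetition, its output is at most a draft for the cleaned form, and the verbatim text stays the record it was derived from.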

Section 06 · Pre-engagement preparation

Briefing materials. Bench preparation. Termbase lock.

Preparation for transcription and captioning engagements parallels the interpretation column on glossary depth and termbase carry-forward, adds a substantive layer for source-audio quality assessment and turnaround scoping, and locks the deliverable form, accuracy floor, AI-disclosure policy, and final format at engagement-letter signing.

T−14 days

Scoping + briefing intake

For pre-recorded content: source media intake, deliverable form confirmed, accuracy floor documented, AI-disclosure policy locked. For live CART: agenda, participant bios, briefing materials. NDA from first email applies throughout.

T−10 days

Simultaneous-grade glossary build

Bilingual termbase drafted from briefing materials. For live CART: simultaneous-grade depth, same as Mode 01 booth simultaneous. For pre-recorded transcription / captioning: form-aligned glossary depth, sectoral terminology, named-entity index.

T−5 days

Glossary client review · locked

Termbase circulated for client preference confirmation. House-preferred terminology, regulatory choices, named-entity spellings. Locked at T-5. For multi-engagement clients on recurring transcription cycle, prior-engagement termbase carries forward.

T−2 days

Tech check · audio source · platform

For live CART: hub or qualified-home tech check per Remote setup discipline. For pre-recorded transcription: source-audio QC pre-pass, format compatibility confirmed, delivery platform verified.

T+0 −60 min

Live CART · final tech check

For live CART only: 60-minute pre-engagement final check — platform login, paired captioner coordination on rotation, last-mile glossary review, client briefing absorbed. For pre-recorded: stage 01 audio intake.

T+0 delivery

Active production

Pre-recorded: pipeline stages 01–06 sequenced under engagement-letter timeline. Live CART: real-time paired delivery under AIIC working-time discipline. Chain-of-custody log maintained throughout.

Section 07 · 6-phase methodology

Same six phases. Pipeline layer at Phase 05.

Othello operates transcription and captioning under the same 6-phase methodology applied across the interpretation modes, with substantive variation at Phase 03 (deliverable form confirmation replaces equipment site visit) and Phase 05 (the 6-stage production pipeline runs inside Phase 05 delivery).

Phase 01 · At scoping

Form selection + accuracy floor

Deliverable form confirmed against use case — see Section 03 five forms. Accuracy floor documented. AI-disclosure policy locked. Final delivery format specified (DOCX / SRT / WebVTT / TTML / EBU-TT / JSON).

Phase 02 · T−10 to T−5 days

Glossary build + termbase carry-forward

Bilingual termbase drafted from briefing materials at form-aligned depth. Prior-engagement termbase carry-forward for recurring institutional clients on multi-engagement cycle. Locked at T-5.

Phase 03 · T−5 to T−2 days

Bench assignment + source QC

Named bench assigned per deliverable form (verbatim specialists for legal · captioning specialists for accessibility · CART specialists for live · translation specialists for subtitles). Source-audio QC pre-pass for pre-recorded content.

Phase 04 · T−1 day

Final readiness

Last-mile briefing absorbed · glossary updates integrated · for live CART, paired captioner coordination on rotation and hand-over signalling · for pre-recorded, pipeline kick-off scheduled.

Phase 05 · T+0 delivery

Six-stage production pipeline

For pre-recorded: pipeline stages 01–06 sequenced. For live CART: real-time paired delivery under AIIC discipline with chain-of-custody log maintained. See Section 04 pipeline for stage detail.

Phase 06 · Post-delivery

Termbase carry-forward · QC log retained

Updated bilingual termbase committed under engagement-letter confidentiality. Chain-of-custody and QC log retained as operational continuity asset for recurring institutional engagement (quarterly board cycle · annual AGM · monthly audit committee · ongoing video content channel).

Section 08 · Use cases by sector

Where transcription & captioning actually deploys.

The use cases below map to engagement types substantively present in Othello’s institutional client roster — SET-listed Thai corporates, international Big Law arbitration teams, ESG advisory clients, capacity-building programmes, and US/UK/EU government bilateral. Each use case anchors a specific deliverable form and accuracy floor.

Use case 01 · Court & arbitration

Verbatim transcripts · TAI · THAC · ICC · SIAC

Verbatim transcription of arbitration hearings at TAI · THAC · ICC · SIAC, deposition records, witness sessions, and pre-arbitration counsel proceedings. ≥99.5% accuracy floor, 2-pass bench review, speaker attribution, certified-transcript PDF on counsel request. Mode-cross-reference: pairs with Mode 03 Court & Legal interpretation and Mode 04 Remote for hybrid hearings.

TAI · THAC ICC · SIAC Deposition Verbatim ≥99.5%

Use case 02 · AGM & board

Cleaned transcripts · SET-listed AGM · board minutes

Cleaned transcripts of SET-listed Thai issuer AGMs, board meetings, audit committee deliberations, and risk committee sessions — readable speaker-intent rendering for minutes preparation and corporate governance archive. ≥98% substantive accuracy, 1-pass bench review, agenda-section markers, speaker attribution preserved. Recurring engagement pattern: quarterly board cycle, annual AGM, monthly audit committee.

SET-listed AGM Board minutes Audit committee Quarterly cycle

Use case 03 · Investor relations video

Closed captions · SET issuer IR video

Closed-caption delivery for SET-listed issuer investor relations video archive — earnings call video, post-earnings briefing, ESG investor day video, IFRS S2 sustainability disclosure video. WCAG 2.2 conformance for global investor accessibility. Often paired with translated subtitles for cross-language IR deliverables (Thai source with English subs for global investor base).

SET IR video · Earnings call · ESG investor day · IFRS S2

Use case 04 · Cross-language video

Translated subtitles · multinational comms

Translated subtitles for cross-language video deliverables — multinational corporate global town-hall video, Thai-language CEO message to global workforce, multinational training video localisation, government bilateral video communications. ISO 17100 translation lockstep combined with subtitling timing discipline. Multi-track subtitle delivery for broader linguistic distribution.

Global town hall · Training localisation · ISO 17100 · Multi-track

Use case 05 · Hybrid AGM CART

Live CART · hybrid AGM accessibility

Live real-time captioning at hybrid AGM events, meeting the accessibility floor set by WCAG Success Criterion 1.2.4 (Captions (Live)) for live audio in synchronized media. Paired CART captioners deliver simultaneous-grade text rendering with rotation at agenda transitions. Often co-deployed with Mode 01 booth simultaneous or Mode 04 RSI, with the coordinated bench sharing briefing and termbase.

Hybrid AGM · WCAG 1.2.4 live · Paired CART · Co-deployed interp

Use case 06 · Regulatory interviews

Verbatim · Thai SEC · BoT · MFA

Verbatim transcription of regulatory interviews and proceedings — Thai SEC investigation interviews, Bank of Thailand regulatory bilateral, MFA Saranrom diplomatic record, SET disciplinary proceedings, ministerial bilateral working sessions. Workflow runs human-only (no ASR first-pass) under engagement-letter privilege regime. ≥99.5% accuracy floor with regulatory-content-specific terminology lockstep.

Thai SEC · Bank of Thailand · MFA Saranrom · Human-only

Use case 07 · ESG assurance

Cleaned transcripts · AA1000AS · ISO 14064 interviews

Transcription of ESG assurance interviews under AA1000AS, ISO 14064 climate inventory verification interviews, and CSDDD supply-chain due diligence interviews. Substantive content overlay — climate accounting (Panit-Thitaree pairing), sustainability reporting (Kittichai cross-anchor). Cross-engagement termbase continuity with the firm’s ESG advisory column deliverables.

AA1000AS · ISO 14064 · CSDDD interviews · ESG termbase

Use case 08 · Training accessibility

Captions + subtitles · capacity-building

Closed-caption and translated-subtitle delivery for capacity-building programme video: GIZ programme deliverables, UK PACT training materials, UN Women programme video, World Bank technical assistance video, US State Department programme video. WCAG 2.2 accessibility conformance plus ISO 17100 translation lockstep for cross-language distribution. Often a programme-deliverable specification in donor procurement.

GIZ programme · UK PACT training · UN Women · World Bank TA

Section 09 · Institutional FAQ

Procurement-grade questions answered.

Substantive answers to the questions procurement evaluation panels, in-house counsel, IR teams, and accessibility programme managers ask when scoping transcription and captioning at institutional tier.

Q.01 Which of the five forms do we need: verbatim, cleaned, captions, subtitles, or live CART?

Form selection follows from the substantive use case. Verbatim transcription for court / arbitration / regulatory record where every utterance matters on the record. Cleaned transcript for board minutes / AGM record / audit committee / IR archive where readable speaker intent matters more than fidelity to hesitations. Closed captions for pre-recorded video accessibility under WCAG 2.2. Translated subtitles for cross-language video distribution. Live CART for real-time event accessibility at AGMs, conferences, training.

The scoping question that resolves form selection: “what is the downstream use of this deliverable?” If the deliverable is part of a court record or arbitration tribunal submission, verbatim. If it’s for internal minutes or an external IR archive, cleaned. If it’s a video on the company website or LMS, closed captions. If it’s a video crossing language boundaries, translated subtitles. If it’s a live event with an accessibility floor, CART. See Section 03 for full form detail.
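For teams that log scoping decisions in an intake tool, the rule above reduces to a lookup from downstream use to deliverable form. The following is a minimal sketch only; `DownstreamUse` and `select_form` are hypothetical names chosen for illustration and do not describe Othello tooling.

```python
from enum import Enum


class DownstreamUse(Enum):
    """Hypothetical labels for the downstream uses named in the answer above."""
    COURT_OR_TRIBUNAL_RECORD = "court_or_tribunal_record"
    MINUTES_OR_IR_ARCHIVE = "minutes_or_ir_archive"
    WEBSITE_OR_LMS_VIDEO = "website_or_lms_video"
    CROSS_LANGUAGE_VIDEO = "cross_language_video"
    LIVE_EVENT_ACCESSIBILITY = "live_event_accessibility"


# Mapping copied from the scoping answer; the keys are illustrative labels.
FORM_BY_USE = {
    DownstreamUse.COURT_OR_TRIBUNAL_RECORD: "verbatim transcription",
    DownstreamUse.MINUTES_OR_IR_ARCHIVE: "cleaned transcript",
    DownstreamUse.WEBSITE_OR_LMS_VIDEO: "closed captions",
    DownstreamUse.CROSS_LANGUAGE_VIDEO: "translated subtitles",
    DownstreamUse.LIVE_EVENT_ACCESSIBILITY: "live CART",
}


def select_form(use: DownstreamUse) -> str:
    """Return the deliverable form that matches a downstream use."""
    return FORM_BY_USE[use]
```

A single recording can have more than one downstream use (for example an AGM video that needs both a cleaned transcript and closed captions), in which case the lookup is simply applied once per use.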

Q.02 Where exactly do you use AI / ASR, and where don’t you?

The engagement letter discloses this stage-by-stage. ASR (automatic speech recognition) is permitted as a first-pass working draft for bench review on non-confidential general content — public conferences, marketing communications, already-public training material, client-permitted general content. ASR is skipped entirely for confidential, privileged, or regulated content: counsel-engaged matters (arbitration, deposition, witness preparation, M&A under privilege), regulatory content (SEC investigation interviews, BoT bilateral, MFA Saranrom diplomatic record, SET disciplinary), board and audit recordings, restricted ESG content pre-disclosure.

Where ASR first-pass is used, named in-house bench reviews, corrects, and signs off on every deliverable before final delivery. Raw AI output is never the final product at Othello. The substantive procurement test: ask the vendor where AI is used, and require the answer in the engagement letter. Vendors that decline to specify are not operating at institutional tier. See Section 05 for the AI-discipline framework.
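The disclosure rule in this answer can be expressed as a small routing check, which is also a convenient shape for the clause a procurement team might ask a vendor to put in writing. This is a sketch under stated assumptions: the `Engagement` fields and function names are hypothetical, and the real determination is made in the engagement letter, not in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engagement:
    """Hypothetical content flags for one engagement (illustrative only)."""
    confidential: bool
    privileged: bool
    regulated: bool
    client_permits_asr: bool  # assumed to be recorded in the engagement letter


def asr_first_pass_allowed(engagement: Engagement) -> bool:
    """ASR is skipped entirely for confidential, privileged, or regulated content."""
    if engagement.confidential or engagement.privileged or engagement.regulated:
        return False
    return engagement.client_permits_asr


def workflow(engagement: Engagement) -> str:
    """Describe the workflow; bench review and sign-off apply on every path."""
    if asr_first_pass_allowed(engagement):
        return "ASR working draft, then bench review and sign-off"
    return "human-only transcription, then bench review and sign-off"
```

Note that neither branch returns raw machine output as a deliverable: the only thing the flags change is whether a machine draft exists at all.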

Q.03 What accuracy floor do you guarantee, and how is it measured?

Accuracy floors are documented in the engagement letter by deliverable form: verbatim ≥99.5%, cleaned ≥98% substantive content, closed captions WCAG 2.2 conformance, translated subtitles ISO 17100 lockstep, live CART ≥96% real-time floor. These are contractual commitments, not aspirational targets, and are verifiable by client-side sampling QC at handover under engagement-letter terms.

Measurement is on a per-word substantive accuracy basis for transcripts, against WCAG 2.2 conformance criteria for captions, and against ISO 17100 revision discipline for translated subtitles. Word Error Rate (WER), the number of substituted, deleted, and inserted words divided by the number of words in the reference transcript, is the industry-standard measurement for transcription accuracy. Othello commits to the documented floor and supports client-side WER audit at handover. For live CART, the ≥96% real-time floor is calibrated to NCRA (National Court Reporters Association) live captioning practice. See Section 05 accuracy table.
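For a client-side sampling audit, WER can be computed with a standard word-level edit distance. The sketch below is illustrative and makes two assumptions that are not stated in the engagement terms: accuracy is taken as 1 minus WER, and words are split on whitespace. Whitespace splitting is adequate for English but not for Thai, which would need a word-segmentation step first. The function names are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def meets_floor(reference: str, hypothesis: str, floor: float = 0.995) -> bool:
    """Assumes accuracy = 1 - WER; the default floor is the verbatim figure."""
    return 1.0 - word_error_rate(reference, hypothesis) >= floor
```

As a worked check, comparing the reference "the board approved the dividend" with the transcript "the board approved a dividend" gives one substitution over five reference words, so WER is 0.2 and the sample would fail a 99.5% floor. In practice the sample needs to be long enough for the floor to be meaningful: a single error in 200 words already sits exactly at 99.5%.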

Q.04 How does confidentiality work, and what about platform-side recording for live CART?

NDA from first email applies throughout — same discipline as the interpretation and translation columns. For pre-recorded transcription, the engagement letter covers source-media chain-of-custody, AI-disclosure stage-by-stage, named-bench identifiers, and client-side audit access. For confidential / privileged content, ASR is skipped entirely (Section 05) — no third-party AI tool ingestion of source audio.

For live CART at events, platform-side recording controls are confirmed before the engagement opens — where the CART text stream is recorded, who has access, what happens to it after the event. For hybrid AGMs with WCAG 1.2.4 live caption requirement, the engagement letter typically specifies that the live CART text stream is delivered to the platform but is not retained by Othello post-event unless the client requests it as a deliverable. Counsel privilege regime applies where the CART is at a counsel-engaged matter (rare for CART but possible for arbitration with accessibility needs).

Q.05 Can you deliver in our specific file format (SRT, WebVTT, TTML, EBU-TT, JSON)?

Yes — pipeline stage 06 delivers in the client format-of-record. Standard caption file formats: SRT (universal subtitle, broadest compatibility), WebVTT (W3C HTML5 video, WCAG 2.2 web default), TTML (W3C/SMPTE advanced positioning and styling), EBU-TT (European Broadcasting Union broadcast format), SCC and STL (US broadcast / EBU subtitling legacy). Transcript formats: DOCX with formatting (speaker attribution, timestamps, agenda markers), PDF certified for court-record use.

Platform-integrated formats: JSON for direct platform API integration (LMS, CMS, video platform APIs). For live CART: real-time text stream delivered via StreamText, 1CapApp, Zoom captioning channel, MS Teams live captions API, Webex captions, or client-platform integration. The engagement letter specifies which format(s) the deliverable arrives in — multiple format delivery is supported at no operational overhead beyond the format-conversion stage.

Q.06 How does pricing work: per word, per minute of audio, or per hour of bench?

Engagement-letter basis — not consumer-grade per-minute auto-pricing. Cost drivers: deliverable form (verbatim is denser than cleaned; CART carries simultaneous-grade discipline), accuracy floor (≥99.5% requires 2-pass bench review), source audio quality (noisy or multi-speaker audio takes longer), language pair (Thai ↔ English at institutional tier vs general English-only), AI-disclosure policy (human-only is denser than ASR-assisted), delivery turnaround, and volume / continuity (recurring engagement cycles attract continuity pricing).

Standard turnaround: pre-recorded transcription at 3–5 business days from intake for verbatim, 2–3 business days for cleaned, captions and subtitles on parallel timeline. Expedited (next-business-day) and rush (same-day) available at engagement-letter terms. Live CART is scheduled per engagement. Volume discounts on recurring institutional engagement — quarterly board cycle, annual AGM, monthly audit committee, ongoing video content channel. See How We Quote for the substantive cost framework.

Q.07 Do you certify transcripts for court or arbitration use?

Yes — certified transcripts available for court and arbitration submission. The certification is a signed declaration by the named bench transcriber, attesting under engagement-letter discipline that the transcript is a true and accurate verbatim record of the source audio to the documented ≥99.5% accuracy floor, with 2-pass bench review applied, under chain-of-custody log retained for the engagement record. Procurement-grade certification, suitable for tribunal submission and counsel use.

For Thai court hearings (Civil Court of Bangkok, Central IP & IT Court, Central Labour Court, Central Tax Court, Court of Appeal, Supreme Court / Sandika), the verbatim transcript is typically delivered alongside separately certified interpreter testimony under Mode 03 Court & Legal. For arbitration tribunals (TAI, THAC, ICC, SIAC), the certified verbatim transcript supports the tribunal’s procedural order on record-keeping. Thailand has no national sworn-interpreter or sworn-transcriber registry; Othello’s certification is on engagement-letter basis with named bench attestation, aligned with tribunal procedural requirements.

Q.08 How do you handle Thai-language audio specifically, and what are the operational challenges?

Thai-language transcription has substantive operational challenges that generic transcription vendors often handle poorly. Thai script does not mark word boundaries with spaces, so segmentation is contextual, and ASR systems trained on Western languages frequently mis-segment. Thai tonal phonology, with five lexical tones, means homophone disambiguation depends on context that surface-level ASR misses. Code-switching between Thai and English is pervasive in institutional Bangkok content (board meetings, AGMs, regulatory bilateral) and breaks single-language ASR pipelines.

Othello’s bench handles Thai-language audio with native-Thai bench review at every stage of the pipeline. For ASR-assisted workflows (where permitted), the first-pass machine output is treated as a starting point only, not a credible draft — substantive correction at stage 03 is operationally mandatory. For pure-Thai or heavily code-switched audio, the workflow runs human-only across all stages; ASR first-pass adds friction rather than acceleration. The engagement letter specifies which workflow applies. Bilingual native-Thai-and-native-English bench is the substantive operational answer to Thai-language audio at institutional tier.

Q.09 How does this integrate with the broader Othello engagement?

Transcription and captioning sits in the interpretation column alongside the five interpretation sub-modes (Simultaneous, Consecutive, Court & Legal, Remote, Whispered), operating under the same NDA discipline, named in-house bench, and engagement-letter accountability. The bench that transcribes is the bench that interprets — termbase continuity is operationally built in.

Cross-column continuity: a SET-listed issuer client engaging Othello for quarterly investor day RSI interpretation + earnings call video captions + 56-1 One Report translation + IFRS S2 sustainability disclosure translation works from one termbase, one bench, one engagement-letter framework. Transcription and captioning amplifies the cross-column continuity rather than fragmenting it. Recurring institutional engagement on multi-column scope reduces marginal cost per deliverable and improves terminological consistency across the client’s external-facing language stack.

Q.10 Can you accept our standing platform contract (Zoom recording, MS Teams, Webex, video CMS)?

Yes — Othello operates platform-agnostically across Zoom recordings, MS Teams meeting recordings, Cisco Webex recordings, Google Meet recordings, dedicated event platforms (Interprefy, KUDO, Boostlingo Events), and client-side video CMS / LMS integrations. Source media intake at pipeline stage 01 handles any modern audio or video format. Delivery at stage 06 integrates with the client’s video CMS, learning management system, or platform API where the engagement letter specifies platform-integrated deliverable.

For live CART, the platform-side text-stream integration is confirmed pre-engagement: Zoom captioning channel, MS Teams live captions API, Webex captions integration, StreamText for general-platform delivery, 1CapApp for accessibility-focused events. Where the client has a non-standard platform, the technical check at T-2 days (Phase 04) confirms operational compatibility. See Contact Pathway 02 — Pre-RFP Scoping for a 30-minute scoping call to discuss platform integration.

Scope a transcription or captioning engagement

Pick a form. Lock the floor.

Transcription and captioning engagements scope most efficiently when the deliverable form, accuracy floor, AI-disclosure policy, and final delivery format are specified at engagement-letter signing. For recurring institutional cycles (quarterly board, monthly audit committee, ongoing video channel), continuity terms are negotiated upfront. NDA from first email applies throughout.

Pathway 01

RFP / Institutional Procurement

10-component capability brief under mutual NDA. Response 3–5 business days.

Pathway 02

Pre-RFP Scoping Call

30-minute scoping with named bench input. Call scheduled within 2 business days of NDA.

Pathway 03

Procurement Reference Request

Direct contact with reference contacts at named institutions under mutual NDA.

Pathway 04

Media / Careers / Client Support

Press, careers, existing client support per engagement-letter SLA.
