From 2bd6ee1567bf3fd386d07c0ecd569c93c500d49e Mon Sep 17 00:00:00 2001
From: admin
Date: Fri, 19 Jun 2026 17:46:54 +0000
Subject: [PATCH] Update README with new features (MCP API, watch-folder, improved summaries, DOCX styling, cover pages)

---
 README.md | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 84 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index cdf068d..4c0c4fc 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,8 @@ ScrAIbe is a transcription and summarization service that:
 - Provides:
   - A web GUI for uploading audio and receiving transcripts via email.
   - A CLI and Python API for direct integration.
+  - An MCP-style HTTP API (OpenAPI) for LLMs and external systems.
+  - A watch-folder mode for automatic transcription, summarization, and email delivery.
 
 No local speech models or heavy dependencies are required. ScrAIbe is designed as a thin client in front of your own AI services.
 
@@ -24,7 +26,8 @@ For more information: https://apstrom.ca
     - Key decisions and outcomes
     - Action items and responsibilities
     - Open issues and risks
-- Async web GUI:
+  - Improved, configurable summary prompts (via environment or file).
+- Async web GUI (always enabled):
   - Upload audio via browser.
   - Jobs are queued and processed in the background (Celery + Redis).
   - Emails:
@@ -32,13 +35,32 @@ For more information: https://apstrom.ca
     - Final transcript (MD + DOCX + JSON) when ready.
     - Summary as MD + DOCX (if requested).
     - Error notification if processing fails.
+- MCP-style HTTP API (optional):
+  - Exposes an OpenAPI-compliant REST endpoint for external LLMs or services.
+  - Allows:
+    - Audio upload for transcription.
+    - Job status checks.
+    - Retrieval of transcript JSON (no summary).
+  - Enabled via MCP_SERVER_ENABLED=true.
+- Watch-folder mode (optional):
+  - Monitors a directory for audio files.
+  - For each file:
+    - Transcribes and summarizes.
+    - Emails transcript + summary + JSON to a configured address.
+    - Deletes the source file after successful processing (configurable).
+  - Enabled via WATCH_ENABLED=true.
 - File formats:
-  - Transcript: .md and .docx (line-numbered, no cover page)
-  - Summary (if requested): .md and .docx (no line numbering, no cover page)
+  - Transcript:
+    - .md
+    - .docx (line-numbered, 30 lines per page, optional cover page)
+  - Summary (if requested):
+    - .md
+    - .docx (markdown-aware WYSIWYG styling, optional cover page)
   - Full structured output: .json
 - Customizable branding:
   - Web GUI title, logo, and accent color via environment variables.
   - Email logo, accent color, and subject lines via environment variables.
+  - Optional cover pages for transcript and summary DOCX.
 - CLI and Python API:
   - Simple command-line interface.
   - Drop-in Scraibe class for integration into other tools.
@@ -58,7 +80,9 @@ For more information: https://apstrom.ca
   - Chunked summarization
   - Output formatting (e.g., .md with transcript + summary)
 - Runs:
-  - Web GUI (Gradio)
+  - Web GUI (Gradio) – always enabled
+  - MCP-style HTTP API (FastAPI) – optional
+  - Watch-folder mode – optional
   - Celery worker (async processing)
   - Redis (in-container by default)
 
@@ -209,6 +233,33 @@ Accent color (UI and emails):
   - Email headings, links, and email addresses
   - Default: #7C6DA0
 
+MCP-style HTTP API:
+
+- MCP_SERVER_ENABLED:
+  - Enable MCP-style HTTP API (default: false).
+  - Values: true/false.
+- MCP_SERVER_HOST:
+  - Bind address (default: 0.0.0.0).
+- MCP_SERVER_PORT:
+  - Port (default: 8000).
+- MCP_USE_CELERY:
+  - Use Celery for async transcription (default: true).
+  - If false, transcription runs in-process.
+
+Watch-folder mode:
+
+- WATCH_ENABLED:
+  - Enable watch-folder mode (default: false).
+  - Values: true/false.
+- WATCH_DIR:
+  - Directory to monitor for audio files (required if WATCH_ENABLED=true).
+- WATCH_EMAIL_TO:
+  - Email address to send transcript and summary (required if WATCH_ENABLED=true).
+- WATCH_POLL_INTERVAL:
+  - Seconds between scans (default: 10).
+- WATCH_DELETE_ON_SUCCESS:
+  - Delete source file after successful processing (default: true).
+
 Async processing (Celery + Redis):
 
 - CELERY_BROKER_URL:
@@ -253,16 +304,40 @@ Email subject lines (customizable):
   - Subject for error notification email.
   - Default: "ScrAIbe: Error with your transcription request"
 
-Output files (async web GUI):
+Summary prompt customization:
+
+- SUMMARY_PROMPT_CHUNK:
+  - Override prompt used for each transcript chunk.
+- SUMMARY_PROMPT_COMBINED:
+  - Override prompt used for the final combined summary.
+- SUMMARY_PROMPT_FILE:
+  - Path to a file with prompts in sections:
+    - [chunk]
+    - [combined]
+
+DOCX and cover pages:
+
+- COVER_PAGE_ENABLED:
+  - Add a cover page to transcript and summary DOCX files (default: false).
+- COVER_PAGE_ORGANIZATION:
+  - Organization name shown on the cover page.
+- COVER_PAGE_TITLE_PREFIX:
+  - Title prefix (e.g., "TRANSCRIPT" or "SUMMARY").
+- COVER_PAGE_LOGO_URL:
+  - Logo URL to include on the cover page.
+- COVER_PAGE_LOGO_PATH:
+  - Local logo path to include on the cover page.
+
+Output files (async web GUI and watch-folder mode):
 
 When a job completes, the user receives:
 
 - Transcript:
   - .md file
-  - .docx file (line-numbered, no cover page)
+  - .docx file (line-numbered, 30 lines per page, optional cover page)
 - Summary (if requested):
   - .md file
-  - .docx file (no line numbering, no cover page)
+  - .docx file (markdown-aware styling, optional cover page)
 - JSON:
   - Structured transcript with diarization and metadata
 
@@ -280,6 +355,8 @@ Core runtime dependencies:
 - celery[redis]
 - redis
 - python-docx
+- fastapi
+- uvicorn
 - ffmpeg (for audio preprocessing)
 
 No local Whisper, PyTorch, or Pyannote models are required.