MCP Summary Server

An MCP (Model Context Protocol) server for document summarization that keeps full text out of the chat context window.

Features

  • Automatically determines whether to summarize directly or use chunked summarization
  • All processing happens server-side
  • Returns only the summary to the client
  • Configurable chunking parameters
  • Bearer token authentication (optional)
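The direct-versus-chunked decision described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the function name `summarize_flow`, the `summarize` callback, the `<=` threshold comparison, and the way partial summaries are joined are all assumptions. The default values mirror the documented environment variable defaults.

```python
from typing import Callable, Dict, Union


def summarize_flow(
    text: str,
    summarize: Callable[[str, int], str],  # hypothetical stand-in for the LLM call
    max_length: int = 100,                 # MAX_DIRECT_SUMMARY_LENGTH default
    max_direct_text_length: int = 8000,    # MAX_DIRECT_TEXT_LENGTH default
    chunk_size: int = 4000,                # CHUNK_SIZE default
    overlap: int = 200,                    # OVERLAP default
    intermediate_length: int = 150,        # TARGET_INTERMEDIATE_SUMMARY_LENGTH default
) -> Dict[str, Union[str, int]]:
    """Summarize directly when the text is short, otherwise chunk first."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    # Short input: a single summarization call (threshold comparison is assumed).
    if len(text) <= max_direct_text_length:
        return {
            "summary": summarize(text, max_length),
            "original_length": len(text),
            "method": "direct",
            "chunks": 1,
        }

    # Long input: summarize overlapping chunks, then summarize the summaries.
    step = chunk_size - overlap
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
    partials = [summarize(chunk, intermediate_length) for chunk in chunks]
    return {
        "summary": summarize("\n".join(partials), max_length),
        "original_length": len(text),
        "method": "chunked",
        "chunks": len(chunks),
    }


def fake_summarize(text: str, max_words: int) -> str:
    """Toy summarizer for local experiments: keeps the first max_words words."""
    return " ".join(text.split()[:max_words])


result = summarize_flow("word " * 3000, fake_summarize)
print(result["method"], result["chunks"])
```

Passing the summarizer in as a callback keeps the flow testable without an LLM endpoint; a real deployment would call the model configured through OPENAPI_URL and MODEL_NAME instead.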

Setup

Environment Variables

Copy .env.example to .env and configure:

cp .env.example .env

| Variable | Default | Description |
| --- | --- | --- |
| PORT | 8080 | HTTP server port |
| API_KEY | (empty) | Bearer token for authentication |
| OPENAPI_URL | http://localhost:8080/v1 | LLM API endpoint |
| OPENAPI_API_KEY | (empty) | LLM API key |
| MODEL_NAME | gpt-4o | LLM model to use |
| CHUNK_SIZE | 4000 | Characters per chunk |
| OVERLAP | 200 | Characters of overlap between chunks |
| TARGET_INTERMEDIATE_SUMMARY_LENGTH | 150 | Words per chunk summary |
| MAX_DIRECT_SUMMARY_LENGTH | 100 | Max final summary length |
| MAX_DIRECT_TEXT_LENGTH | 8000 | Max text length before chunking |
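To make the chunking parameters concrete, here is a small sketch of how the numeric settings might be read and how CHUNK_SIZE and OVERLAP translate into chunk boundaries. The helper names `load_settings` and `chunk_spans` are hypothetical, and the exact boundary arithmetic in the real server may differ.

```python
import os
from typing import List, Mapping, Optional, Tuple

# Documented defaults for the numeric chunking settings.
DEFAULTS = {
    "CHUNK_SIZE": 4000,
    "OVERLAP": 200,
    "TARGET_INTERMEDIATE_SUMMARY_LENGTH": 150,
    "MAX_DIRECT_SUMMARY_LENGTH": 100,
    "MAX_DIRECT_TEXT_LENGTH": 8000,
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read numeric settings from the environment, falling back to the defaults."""
    env = os.environ if env is None else env
    settings = {name: int(env.get(name, default)) for name, default in DEFAULTS.items()}
    if not 0 <= settings["OVERLAP"] < settings["CHUNK_SIZE"]:
        raise ValueError("OVERLAP must be non-negative and smaller than CHUNK_SIZE")
    return settings


def chunk_spans(length: int, chunk_size: int, overlap: int) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets; each chunk re-reads `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    spans = []
    start = 0
    while start < length:
        end = min(start + chunk_size, length)
        spans.append((start, end))
        if end == length:
            break
        start = end - overlap  # step back so neighbouring chunks share context
    return spans


settings = load_settings({})  # empty mapping: every value falls back to its default
print(chunk_spans(10_000, settings["CHUNK_SIZE"], settings["OVERLAP"]))
# [(0, 4000), (3800, 7800), (7600, 10000)]
```

With the defaults, a 10,000-character document exceeds MAX_DIRECT_TEXT_LENGTH and would be split into three chunks under this scheme.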

Running

Docker

# Build
docker build -t mcp-summary .

# Run with environment file
docker run -p 8080:8080 --env-file .env mcp-summary

# Run with inline environment variables
docker run -p 8080:8080 \
  -e OPENAPI_URL=http://localhost:8080/v1 \
  -e OPENAPI_API_KEY=your-key \
  -e MODEL_NAME=gpt-4o \
  mcp-summary

Python

pip install -r requirements.txt
python mcp_summary_server.py

MCP Tool

summarize_document

Summarizes a document, automatically handling chunking for long text.

Parameters:

  • text (string, required): The document text to summarize
  • max_length (integer, optional): Maximum summary length in words (default: 100)

Returns:

{
  "summary": "The summarized text...",
  "original_length": 12345,
  "method": "direct",
  "chunks": 1
}

  • method: either "direct" or "chunked"
  • chunks: number of chunks used
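For reference, a call to this tool at the protocol level is an MCP `tools/call` JSON-RPC request. The sketch below only builds the headers and body; the helper name `build_tool_call` is hypothetical, and how the request is transported (endpoint path, session handling) is not covered by this README. In practice an MCP client SDK would normally handle this, including the initialization handshake.

```python
import json
from typing import Dict, Tuple


def build_tool_call(
    text: str,
    max_length: int = 100,
    request_id: int = 1,
    api_key: str = "",
) -> Tuple[Dict[str, str], str]:
    """Build headers and a JSON-RPC body for an MCP tools/call request."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "summarize_document",
            "arguments": {"text": text, "max_length": max_length},
        },
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when the server is started with API_KEY set.
        headers["Authorization"] = f"Bearer {api_key}"
    return headers, json.dumps(body)


headers, payload = build_tool_call("Long document text...", max_length=50, api_key="your-key")
print(headers)
print(payload)
```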