MCP Summary Server

An MCP (Model Context Protocol) server for document summarization that keeps full text out of the chat context window.

Features

Automatically chooses direct or chunked summarization based on text length (see MAX_DIRECT_TEXT_LENGTH)
All processing happens server-side
Returns only the summary to the client
Configurable chunking parameters
Bearer token authentication (optional)

Setup

Environment Variables

Copy .env.example to .env and configure:

cp .env.example .env

Variable	Default	Description
PORT	8080	HTTP server port
API_KEY	(empty)	Bearer token for authentication
OPENAPI_URL	http://localhost:8080/v1	LLM API endpoint
OPENAPI_API_KEY	(empty)	LLM API key
MODEL_NAME	gpt-4o	LLM model to use
CHUNK_SIZE	4000	Characters per chunk
OVERLAP	200	Characters of overlap between chunks
TARGET_INTERMEDIATE_SUMMARY_LENGTH	150	Words per chunk summary
MAX_DIRECT_SUMMARY_LENGTH	100	Max final summary length in words
MAX_DIRECT_TEXT_LENGTH	8000	Max text length in characters before chunking
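To make the chunking variables concrete, here is a minimal sketch of how CHUNK_SIZE and OVERLAP could interact. It is not taken from mcp_summary_server.py: the names `Settings`, `load_settings`, and `chunk_text` are hypothetical, and the sketch assumes that chunks are plain character slices and that each chunk starts CHUNK_SIZE minus OVERLAP characters after the previous one.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the table above; the class itself is hypothetical.
    chunk_size: int = 4000
    overlap: int = 200
    max_direct_text_length: int = 8000


def load_settings(env=os.environ) -> Settings:
    """Read the chunking variables from the environment, falling back to defaults."""
    return Settings(
        chunk_size=int(env.get("CHUNK_SIZE", 4000)),
        overlap=int(env.get("OVERLAP", 200)),
        max_direct_text_length=int(env.get("MAX_DIRECT_TEXT_LENGTH", 8000)),
    )


def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into character chunks that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # distance between consecutive chunk starts
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks
```

Under these assumptions and the default settings, a 10,000-character document yields three chunks, and the last 200 characters of each chunk repeat at the start of the next.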

Running

Docker

# Build
docker build -t mcp-summary .

# Run with environment file
docker run -p 8080:8080 --env-file .env mcp-summary

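# Run with inline environment variables
# (inside the container, localhost is the container itself, so point OPENAPI_URL at a host the container can reach)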
docker run -p 8080:8080 \
  -e OPENAPI_URL=http://localhost:8080/v1 \
  -e OPENAPI_API_KEY=your-key \
  -e MODEL_NAME=gpt-4o \
  mcp-summary

Python

pip install -r requirements.txt
python mcp_summary_server.py

MCP Tool

summarize_document

Summarizes a document, automatically handling chunking for long text.

Parameters:

text (string, required): The document text to summarize
max_length (integer, optional): Maximum summary length in words (default: 100)
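As an illustration of how these parameters travel over the wire, the sketch below builds the JSON-RPC 2.0 message that MCP uses for tool calls (`tools/call`) and an optional bearer header. The helper names `build_tool_call` and `build_headers` are hypothetical. This README does not state the server's transport or endpoint path, so the sketch stops short of sending the request, and in practice an MCP client library would also perform the initialization handshake first.

```python
import json


def build_tool_call(text: str, max_length: int = 100, request_id: int = 1) -> dict:
    """Build an MCP tools/call request for summarize_document."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "summarize_document",
            "arguments": {"text": text, "max_length": max_length},
        },
    }


def build_headers(api_key: str = "") -> dict:
    """Add a bearer token only when API_KEY is set (authentication is optional)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return headers


# Serialized request body, ready to hand to whatever HTTP client is in use.
body = json.dumps(build_tool_call("Long document text...", max_length=80))
```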

Returns:

{
  "summary": "The summarized text...",
  "original_length": 12345,
  "method": "direct",
  "chunks": 1
}

method is either "direct" or "chunked"; chunks is the number of chunks used.
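The sketch below shows one way the fields of this result could be populated. It is an assumption-laden outline, not the server's actual code: the function signature, the `llm` callable (any function that maps a prompt string to a completion string), and the prompt wording are all hypothetical, and it assumes that original_length counts characters and that chunked mode summarizes each chunk before merging the partial summaries.

```python
from typing import Callable


def summarize_document(
    text: str,
    llm: Callable[[str], str],          # hypothetical: prompt in, completion out
    max_length: int = 100,              # words in the final summary
    chunk_size: int = 4000,             # CHUNK_SIZE
    overlap: int = 200,                 # OVERLAP
    intermediate_length: int = 150,     # TARGET_INTERMEDIATE_SUMMARY_LENGTH
    max_direct_text_length: int = 8000  # MAX_DIRECT_TEXT_LENGTH
) -> dict:
    # Short documents go to the model in a single call.
    if len(text) <= max_direct_text_length:
        summary = llm(f"Summarize in at most {max_length} words:\n\n{text}")
        return {"summary": summary, "original_length": len(text),
                "method": "direct", "chunks": 1}

    # Long documents: summarize overlapping chunks, then merge the partials.
    step = chunk_size - overlap
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
    partials = [
        llm(f"Summarize in about {intermediate_length} words:\n\n{chunk}")
        for chunk in chunks
    ]
    combined = "\n\n".join(partials)
    summary = llm(
        f"Combine these partial summaries into one summary of at most "
        f"{max_length} words:\n\n{combined}"
    )
    return {"summary": summary, "original_length": len(text),
            "method": "chunked", "chunks": len(chunks)}
```

In this outline only the summary and a few counters are returned, which matches the stated goal of keeping the full text out of the chat context.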
