`AutoTranscript`: Fully Automated Transcription using AI

AutoTranscript is a PyTorch-based speech-to-text tool for generating fully automated transcriptions. It combines two AI models, one for speech recognition and one for speaker diarization:

whisper: A general-purpose speech recognition model.
pyannote-audio: An open-source toolkit for speaker diarization.

AutoTranscript can be used as a command-line interface, as a web server, or through its Python API.

Install `AutoTranscript`

The following command will pull and install the latest commit from this repository, along with its Python dependencies.

pip install git+https://github.com/JSchmie/autotranscript.git

Python version: Python 3.9
PyTorch version: PyTorch 1.11.0
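
To compare a local environment against the versions listed above, a small check along these lines may help. It is a minimal sketch: the function name `report_versions` is hypothetical and not part of AutoTranscript, and it assumes PyTorch is installed under the distribution name `torch`.

```python
import sys
from importlib import metadata


def report_versions():
    """Return the running Python version and the installed PyTorch version.

    The PyTorch entry is None when no distribution named "torch" is installed.
    """
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    try:
        # Reads package metadata only; PyTorch itself is not imported.
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = None
    return {"python": python_version, "torch": torch_version}


print(report_versions())
```

The check reads package metadata instead of importing `torch`, so it stays fast and also works when PyTorch is missing.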

Usage examples

Python usage

from autotranscript import AutoTranscribe

model = AutoTranscribe()

text = model.transcribe("audio.wav")

print(f"Transcription: \n{text}")
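
When several recordings need to be transcribed, the call shown above can be wrapped in a loop. The following is a minimal sketch under stated assumptions: `transcribe_folder` is a hypothetical helper, not part of AutoTranscript, and it accepts any callable that takes a file path and returns something printable, so it can be tried without the package installed.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def transcribe_folder(
    transcribe: Callable[[str], object],
    folder: str,
    patterns: Iterable[str] = ("*.wav",),
) -> Dict[str, str]:
    """Apply `transcribe` to every matching file in `folder`.

    Returns a mapping from file name to the transcription as a string.
    Files are processed in alphabetical order within each pattern.
    """
    results: Dict[str, str] = {}
    for pattern in patterns:
        for audio_path in sorted(Path(folder).glob(pattern)):
            # The callable receives the path as a plain string,
            # mirroring model.transcribe("audio.wav") above.
            results[audio_path.name] = str(transcribe(str(audio_path)))
    return results
```

With AutoTranscript installed, passing `model.transcribe` from the example above as the first argument should fit this helper, assuming the method accepts a file path as shown there.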

Command-line usage

If you prefer not to drive the transcription from Python, you can also use the command line:

autotranscript audio.wav

Run the following to view all available options:

autotranscript -h

Documentation usage

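To browse the documentation locally, run the following command from the docs/_build/html directory, then open http://localhost:8000 in a browser (port 8000 is the default of Python's built-in server):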

python -m http.server
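
The same server can also be started from Python without changing directories first. This is a minimal sketch using only the standard library; the helper name `make_docs_server` is hypothetical, and the default path assumes the built documentation lives in docs/_build/html relative to the current working directory.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


def make_docs_server(directory="docs/_build/html", port=8000):
    """Create (but do not start) an HTTP server for the built documentation."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"Documentation directory not found: {root}")
    # Bind the handler to the docs directory instead of the working directory.
    handler = partial(SimpleHTTPRequestHandler, directory=str(root))
    # Listen on localhost only; pass port=0 to let the OS pick a free port.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Example (blocks until interrupted with Ctrl+C):
# server = make_docs_server()
# server.serve_forever()
```

Binding to 127.0.0.1 keeps the documentation reachable from the local machine only, which is usually what is wanted for a quick preview.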

Roadmap

Model quantization
Model fine-tuning
Implementation of LLMs
Executable for Windows

Contact

For queries, contact Jacob Schmieder at Jacob.Schmieder@dbfz.de.

License

Acknowledgments

Special thanks go to the colleagues of the KIDA project - especially the teams in I5 and I2 - and the BMEL (Bundesministerium für Ernährung und Landwirtschaft).
