admin/scribe

CristinaOrtizCruz 7653123d9c logos

2023-09-08 14:16:56 +02:00

| File | Last commit | Date |
|---|---|---|
| autotranscript | final codebase rework | 2023-08-24 16:12:28 +02:00 |
| autotranscript.egg-info | readme update, logos | 2023-09-08 13:52:09 +02:00 |
| docs | first draft docu with sphinx html | 2023-08-31 08:48:46 +02:00 |
| app.py | add webapp | 2023-06-30 18:46:33 +02:00 |
| BMEL_dark.png | readme update, logos | 2023-09-08 13:52:09 +02:00 |
| BMEL.jpg | readme update, logos | 2023-09-08 13:52:09 +02:00 |
| DBFZ_dark.png | readme update, logos | 2023-09-08 13:52:09 +02:00 |
| DBFZ.png | readme update, logos | 2023-09-08 13:52:09 +02:00 |
| environment.yml | updated dependency files | 2023-06-09 10:51:52 +02:00 |
| gradio_app.py | final codebase rework | 2023-08-24 16:12:28 +02:00 |
| kida_dark.png | readme update, logos | 2023-09-08 13:52:09 +02:00 |
| kida.png | readme update, logos | 2023-09-08 13:52:09 +02:00 |
| MRI.png | readme update, logos | 2023-09-08 13:52:09 +02:00 |
| README.md | logos | 2023-09-08 14:16:56 +02:00 |
| requirements.txt | requirements and readme update | 2023-09-01 12:11:34 +02:00 |
| setup.py | updated setup.py | 2023-07-07 12:57:47 +02:00 |
| test_autotranscript.py | added unittests | 2023-06-14 16:30:15 +02:00 |
| transcribe.py | final codebase rework | 2023-08-24 16:12:28 +02:00 |

README.md

`ScrAIbe: Streamlined Conversation Recording with Automated Intelligence Based Environment`

ScrAIbe is a PyTorch-based speech-to-text tool for generating fully automated transcriptions. It is installed and imported under the package name `autotranscript`, and combines the following AI models for speech recognition and speaker diarization:

whisper: A general-purpose speech recognition model.
pyannote-audio: An open-source toolkit for speaker diarization.

Install `ScrAIbe`:

The following command will pull and install the latest commit from this repository, along with its Python dependencies.

pip install git+https://github.com/JSchmie/autotranscript.git

Python version: Python 3.9
PyTorch version: PyTorch 1.11.0
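Before installing, it can help to check the local environment against the versions listed above. The following is a minimal sketch, not part of ScrAIbe: the README does not say whether these versions are minimums or exact pins, so the sketch assumes they are minimums, and the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical.

```python
import re
import sys


def parse_version(text):
    """Turn a string such as '1.11.0+cu113' into (1, 11, 0).

    Only the leading numeric part is used; build suffixes are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def check_environment(python_min="3.9", torch_min="1.11.0"):
    """Return a list of problems; an empty list means every check passed.

    Assumption: the versions named in the README are treated as minimums.
    """
    problems = []
    current_python = f"{sys.version_info.major}.{sys.version_info.minor}"
    if not meets_minimum(current_python, python_min):
        problems.append(f"Python {current_python} is older than {python_min}")
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, torch_min):
            problems.append(f"PyTorch {torch.__version__} is older than {torch_min}")
    return problems
```

Calling `check_environment()` and printing the returned list gives a quick report; an empty list suggests the environment satisfies the assumed minimums.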

Usage

ScrAIbe (installed as `autotranscript`) can be used from the command line, as a web server, or through its Python API.

Python usage

from autotranscript import AutoTranscribe

model = AutoTranscribe()

text = model.transcribe("audio.wav")

print(f"Transcription: \n{text}")
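The `AutoTranscribe` usage above handles a single file. As a minimal sketch of how it might be extended to a whole folder, the helper below accepts any callable that maps a file path to text, so `model.transcribe` can be passed in directly. The helper name `transcribe_folder`, the `recordings` folder, and the `*.wav` pattern are assumptions for illustration and are not part of ScrAIbe.

```python
from pathlib import Path


def transcribe_folder(folder, transcribe, pattern="*.wav"):
    """Transcribe every file in `folder` that matches `pattern`.

    `transcribe` is any callable that takes a file path string and returns
    the transcription, for example the `transcribe` method of an
    `AutoTranscribe` instance. Returns a dict mapping each file name to its
    transcription, or to an error message if that file failed.
    """
    results = {}
    for audio_path in sorted(Path(folder).glob(pattern)):
        try:
            # str() mirrors what the f-string in the README example does.
            results[audio_path.name] = str(transcribe(str(audio_path)))
        except Exception as exc:  # keep going if a single file fails
            results[audio_path.name] = f"ERROR: {exc}"
    return results


def main():
    """Example entry point; requires the `autotranscript` package."""
    from autotranscript import AutoTranscribe  # as in the README example

    model = AutoTranscribe()  # load the models once and reuse them
    # "recordings" is a hypothetical folder name.
    for name, text in transcribe_folder("recordings", model.transcribe).items():
        print(f"{name}:\n{text}\n")
```

Passing the callable in, rather than constructing the model inside the helper, means the models are loaded only once and the loop can be exercised with a stand-in function.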

Refer to whisper and pyannote-audio for further options.

Command-line usage

You can also run ScrAIbe with a Gradio app interface using the following command:

autotranscript audio.wav

Some of the most important options are:

--task: Task to be performed: transcription, diarization, or translation into English. Default is transcription.
--hf-token: A Hugging Face token is required to download the models. See the Hugging Face documentation for how to generate one.
--server-name: Name of the web server. If empty, 127.0.0.1 or 0.0.0.0 will be used.
--whisper-model-name: Name of the whisper model to be used. Default is medium.

Run the following to view all available options:

autotranscript -h
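When the command-line tool is driven from a script, the options listed above can be assembled programmatically. The sketch below builds an argument list from the four documented flags and hands it to `subprocess.run`. The function names `build_command` and `run_autotranscript` are hypothetical, and the exact value strings each flag accepts (for example for `--task`) should be checked against `autotranscript -h`.

```python
import subprocess


def build_command(audio_file, task=None, whisper_model=None,
                  hf_token=None, server_name=None):
    """Build the argument list for the `autotranscript` command.

    Only options that are set are added, so the tool's own defaults
    apply to everything else.
    """
    command = ["autotranscript", audio_file]
    options = {
        "--task": task,
        "--whisper-model-name": whisper_model,
        "--hf-token": hf_token,
        "--server-name": server_name,
    }
    for flag, value in options.items():
        if value:
            command += [flag, value]
    return command


def run_autotranscript(audio_file, **options):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    # A list (not a shell string) avoids quoting problems in file names.
    return subprocess.run(build_command(audio_file, **options), check=True)
```

For example, `build_command("audio.wav", whisper_model="medium")` yields `["autotranscript", "audio.wav", "--whisper-model-name", "medium"]`. Note that a token passed with `--hf-token` is visible in the process list while the command runs.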

Documentation

For further details, see the documentation page.

Contributions

We welcome contributions. To contribute, fork the repository and open a merge request with your changes.

Roadmap

The following milestones are planned for the further development of ScrAIbe:

Model quantization: Quantization to improve memory and computational efficiency.
Model fine-tuning: Fine-tuning to cover a wider variety of linguistic phenomena. For example, ScrAIbe currently transcribes word by word but ignores filler words and speech pauses. These phenomena can be addressed by fine-tuning with the corresponding data.
Implementation of LLMs: One example is a summarization or extraction model that lets ScrAIbe automatically summarize a generated transcription or retrieve its key information, for instance to produce the minutes of a meeting.
Executable for Windows

Contact

For queries, contact Jacob Schmieder.

License

ScrAIbe is licensed under (tbd).

Acknowledgments

Special thanks go to the KIDA project and the BMEL (Bundesministerium für Ernährung und Landwirtschaft), especially to the AI Consultancy Team and the Infrastructure Team.