This commit is contained in:
ortizcruz
2023-09-07 16:04:39 +02:00
parent 0283e16aa5
commit 4a7a283854
+44 -13
View File
@@ -1,23 +1,24 @@
# `AutoTranscript`: Fully Automated Transcription using AI # `ScrAIbe: Streamlined Conversation Recording with Automated Intelligence Based Environment`
`AutoTranscript` is a [PyTorch](https://pytorch.org/) based interface speech-to-text tool to generate fully automated transcriptions. AutoTranscript uses AI models containing speaker diarization models:
`ScrAIbe` is a [PyTorch](https://pytorch.org/) based interface speech-to-text tool to generate fully automated transcriptions. AutoTranscript uses AI models containing speaker diarization models:
- [whisper](https://github.com/openai/whisper): A general-purpose speech recognition model. - [whisper](https://github.com/openai/whisper): A general-purpose speech recognition model.
- [payannote-audio](https://github.com/pyannote/pyannote-audio): An open-source toolkit for speaker diarization-. - [payannote-audio](https://github.com/pyannote/pyannote-audio): An open-source toolkit for speaker diarization-.
`AutoTranscript` can be used as a command-line interface, a webserver, or as a Python API. ## Install `ScrAIbe` :
## Install `AutoTranscript` :
The following command will pull and install the latest commit from this repository, along with its Python dependencies. The following command will pull and install the latest commit from this repository, along with its Python dependencies.
pip install https://github.com/JSchmie/autotranscript.git pip install git+https://github.com/JSchmie/autotranscript.git
- **Python version**: Python 3.9 - **Python version**: Python 3.9
- **PyTorch version**: Python 1.11.0 - **PyTorch version**: Python 1.11.0
## Usage examples ## Usage
`AutoTranscript` can be used as a command-line interface, a webserver, or as a Python API.
### Python usage ### Python usage
@@ -32,37 +33,67 @@ print(f"Transcription: \n{text}")
``` ```
Refer to [whisper](https://github.com/openai/whisper) and [payannote-audio](https://github.com/pyannote/pyannote-audio) for further options.
### Command-line usage ### Command-line usage
If you do not want to control the optimization using Python, you also can use the command-line:
You can also run ScrAIbe in a [Gradio App](https://github.com/gradio-app/gradio) interface using the following command-line:
autotranscript audio.wav autotranscript audio.wav
Some example of important functionalities are:
- `--task`: Task to be performed, either transcription, diarization or translation into English. Default is transcription.
- `--hf-token`: To download the models, a Hugging Face token must be generated. Check [Hugging Face](https://huggingface.co/docs/hub/security-tokens) for further information on how to do that.
- `--server-name`: Name of the Web Server. If empty 127.0.0.1 or 0.0.0.0 will be used
- `--whisper-model-name`: Name of the [whisper](https://github.com/openai/whisper) model to be used. Default is `medium`.
Run the following to view all available options: Run the following to view all available options:
autotranscript -h autotranscript -h
### Documentation usage ## Documentation
To access the documentation run the following command from the docs/_build/html directory: For further insights check the [documentation page](https://cristinaortizcruz.github.io/Test/).
python -m http.server ## Contributions
We are happy for any interest in contributing: In order to do that, fork the repo and use merge requests to incorporate your contribution.
## Roadmap ## Roadmap
The following milestones are planned for the further development of ScrAIbe:
- Model quantization - Model quantization
Quantization to empower memory and computational efficiency.
- Model fine-tuning - Model fine-tuning
In order to be able to cover a variety of linguistic phenomena.
For example, currently ScrAIbe is able to transcribe word by word, but ignores filler words or speech pauses.
These phenomena can be addressed by fine-tuning with the corresponding data.
- Implementation of LLMs - Implementation of LLMs
One example is the implementation of a summarization or extraction model, which enables ScrAIbe to automatically summarize or retrieve the key information out of a generated transcription, which could be the minutes of a meeting.
- Executable for Windows - Executable for Windows
## Contact ## Contact
For queries contact Jacob Schmieder at Jacob.Schmieder@dbfz.de For queries contact [Jacob Schmieder](Jacob.Schmieder@dbfz.de)
## License ## License
<!-- licensing missing? Apache 2.0 -->
ScrAIbe is licensed under (tbd).
## Acknowledgments ## Acknowledgments
Special thanks go to the colleagues of the KIDA project - especially the teams in I5 and I2 - and the BMEL (Bundesministerium für Ernährung und Landwirtschaft). <!--add KIDA, MRI, DBFZ, BMEL logos-->
Special thanks go to the KIDA project and the BMEL (Bundesministerium für Ernährung und Landwirtschaft), especially to the AI Consultancy Team and the Infrastructure Team.