diff --git a/README.md b/README.md index b3bf01f..0c963f9 100644 --- a/README.md +++ b/README.md @@ -1,23 +1,24 @@ -# `AutoTranscript`: Fully Automated Transcription using AI +# `ScrAIbe: Streamlined Conversation Recording with Automated Intelligence Based Environment` -`AutoTranscript` is a [PyTorch](https://pytorch.org/) based interface speech-to-text tool to generate fully automated transcriptions. AutoTranscript uses AI models containing speaker diarization models: + +`ScrAIbe` is a [PyTorch](https://pytorch.org/) based interface speech-to-text tool to generate fully automated transcriptions. AutoTranscript uses AI models containing speaker diarization models: - [whisper](https://github.com/openai/whisper): A general-purpose speech recognition model. - [payannote-audio](https://github.com/pyannote/pyannote-audio): An open-source toolkit for speaker diarization-. -`AutoTranscript` can be used as a command-line interface, a webserver, or as a Python API. - -## Install `AutoTranscript` : +## Install `ScrAIbe` : The following command will pull and install the latest commit from this repository, along with its Python dependencies. - pip install https://github.com/JSchmie/autotranscript.git + pip install git+https://github.com/JSchmie/autotranscript.git - **Python version**: Python 3.9 - **PyTorch version**: Python 1.11.0 -## Usage examples +## Usage + +`AutoTranscript` can be used as a command-line interface, a webserver, or as a Python API. ### Python usage @@ -32,37 +33,67 @@ print(f"Transcription: \n{text}") ``` +Refer to [whisper](https://github.com/openai/whisper) and [payannote-audio](https://github.com/pyannote/pyannote-audio) for further options. + ### Command-line usage -If you do not want to control the optimization using Python, you also can use the command-line: + +You can also run ScrAIbe in a [Gradio App](https://github.com/gradio-app/gradio) interface using the following command-line: autotranscript audio.wav +Some example of important functionalities are: + +- `--task`: Task to be performed, either transcription, diarization or translation into English. Default is transcription. +- `--hf-token`: To download the models, a Hugging Face token must be generated. Check [Hugging Face](https://huggingface.co/docs/hub/security-tokens) for further information on how to do that. +- `--server-name`: Name of the Web Server. If empty 127.0.0.1 or 0.0.0.0 will be used +- `--whisper-model-name`: Name of the [whisper](https://github.com/openai/whisper) model to be used. Default is `medium`. + + Run the following to view all available options: autotranscript -h -### Documentation usage +## Documentation -To access the documentation run the following command from the docs/_build/html directory: +For further insights check the [documentation page](https://cristinaortizcruz.github.io/Test/). - python -m http.server +## Contributions + +We are happy for any interest in contributing: In order to do that, fork the repo and use merge requests to incorporate your contribution. ## Roadmap -- Model quantization -- Model fine-tuning -- Implementation of LLMs +The following milestones are planned for the further development of ScrAIbe: + +- Model quantization +Quantization to empower memory and computational efficiency. + +- Model fine-tuning +In order to be able to cover a variety of linguistic phenomena. + +For example, currently ScrAIbe is able to transcribe word by word, but ignores filler words or speech pauses. +These phenomena can be addressed by fine-tuning with the corresponding data. + +- Implementation of LLMs +One example is the implementation of a summarization or extraction model, which enables ScrAIbe to automatically summarize or retrieve the key information out of a generated transcription, which could be the minutes of a meeting. + - Executable for Windows ## Contact -For queries contact Jacob Schmieder at Jacob.Schmieder@dbfz.de +For queries contact [Jacob Schmieder](Jacob.Schmieder@dbfz.de) ## License + +ScrAIbe is licensed under (tbd). + ## Acknowledgments -Special thanks go to the colleagues of the KIDA project - especially the teams in I5 and I2 - and the BMEL (Bundesministerium für Ernährung und Landwirtschaft). + + +Special thanks go to the KIDA project and the BMEL (Bundesministerium für Ernährung und Landwirtschaft), especially to the AI Consultancy Team and the Infrastructure Team. +