Code Stars
1.93K subscribers
9.34K photos
9.64K links
Code Stars alerts you to GitHub repos gaining stars rapidly. Stay ahead of the curve and discover trending projects before they go viral! #AI #GitHub #OpenSource #Tech #MachineLearning #Python #Programming #Java #Javascript #React #Docker #Devops
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift
Language:C++
Total stars: 1120
Stars trend:
6 Jun 2024
4pm ▏ +1
5pm ▏ +1
6pm +0
7pm +0
8pm +0
9pm +0
10pm +0
11pm +0
7 Jun 2024
12am ▍ +3
1am ████▌ +36
2am ███▎ +26
3am ██ +16

#cplusplus
#aarch64, #android, #arm32, #asr, #cpp, #csharp, #dotnet, #ios, #linux, #macos, #mfc, #onnx, #openkylin, #raspberrypi, #riscv, #speechtotext, #texttospeech, #vits, #windows
ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Language:Python
Total stars: 389
Stars trend:
17 Jun 2024
9pm ▏ +1
10pm ▏ +1
11pm ▎ +2
18 Jun 2024
12am +0
1am ▋ +5
2am ▍ +3
3am █▍ +11
4am ███ +24
5am █▋ +13
6am █ +8
7am ▉ +7

#python
#allinone, #asr, #audioprocessing, #machinetranslation, #nonautoregressive, #seamless, #simultaneoustranslation, #speech, #speechenhancement, #speechprocessing, #speechrecognition, #speechsynthesis, #speechtotext, #speechtranslation, #streamingaudio, #texttoaudio, #texttospeech, #translation, #tts, #voice
PeterH0323/Streamer-Sales
Streamer-Sales Top Sales - Sales Anchor LLM 🛒🎁, a large language model for sales streamers that explains products in a way intended to stimulate users' purchase intent, based on the given product characteristics. 🚀⭐️ Includes a detailed data generation process❗️ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS text-to-speech 🔊, digital human generation, an Agent that queries real-time information over the network 🌐, and ASR speech-to-text 🎙
Language:Python
Total stars: 470
Stars trend:
24 Jun 2024
11pm ▏ +1
25 Jun 2024
12am ▏ +1
1am ██ +16
2am ███▋ +29
3am ██▍ +19
4am █▏ +9

#python
#asr, #chat, #chatapplication, #chatbot, #chatgpt, #digitalhuman, #gpt, #internlmchat7b, #internlm2, #llm, #metahuman, #rag, #textgeneration, #tts
harry0703/AudioNotes
Quickly extracts audio and video content and organizes it into structured Markdown notes
Language:Python
Total stars: 194
Stars trend:
22 Jul 2024
12am ▌ +4
1am ▎ +2
2am ▍ +3
3am ▎ +2
4am █ +8
5am ██ +16
6am █▉ +15
7am ██ +16
8am █▎ +10

#python
#ai, #asr, #funasr, #ollama, #python, #qwen2, #whisper
yeyupiaoling/MASR
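A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, as well as multiple data augmentation methods.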
Language:Python
Total stars: 580
Stars trend:
3 Aug 2024
2pm ███████████▎ +90
3pm +0
4pm +0
5pm +0
6pm +0
7pm +0
8pm +0
9pm +0
10pm +0
11pm +0
4 Aug 2024
12am +0
1am ▏ +1

#python
#asr, #conformer, #deeplearning, #deepspeech, #pytorch, #speech, #speechrecognition, #speechtotext, #squeezeformer
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language:Python
Total stars: 10840
Stars trend:
3 Sep 2024
9am ▏ +1
10am +0
11am +0
12pm ▏ +1
1pm ▏ +1
2pm ▌ +4
3pm ▋ +5
4pm ▌ +4
5pm █▎ +10
6pm ██▏ +17
7pm ██▎ +18
8pm ███▏ +25

#python
#asr, #speech, #speechrecognition, #speechtotext, #whisper
NexaAI/nexa-sdk
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), automatic speech recognition (ASR), and text-to-speech (TTS) capabilities.
Language:Python
Total stars: 1085
Stars trend:
22 Sep 2024
10pm █ +8
11pm ▊ +6
23 Sep 2024
12am ▍ +3
1am ▊ +6
2am █▎ +10
3am ▋ +5
4am █ +8
5am ▌ +4
6am █▏ +9
7am ▌ +4
8am ▌ +4
9am █▏ +9

#python
#asr, #edgecomputing, #languagemodel, #llm, #ondeviceai, #ondeviceml, #sdk, #sdkpython, #stablediffusion, #transformers, #tts, #vlm, #whisper
TEN-framework/TEN-Agent
TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API and RTC, featuring weather checks, web search, vision, and RAG capabilities.
Language:Python
Total stars: 1252
Stars trend:
25 Oct 2024
10pm ▏ +1
11pm ▏ +1
26 Oct 2024
12am ▎ +2
1am ▊ +6
2am ██ +16
3am ██ +16
4am █▍ +11
5am ▊ +6
6am █▍ +11
7am ▏ +1
8am █▌ +12

#python
#agent, #ai, #asr, #cpp, #gemini, #golang, #gpt4, #gpt4o, #llm, #lowlatency, #multimodal, #nextjs14, #openai, #python, #rag, #realtime, #tts, #vision, #voiceassistant
abus-aikorea/voice-pro
Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports YouTube Downloader, Vocal Remover, Transcription, Text-to-Speech, and Translation.
Language:Python
Total stars: 385
Stars trend:
9 Nov 2024
10pm ▏ +1
11pm ▌ +4
10 Nov 2024
12am ▎ +2
1am █▏ +9
2am ██▏ +17
3am █▎ +10
4am ▉ +7
5am ▊ +6
6am ▍ +3
7am ▌ +4
8am ▌ +4
9am █ +8

#python
#asr, #demucs, #fasterwhisper, #gradio, #speechrecognition, #speechsynthesis, #speechtotext, #stt, #subtitles, #texttospeech, #transcription, #translate, #translation, #translator, #tts, #uvr5, #webui, #whisper, #ytdlp
TEN-framework/TEN-Agent
TEN Agent is a conversational AI powered by TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compatible with popular workflow platforms like Dify and Coze.
Language:Python
Total stars: 4121
Stars trend:
28 Jan 2025
10am ▎ +2
11am ▉ +7
12pm ▉ +7
1pm ▉ +7
2pm █▏ +9
3pm █▌ +12
4pm █▍ +11
5pm ▋ +5
6pm █▌ +12
7pm ▌ +4

#python
#agent, #ai, #asr, #cpp, #gemini, #golang, #gpt4, #gpt4o, #llm, #lowlatency, #multimodal, #nextjs14, #openai, #python, #rag, #realtime, #tts, #vision, #voiceassistant
umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!
Language:C#
Total stars: 838
Stars trend:
12 Apr 2025
3am ▎ +2
4am ▎ +2
5am ▍ +3
6am ▏ +1
7am ▌ +4
8am ▏ +1
9am ▏ +1
10am ▍ +3
11am █▎ +10
12pm ████▏ +33
1pm ▍ +3
2pm █▋ +13

#csharp
#asr, #csharp, #fasterwhisper, #flyleaf, #languagelearning, #llm, #mediaplayer, #ocr, #ollama, #player, #video, #videoplayer, #whisper, #wpf, #ytdlp
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language:Python
Total stars: 15449
Stars trend:
6 May 2025
4am ▊ +6
5am █▏ +9
6am ▊ +6
7am ▌ +4
8am █▏ +9
9am ▊ +6
10am ▋ +5
11am █▏ +9
12pm ▍ +3
1pm ▍ +3
2pm █▎ +10
3pm ▉ +7

#python
#asr, #speech, #speechrecognition, #speechtotext, #whisper
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language:Python
Total stars: 13977
Stars trend:
8 May 2025
11am ▉ +7
12pm █▉ +15
1pm ▉ +7
2pm █▏ +9
3pm ▉ +7
4pm ▉ +7
5pm ▊ +6
6pm ▋ +5
7pm ▍ +3
8pm ▌ +4
9pm ▍ +3
10pm ▉ +7

#python
#asr, #deeplearning, #generativeai, #largelanguagemodels, #machinetranslation, #multimodal, #neuralnetworks, #speakerdiariazation, #speakerrecognition, #speechsynthesis, #speechtranslation, #tts
alphacep/vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Language:Jupyter Notebook
Total stars: 10057
Stars trend:
7 Jun 2025
7pm ▍ +3
8pm ▋ +5
9pm ▎ +2
10pm ▊ +6
11pm ▉ +7
8 Jun 2025
12am ▉ +7
1am ▉ +7
2am ▉ +7
3am █ +8
4am █▎ +10
5am ▋ +5
6am █▏ +9

#jupyternotebook
#android, #asr, #deeplearning, #deepneuralnetworks, #deepspeech, #googlespeechtotext, #ios, #kaldi, #offline, #privacy, #python, #raspberrypi, #speakeridentification, #speakerverification, #speechrecognition, #speechtotext, #speechtotextandroid, #stt, #voicerecognition, #vosk
jdepoix/youtube-transcript-api
This is a Python API that lets you get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions!
Language:Python
Total stars: 4231
Stars trend:
11 Jun 2025
3pm ▉ +7
4pm ▉ +7
5pm ▌ +4
6pm ▋ +5
7pm █▎ +10
8pm ▍ +3
9pm ▎ +2
10pm ▉ +7
11pm ▍ +3
12 Jun 2025
12am ▊ +6
1am ██ +16
2am █▋ +13

#python
#asr, #captions, #cli, #python, #subtitle, #subtitles, #transcript, #transcripts, #translatingtranscripts, #youtube, #youtubeapi, #youtubeasr, #youtubecaptions, #youtubesubtitles, #youtubetranscript, #youtubetranscripts, #youtubevideo