EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Language: Python
Total stars: 4220
Stars trend:
2 Mar 2024
9pm ▍ +3
10pm +0
11pm +0
3 Mar 2024
12am ▏ +1
1am ▎ +2
2am ▉ +7
3am ▉ +7
4am ▍ +3
5am ▌ +4
6am █ +8
7am █▍ +11
8am █▍ +11
#python
#evaluationframework, #languagemodel, #transformer
confident-ai/deepeval
The LLM Evaluation Framework
Language: Python
Total stars: 6493
Stars trend:
#python
#evaluationframework, #evaluationmetrics, #llmevaluation, #llmevaluationframework, #llmevaluationmetrics
The LLM Evaluation Framework
Language:Python
Total stars: 6493
Stars trend:
22 May 2025
9am ▏ +1
10am +0
11am +0
12pm ▎ +2
1pm ██▏ +17
2pm █▏ +9
3pm █▏ +9
4pm █▎ +10
5pm ▊ +6
6pm █▏ +9
7pm ▊ +6
8pm █▏ +9
#python
#evaluationframework, #evaluationmetrics, #llmevaluation, #llmevaluationframework, #llmevaluationmetrics
promptfoo/promptfoo
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
Language: TypeScript
Total stars: 6971
Stars trend:
#typescript
#ci, #cicd, #cicd, #evaluation, #evaluationframework, #llm, #llmeval, #llmevaluation, #llmevaluationframework, #llmops, #pentesting, #promptengineering, #prompttesting, #prompts, #rag, #redteaming, #testing, #vulnerabilityscanners
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
Language:TypeScript
Total stars: 6971
Stars trend:
31 May 2025
9pm ▌ +4
10pm ▋ +5
11pm ▏ +1
1 Jun 2025
12am █ +8
1am ▋ +5
2am ▊ +6
3am █▏ +9
4am ▊ +6
5am ▉ +7
6am █▏ +9
7am ▍ +3
8am █▋ +13
#typescript
#ci, #cicd, #evaluation, #evaluationframework, #llm, #llmeval, #llmevaluation, #llmevaluationframework, #llmops, #pentesting, #promptengineering, #prompttesting, #prompts, #rag, #redteaming, #testing, #vulnerabilityscanners