Code Stars

EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Language: Python
Total stars: 4220
Stars trend:

2 Mar 2024
 9pm ▍ +3
10pm  +0
11pm  +0
3 Mar 2024
12am ▏ +1
 1am ▎ +2
 2am ▉ +7
 3am ▉ +7
 4am ▍ +3
 5am ▌ +4
 6am █ +8
 7am █▍ +11
 8am █▍ +11

#python
#evaluationframework, #languagemodel, #transformer

74 views, 09:16

Code Stars

confident-ai/deepeval
The LLM Evaluation Framework
Language: Python
Total stars: 6493
Stars trend:

22 May 2025
 9am ▏ +1
10am  +0
11am  +0
12pm ▎ +2
 1pm ██▏ +17
 2pm █▏ +9
 3pm █▏ +9
 4pm █▎ +10
 5pm ▊ +6
 6pm █▏ +9
 7pm ▊ +6
 8pm █▏ +9

#python
#evaluationframework, #evaluationmetrics, #llmevaluation, #llmevaluationframework, #llmevaluationmetrics

94 views, 21:18

Code Stars

promptfoo/promptfoo
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
Language: TypeScript
Total stars: 6971
Stars trend:

31 May 2025
 9pm ▌ +4
10pm ▋ +5
11pm ▏ +1
1 Jun 2025
12am █ +8
 1am ▋ +5
 2am ▊ +6
 3am █▏ +9
 4am ▊ +6
 5am ▉ +7
 6am █▏ +9
 7am ▍ +3
 8am █▋ +13

#typescript
#ci, #cicd, #evaluation, #evaluationframework, #llm, #llmeval, #llmevaluation, #llmevaluationframework, #llmops, #pentesting, #promptengineering, #prompttesting, #prompts, #rag, #redteaming, #testing, #vulnerabilityscanners

96 views, 09:17
