Grok 3: Comparison in Math, Science, and Coding with Other AI Models
At the Grok 3 launch event, a detailed benchmark comparison was presented, showcasing the performance of various AI models across different domains.
The chart highlighted the capabilities of Grok-3, Grok-3 mini, Gemini Pro, DeepSeek-V3, Claude 3.5 Sonnet, and GPT-4o in Math (AIME '24), Science (GPQA), and Coding (LCB Oct-Feb). Grok-3 demonstrated a strong performance, particularly in Science with a score of 75, while Grok-3 mini showed a balanced performance across all three categories.
This visual representation provided a clear insight into how Grok-3 stacks up against its competitors in these specialized areas, emphasizing its advanced capabilities in the AI landscape.
@webthreeth
Reasoning + Test-Time Compute: Grok-3 vs Other Models
During the Grok 3 launch event, a comparison of various AI models' reasoning capabilities was presented in the "Reasoning + Test-Time Compute" benchmark chart.
This chart illustrated the performance of Grok-3 Reasoning Beta, Grok-3 mini Reasoning, o3-mini (high), o1, DeepSeek-R1, and Gemini-2 Flash Thinking across three domains: Math (AIME '24), Science (GPQA), and Coding (LCB Oct-Feb).
Grok-3 Reasoning Beta led with impressive scores of 96 in Math, 85 in Science, and 79 in Coding, demonstrating its superior reasoning capabilities. The comparison highlighted the advancements in AI reasoning and computational efficiency, setting a new benchmark in the field.
@webthreeth
SkyworkAI released the best video generation AI model to date. SkyReels V1 540p, developed by SkyworkAI, stands out from other video generation models due to its superior performance across various metrics.
According to the comparison data, SkyReels V1 achieves the highest overall score of 82.43, surpassing models like VideoCrafter-2.0 VEnhancer and CogVideoX1.5-5B.
Fine-tuned from HunyuanVideo on 10M film and TV clips, the T2V model delivers advanced facial animation, capturing 33 distinct expressions with over 400 natural movement combinations.
@webthreeth
A Lesson for Ethiopian AI Developers: France's Mistral AI Launches Mistral Saba 24B, an AI Model Designed Specifically for the MENA Region
Mistral AI has launched Mistral Saba, a 24-billion-parameter AI model specifically designed for the Middle Eastern and South Asian markets. This model focuses on enhancing interactions in Arabic, offering a culturally sensitive approach that general-purpose models often lack. It's trained on datasets from these regions, providing more accurate and relevant responses in Arabic than larger, more general models.
Interestingly, due to the cultural and linguistic overlap between the Middle East and South Asia, Saba also performs well with Indian-origin languages, particularly South Indian languages like Tamil and Malayalam.
This release represents Mistral AI's strategic move to cater to the unique needs of these markets, offering a solution that is both efficient in terms of speed (processing over 150 tokens per second) and cost-effective.
@webthreeth
Solana's Price Plunge: The Bursting of the Meme Coin Bubble
The Solana (SOL) price has experienced a significant decline of 40% over the past month, primarily driven by the bursting of the meme coin bubble. The initial surge in SOL's price was fueled by the popularity of meme coins on the Solana blockchain, with projects like Pump.fun becoming hotbeds for minting and trading these speculative assets.
However, as the hype around meme coins, including high-profile tokens like MELANIA and LIBRA, faded, investor confidence waned. Reports of insider trading and pump-and-dump schemes further exacerbated the downturn, leading to a substantial loss in value and trading volume for Solana. This has resulted in SOL falling below $170, its lowest since early November, reflecting a broader market sentiment shift away from meme coin speculation.
@webthreeth
According to LM Arena, Grok 3 leads in every category measured. Let's see how competitors respond over the next couple of weeks.
@webthreeth
SWE-Lancer: Benchmarking AI in Real-World Software Engineering
SWE-Lancer is a new benchmark introduced by OpenAI to evaluate the performance of AI models in real-world freelance software engineering tasks.
It comprises over 1,400 tasks sourced from Upwork, collectively valued at $1 million in real-world payouts. SWE-Lancer tests models on a wide range of tasks from basic bug fixes to complex feature implementations, spanning the full engineering stack from UI/UX to systems design. It assesses both independent coding tasks and managerial decisions, where models must choose between different implementation proposals.
Despite the advanced capabilities of current frontier models, they struggle to solve the majority of these tasks, highlighting the gap between AI's potential and its practical application in complex software engineering scenarios.
@webthreeth
I hate that politicians got involved and killed the Solana based Memecoins. They really destroyed the chance to build generational wealth. I had bought 1 SOL, and it is tanking like 🤣
Google Launches Its AI Co-Scientist to Transform Biomedical Research
Google's AI Co-Scientist, unveiled on February 19, 2025, is an innovative tool built on the Gemini 2.0 platform, designed to accelerate scientific discovery by assisting researchers in biomedical fields.
This multi-agent AI system can generate novel hypotheses, create detailed research plans, and summarize relevant literature based on natural language inputs from scientists, such as exploring new applications for existing drugs or understanding disease mechanisms.
Unlike fully autonomous systems, it acts as a collaborative partner, enhancing human expertise rather than replacing it, and has been tested by researchers at institutions like Stanford University and Imperial College London, showing promise in areas like liver fibrosis research.
@webthreeth
xAI Makes Grok 3 Free for Everyone: Unlock the Power of AI by trying new features
xAI announced via a post on X that Grok 3, described as "the world’s smartest AI," is now available for free to all users—until the servers melt, as humorously noted in the post. The post also highlights that X Premium+ and SuperGrok users will enjoy increased access to Grok 3, along with early access to advanced features like Voice Mode.
Trained on xAI’s Colossus supercomputer, which utilizes over 200,000 GPUs, with 10 times the computing power used for its predecessor, Grok 2, Grok 3 features advanced reasoning modes like "Think" and "Big Brain," as well as a new DeepSearch tool for real-time research and summarization.
@webthreeth
Majorana 1 Chip Unveiled: Microsoft Harnesses Fourth State of Matter for Quantum Leap Forward
The Majorana 1 chip, unveiled by Microsoft on February 19, 2025, marks a revolutionary step in quantum computing by introducing topological qubits that rely on a newly created fourth state of matter. Developed using advanced materials known as topoconductors, this quantum processing unit harnesses this exotic state—beyond the familiar solid, liquid, and gas—to enable smaller, faster, and more stable qubits.
The design aims to address long-standing challenges in quantum systems, such as error rates and scalability. Named after physicist Ettore Majorana, the chip uses exotic quasiparticles to create a robust computational framework, potentially enabling the development of a million-qubit processor.
@webthreeth
Shifting Tides: Bitcoin Ownership Trends in 2024
In 2024, the distribution of Bitcoin ownership saw significant shifts among various entity types, as depicted in the bar chart.
Funds and ETFs, along with businesses, increased their Bitcoin holdings, indicating a rise in institutional adoption. Meanwhile, governments and the "Other" category, which includes Bitcoin to be mined, holdings in smart contracts, and estimated lost Bitcoin, saw reductions in their ownership.
Most notably, individuals experienced a substantial decline in their Bitcoin holdings, suggesting a broader trend where institutional and corporate entities are increasingly dominating the cryptocurrency landscape while individual ownership diminishes.
@webthreeth
According to Reports, OpenAI's GPT-4.5 Set to Launch Next Week, GPT-5 Slated for May
OpenAI is gearing up for two major releases in the near future: GPT-4.5, codenamed Orion, is slated to launch as early as next week, while GPT-5 is anticipated in late May, promising significant updates including the integration of the o3 reasoning model.
GPT-4.5 will mark the final non-chain-of-thought model from OpenAI, as the company shifts its focus toward unifying its o-series and GPT-series models with GPT-5, aiming to deliver a more seamless user experience.
This unification effort is part of OpenAI’s broader strategy to reduce confusion around model selection by moving toward a singular, cohesive AI intelligence system.
@webthreeth
Google Veo 2 Makes Historic Debut Exclusively on Freepik
Google Veo 2, an advanced AI video generation model developed by Google DeepMind, made its global debut today, February 21, 2025, through an exclusive partnership with Freepik, as announced in an X post.
Veo 2 builds on its initial December 2024 launch via Google Labs’ VideoFX, now reaching a broader audience with Freepik offering the first 10,000 users two free video generations.
@webthreeth
DeepTutor by Opennote: Revolutionizing Personalized Learning with AI
DeepTutor by Opennote is an innovative AI-powered deep research tool designed specifically for learning, launched on February 21, 2025.
Billed as the world’s first and fastest of its kind, it aims to transform the internet into a personalized tutor for users. Developed by the team at Opennote, founded by UCLA students Obby and Rishi, DeepTutor leverages custom AI models to provide highly accurate, tailored responses to open-ended queries, fostering a deeper understanding of complex topics.
Building on Opennote’s mission to revolutionize STEM education through interactive and personalized tools, DeepTutor integrates seamlessly with the platform’s ecosystem, which already includes features like animated video lessons and real-time visualization tools.
This tool represents a significant step toward making advanced, research-driven education accessible and efficient for learners worldwide.
@webthreeth
Introducing Lucy: Ethiopia’s First Multilingual AI Voice Assistant & Chatbot
Lucy, Ethiopia’s first AI-powered multilingual voice assistant and chatbot, has been officially launched. Developed to enhance accessibility and bridge language barriers, Lucy supports Amharic, Afaan Oromo, and Tigrinya—the country’s three most spoken languages—alongside English and over 100 international languages.
The AI assistant offers real-time news updates, business process guidance, speech-to-speech translation, and curated Ethiopian-specific information. Designed to drive digital transformation, Lucy also provides support for visually impaired communities by assisting with daily tasks through AI.
The project, led by Zemenu, was made possible through collaborations with Microsoft Azure, Goethe-Institut Äthiopien, Balchut Creatives, BITS College, and Creative Hub Ethiopia.
@webthreeth
Bybit Breach: North Korea’s Lazarus Group Steals $1.5B in Crypto Hack
Bybit, one of the largest cryptocurrency exchanges, suffered a major security breach, marking it as potentially the largest crypto hack in history. Hackers stole approximately $1.4 to $1.5 billion worth of Ethereum (ETH) and related tokens, such as stETH and mETH, from the exchange’s Ethereum cold wallet.
The attack involved a sophisticated method where the attackers manipulated the user interface (UI) during a routine transfer from a cold wallet to a warm wallet, masking the signing interface to display a legitimate address while altering the underlying smart contract logic. This deception tricked wallet signers into approving a malicious transaction, allowing the hackers to gain control of the wallet and transfer the funds to unidentified addresses.
The hack triggered a significant outflow of funds: over $5 billion in total assets was withdrawn from the platform as users fearing insolvency staged a “bank run.”
@webthreeth
DeepSeek Kicks Off OpenSourceWeek with FlashMLA: Revolutionizing AI on Hopper GPUs
In a groundbreaking move during OpenSourceWeek, DeepSeek, a leading Chinese AI startup, unveiled FlashMLA, an optimized decoding kernel for NVIDIA Hopper GPUs, as highlighted in their X post on February 24, 2025. This innovation, now in production, enhances the efficiency of machine learning applications by supporting variable-length sequences with BF16 precision and a paged KV cache (block size 64), delivering an impressive 3000 GB/s memory bandwidth and 580 TFLOPS compute performance on the H800 GPU.
This release underscores DeepSeek’s commitment to advancing open-source AI, positioning it as a formidable player against major American AI developers, with its cost-effective and high-performing solutions already gaining attention for rivaling top models like those from OpenAI, as noted in recent tech reports.
GitHub link: https://github.com/deepseek-ai/FlashMLA
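For readers wondering what a "paged KV cache (block size 64)" means in practice, here is a minimal Python sketch of the general idea. It is not FlashMLA's code or API. The function names (`blocks_needed`, `allocate_block_tables`, `locate`) are hypothetical and exist only for this illustration; the only value taken from the post is the block size of 64. The idea is that each sequence's cache is split into fixed-size blocks, and a per-sequence block table maps a token position to a physical block and an offset, so variable-length sequences do not need one contiguous buffer each.

```python
from typing import List, Tuple

BLOCK_SIZE = 64  # block size cited in the post; everything else here is illustrative


def blocks_needed(seq_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of fixed-size blocks required to hold seq_len cached tokens."""
    return (seq_len + block_size - 1) // block_size


def allocate_block_tables(seq_lens: List[int], block_size: int = BLOCK_SIZE) -> List[List[int]]:
    """Hand out physical block ids to each sequence, one block table per sequence.

    Assumption: blocks are taken from a simple incrementing counter. A real
    allocator would reuse freed blocks; this sketch does not.
    """
    tables = []
    next_free = 0
    for n in seq_lens:
        k = blocks_needed(n, block_size)
        tables.append(list(range(next_free, next_free + k)))
        next_free += k
    return tables


def locate(token_idx: int, block_table: List[int], block_size: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Map a token position within a sequence to (physical block id, offset in block)."""
    logical_block, offset = divmod(token_idx, block_size)
    return block_table[logical_block], offset


if __name__ == "__main__":
    # Two sequences of different lengths share one pool of blocks.
    tables = allocate_block_tables([130, 10])
    print(tables)
    print(locate(129, tables[0]))
```

A 130-token sequence needs three blocks while a 10-token sequence needs one, which is why paging suits the variable-length sequences the post mentions: memory is reserved per block rather than per maximum sequence length.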
Bitcoin Mining Costs: A Global Divide
The cost of electricity required to mine 1 Bitcoin varies dramatically worldwide, as shown in this striking map. Ethiopia stands out as the least expensive at $2,506, thanks to its access to affordable hydroelectric power, while Ireland tops the chart at over $321,112, driven by its high energy prices and reliance on imported fuels.
The map’s color gradient—from green for the cheapest regions to red for the most expensive—reveals a clear global divide, with countries in Africa and parts of Asia offering significant cost advantages, while many in Europe and North America face steep expenses, shaping where Bitcoin mining thrives.
Source - NFT Evening
@webthreeth
Frontier Model Intelligence Over Time by AI Lab
The graph, spanning from late 2022 to early 2025, shows a clear upward trend in intelligence scores, with models like OpenAI's GPT-4 Turbo, GPT-4o (May 2024), and xAI's Grok 3 Reasoning Beta demonstrating significant advancements. Notable models include OpenAI's GPT-3.5 Turbo and GPT-4, Google's offerings, DeepSeek's models, Anthropic's contributions, Meta's developments, and xAI's Grok series, with Grok 3 Reasoning Beta reaching the highest intelligence index of 75 as of February 21, 2025, based on lab claims for unreleased models.
The chart highlights the rapid evolution and competition in AI development, with xAI's Grok models, including Grok 1, Grok 2, and Grok 3, showing substantial improvements over time.
@webthreeth