Dominance of the United States in Notable AI Models in 2024
According to the 2025 AI Index Report, which draws on Epoch AI data, the United States led the world in developing notable AI models in 2024 with 40 models, far ahead of every other country. China followed with 15 models, while France contributed 3.
Canada, Israel, Saudi Arabia, and South Korea each produced 1 notable AI model, highlighting a significant gap in AI innovation across different geographic areas. This data underscores the U.S.'s dominant position in AI research and development, likely driven by its robust tech ecosystem, substantial investments, and concentration of leading AI companies.
@webthreeth
Significant Decline in AI Inference Costs Across Benchmarks (2022-2024)
Between 2022 and 2024, the cost of running inference on large language models (LLMs) fell dramatically, reflecting substantial gains in AI efficiency.
In November 2022, reaching GPT-3.5-level performance (a score of 64.8 on MMLU, a widely used benchmark of language model capabilities) cost $20.00 per million tokens. By October 2024, models like Gemini-1.5-Flash-8B delivered the same level for just $0.07 per million tokens, a more than 280-fold reduction in under two years.
A parallel trend is observed in the more rigorous GPQA benchmark, where models scoring above 50% saw inference costs drop from $15 per million tokens in May 2024 to $0.12 per million tokens by December 2024, as exemplified by Phi 4. According to Epoch AI, LLM inference costs have been declining at a rate of nine to 900 times per year, depending on the specific task.
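The fold reductions quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this post, the 23-month gap (November 2022 to October 2024) is an assumption about how to count the interval, and the naive two-point annualization is not Epoch AI's methodology.

```python
def fold_reduction(start_cost: float, end_cost: float) -> float:
    """How many times cheaper the end price is than the start price."""
    return start_cost / end_cost


def annualized_factor(fold: float, months: float) -> float:
    """Naive per-year decline factor, assuming a constant exponential rate."""
    return fold ** (12.0 / months)


# Prices per million tokens, taken from the post above.
mmlu_fold = fold_reduction(20.00, 0.07)  # GPT-3.5-level MMLU performance
gpqa_fold = fold_reduction(15.00, 0.12)  # GPQA score above 50%

# Assumption: November 2022 to October 2024 is counted as 23 months.
mmlu_annual = annualized_factor(mmlu_fold, 23)

print(f"MMLU-level cost: {mmlu_fold:.1f}x cheaper overall")
print(f"GPQA-level cost: {gpqa_fold:.1f}x cheaper overall")
print(f"MMLU-level cost: roughly {mmlu_annual:.0f}x cheaper per year")
```

The MMLU figure comes out slightly above 280, which matches the "280-fold" wording in the report.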
@webthreeth
Google's Gemini Coder: A New Frontier in AI-Driven Programming Excellence
Google's upcoming "Gemini Coder" is poised to be a notable addition to the Gemini model family. It will likely build on the coding capabilities of Gemini 2.5 Pro, which, according to Google DeepMind, already excels at reasoning and web development tasks and offers a 1-million-token context window.
The claim that it will be the "world’s best coding model" suggests it may outperform existing models on benchmarks like SWE-bench, where Gemini 2.5 Pro has already set a high standard using Google’s own scaffolding and re-scoring methods.
@webthreeth
DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level
The Agentica Project is an open-source initiative at UC Berkeley that aims to democratize reinforcement learning (RL) techniques and develop scalable systems for large language models (LLMs) and agents. It recently unveiled DeepCoder-14B-Preview, a fully open-source, 14-billion-parameter model that rivals OpenAI's o1 and o3-mini on coding and math tasks. The model achieves 60.6% Pass@1 accuracy on LiveCodeBench, an 8% improvement over its base model, DeepSeek-R1-Distill-Qwen-14B.
What sets this release apart is the project's commitment to transparency, sharing not only the model but also the dataset, code, and training recipe, empowering the community to replicate or further develop it.
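For readers unfamiliar with the metric, Pass@1 is the share of problems solved by a single sampled solution. Below is a minimal sketch of the commonly used unbiased pass@k estimator, assuming n sampled solutions per problem of which c pass all tests. The function names and the sample counts are illustrative and are not taken from the DeepCoder evaluation code, whose exact procedure is not described in this post.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every possible set of k samples contains at least one correct one.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical results for three problems: (samples drawn, samples that passed).
example_results = [(10, 6), (10, 0), (10, 10)]
print(f"pass@1 = {benchmark_pass_at_k(example_results, k=1):.3f}")
```

With k = 1 the estimator reduces to c / n per problem, so a benchmark-level Pass@1 is simply the average fraction of correct samples.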
Model - https://huggingface.co/agentica-org/DeepCoder-14B-Preview
By the way, would you like me to do an analysis of the different types of open-source releases found in AI? If so, please like the post.
@webthreeth
Gemini 2.5 Pro Powers Ahead: Deep Research Outshines OpenAI in User Preference Evaluations
Google's Deep Research feature, powered by the Gemini 2.5 Pro model since April 8, 2025, outperforms OpenAI’s Deep Research in Google’s own evaluations.
As shown in the attached comparison chart, human testers preferred Gemini Deep Research 69% of the time overall, compared to 30.1% for OpenAI’s. The chart also breaks down Gemini 2.5 Pro’s edge in specific areas: 60.6% preference in instruction following (versus OpenAI’s 39.4%), 76.3% in comprehensiveness (versus 23.1%), 73.3% in completeness (versus 26.7%), and 58.2% in writing quality (versus 41.8%).
@webthreeth
Dire Wolves Resurrected, Meme Coins Unleashed: The $13.6M REMUS Frenzy Before Being Rug-Pulled
The introduction of REMUS, a meme coin based on the Solana blockchain, was triggered by Colossal Biosciences' announcement regarding the genetic revival of dire wolves, extinct for 10,000 years. This news led to a rapid increase in the coin's market capitalization, reaching an impressive $13.6 million within an hour.
However, the price declined as the initial excitement faded. Similar projects like "Romulus" and "Khaleesi" struggled from the start, collectively reaching only about $2 million in market value. One trader notably turned an initial investment of $1,000 into a profit of $108,000. The episode highlights both the volatility of the meme coin market and the fear of missing out (FOMO) that drives enthusiasm around these coins.
@webthreeth
GSMA Launches Report on Ethiopia’s Emerging AI Landscape
Ethiopia is poised to leverage the transformative power of Artificial Intelligence (AI) across various industries, as outlined in the latest GSMA report, “AI in Ethiopia: Promising use cases for development,” released on April 8, 2025.
This comprehensive study delves into the country’s burgeoning AI ecosystem, highlighting its potential to revolutionize sectors such as agriculture, healthcare, education, and entrepreneurship. The report examines innovative case studies of pioneering public and private organizations leading the charge in AI deployment, while also addressing the challenges and opportunities tied to AI talent, data availability, and computational capacity.
With detailed insights into key sectors primed for AI impact and actionable recommendations for policymakers, investors, and AI leaders, the report offers a roadmap for Ethiopia. You can find the study at the link below:
https://lnkd.in/dcbV79ce
@webthreeth
Harmonizing Innovation: Google Might Soon Release Its Text-to-Audio Model on Gemini
Lyria, Google DeepMind’s advanced AI music generation model, is set to revolutionize how we create music by turning text prompts into high-quality songs with vocals and instrumentals. Introduced in November 2023, it’s already showing promise with features like Dream Track on YouTube Shorts and Music AI Tools, allowing users to craft custom soundtracks or transform hums into full compositions.
As of April 2025, Lyria has been integrated into Google’s Vertex AI platform, which may signal that a wider public release is coming soon and that the tool will become accessible beyond its initial limited testing phase with select creators and artists.
@webthreeth
Firebase Studio Unveiled: Google Launches Agentic Development Platform on a Trial Basis in April 2025
Firebase Studio, launched by Google in April 2025, is a cloud-based, agentic development environment designed to streamline the creation of full-stack AI-infused applications. Accessible directly from a browser, it integrates tools like Project IDX, Genkit, and Gemini in Firebase, enabling developers to rapidly prototype, build, and deploy production-quality apps, including APIs, backends, frontends, and mobile interfaces.
With support for natural language prompts, a variety of frameworks like Next.js, and features such as real-time collaboration and one-click publishing via Firebase App Hosting, it caters to both novice and experienced developers. Currently in preview, Firebase Studio offers up to three free workspaces per user, with expanded options available through the Google Developer Program, making it a versatile and powerful platform for modern app development.
https://idx.google.com/
@webthreeth
OpenAI's Upcoming AI Models: Insights from ChatGPT Web App Update
OpenAI is preparing to introduce three new AI models—o4-mini, o4-mini-high, and o3 (full)—as indicated by a recent update in the ChatGPT web app. This update includes JavaScript code with case statements referencing these models, suggesting their integration into ChatGPT's backend, while o3 appears to be currently limited to Deep Research applications.
This is consistent with OpenAI's ongoing research on generative models and its stated emphasis on aligning AI with human values, although OpenAI has not confirmed a timeline or any specific details about the models.
@webthreeth
BrowseComp: a benchmark created by OpenAI for browsing agents
BrowseComp is a meticulously designed benchmark that tests an AI’s ability to locate hard-to-find information through strategic web browsing and reasoning. It centers on short, fact-based questions with single, indisputable answers that are crafted to be challenging both for AI systems and human solvers.
By using "inverted" questions—where the correct answer is difficult to uncover yet straightforward to verify—BrowseComp forces models to pursue creative and persistent search strategies, rather than relying on brute-force methods. Its development involved rigorous checks, including ensuring that prevalent models like GPT‑4o struggled without advanced reasoning and browsing techniques.
Notably, while standard models achieved near-zero accuracy, an agent specifically trained for deep research managed to solve over half of the problems, highlighting the importance of combining robust reasoning with effective tool use.
@webthreeth
Unveiling Ethiopia's Past: ሊነጋ ነው Captivates Over 200K Viewers in Just 3 Days
The AI-generated short film "ሊነጋ ነው", released by EHUD AI Studio on April 7, 2025, has achieved remarkable success, garnering over 200,000 views and 2.3 million impressions within just three days on the EHUD AI Studio YouTube channel (https://www.youtube.com/@ehudai). This political psychological thriller, which explores the legacies of Ethiopia’s past leaders, stood at 203.3K views three days after its re-release on April 7, 2025.
Congrats 👏
@webthreeth
Supercharge Your Career: Unleashing NotebookLM for Interview Prep and Beyond
NotebookLM, a powerful AI tool from Google, powered by Gemini 2.5 Pro, is revolutionizing career development by transforming how job seekers prepare for interviews and boost their professional profiles.
By uploading your resume and target job descriptions, NotebookLM’s innovative audio overview feature generates a podcast-style discussion where virtual "hosts" analyze your skills, highlight how they align with the role, and offer tailored advice for interview questions—streamlining your preparation process.
Beyond interviews, this tool leverages its document summarization capabilities to help you refine your resume for applicant tracking systems (ATS), which 83% of employers now use in 2025, ensuring your application stands out. Whether you’re crafting study guides from multiple sources or seeking actionable insights to elevate your career narrative, NotebookLM is your personal hiring coach.
@webthreeth
Should I post more tips like this on how you can use AI not only for your daily work, but also for the strategic growth of both yourself and your business?
I genuinely recommend this tool to anyone working in a knowledge-intensive sector. It is probably one of the best tools shipped since the GPT-O Series models. A very impressive one from Google.
Google isn't marketing its products the way ChatGPT is marketed. Trust me, Gemini is now the market leader among LLMs purely on performance (at least in my field of work).
Perplexity’s Telegram Bot: A New Way to Search
Perplexity, an AI-powered search engine, has launched a bot on Telegram, bringing its conversational search capabilities to the popular messaging platform. The bot, accessible by searching for "@askplexbot", lets users ask questions directly within Telegram and receive quick, accurate answers backed by Perplexity’s advanced language models.
Whether used in private chats or group conversations, the bot offers a seamless way to explore topics, research on the go, or spark discussions with friends. This move makes Perplexity’s knowledge-discovery tools more accessible, blending the convenience of Telegram with the power of AI-driven insights.
@webthreeth
S&P 500 Steals Bitcoin’s Thunder: Volatility Surges Amid Tariff Turmoil
The S&P 500’s 10-day historical volatility spiking to 76.8%, outstripping Bitcoin’s 72.9%, marks a rare moment where traditional markets have eclipsed the crypto world’s notorious price swings, as reported by Bloomberg. ETF analysts emphasize that typical volatility for the S&P 500 lingers between 10–15%, making this jump a significant deviation from the norm.
The catalyst appears to be fresh trade tariffs imposed by President Trump’s administration, particularly a hefty 145% duty on Chinese imports, which has sent shockwaves through global markets. This policy, coupled with China’s retaliatory 125% tariffs on U.S. goods, has fueled uncertainty, driving wild fluctuations in stock indices.
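For context on what a "10-day historical volatility" figure means, here is a minimal sketch of one common way to compute it. It assumes daily closing prices, log returns, a sample standard deviation, and annualization with 252 trading days. Bloomberg's exact convention is not stated in this post, and the prices below are made-up placeholders, not real S&P 500 data.

```python
import math
import statistics

TRADING_DAYS_PER_YEAR = 252  # assumption: a common annualization convention


def historical_volatility(closes: list[float], window: int = 10) -> float:
    """Annualized volatility of the last `window` daily log returns."""
    if len(closes) < window + 1:
        raise ValueError("need at least window + 1 closing prices")
    recent = closes[-(window + 1):]
    # Daily log returns between consecutive closes.
    log_returns = [math.log(b / a) for a, b in zip(recent, recent[1:])]
    # Sample standard deviation of daily returns, scaled to a yearly figure.
    return statistics.stdev(log_returns) * math.sqrt(TRADING_DAYS_PER_YEAR)


# Hypothetical closing prices (11 closes give 10 daily returns).
hypothetical_closes = [
    5600.0, 5400.0, 5075.0, 5060.0, 4980.0, 5455.0,
    5270.0, 5365.0, 5405.0, 5395.0, 5275.0,
]
print(f"10-day historical volatility: {historical_volatility(hypothetical_closes):.1%}")
```

Because the daily figure is scaled by the square root of 252, a handful of large daily moves inside a 10-day window is enough to push the annualized number far above the usual 10–15% range.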
@webthreeth
Meta's Llama 4 Maverick Misstep: Ranking Plunge Sparks AI Benchmark Controversy
Meta faced backlash after it was revealed they submitted an experimental version of their Llama 4 Maverick model, optimized for conversational flair, to the LM Arena benchmark, securing a high ranking of #2.
This version, dubbed "Llama-4-Maverick-03-26-Experimental," differed significantly from the publicly available model, which critics argued misled developers about its real-world performance. After scrutiny, LM Arena re-evaluated the unmodified Maverick model, and its ranking plummeted to 32nd, exposing discrepancies in Meta's approach.
The incident sparked debates about transparency in AI benchmarking, with some accusing Meta of gaming the system to inflate their model's standing, though Meta maintained they were merely experimenting with custom variants.
@webthreeth
OpenAI's A-SWE: The Future of Autonomous Software Engineering In Progress
On March 5, 2025, Sarah Friar, OpenAI's CFO, announced the development of 'A-SWE,' an autonomous software engineering agent capable of independently building apps and handling quality assurance, bug testing, and documentation, tasks often disliked by human engineers. Because it functions as a standalone engineer rather than an assistant like GitHub's Copilot, it could disrupt existing collaborative AI tools like Devin. The announcement, made public on April 12, 2025, aligns with OpenAI's broader AI research goals, which include deep research and operator agents, though no AGI timeline was specified.
@webthreeth
Canva Unveils AI-Powered Code Generation Feature with Canva Code
Canva recently launched Canva Code, an AI-powered feature that lets users create interactive digital elements such as pricing calculators, countdown timers, and educational games without coding skills. Users describe what they want through text prompts or a conversational AI interface with voice command support, and the code is generated instantly. The results integrate into Canva designs such as websites, presentations, and social posts, with a preview panel for quick refinements and secure AI settings with customizable safeguards for organizational use. Applications range from personal projects like dynamic itinerary builders to business tools like interactive product guides. Canva Code is free for Canva Free, Pro, and Teams users, with a gradual rollout planned over the coming months.
@webthreeth