DPS Build – Telegram

DPS Build

720 subscribers

120 photos

3 videos

10 files

462 links

AI, coding, data science and startups

Download Telegram

About

Blog

Apps

Platform

720 subscribers

模块化笔记本电脑 Framework 发布新的 Framework 16 系列，除了延续之前的设计以外，还增加了显卡模块和几乎可以无限拓展的信号输入模块：

it also brings in two new module ecosystems: a fully reconfigurable input deck and modular, upgradeable graphics.

https://frame.work/fr/fr/blog/introducing-the-framework-laptop-16

Introducing the Framework Laptop 16

We’re excited to share our next major product category, a high-performance 16” notebook, the Framework Laptop 16.

344 views15:29

这个插件把我写的都写完了，以后直接调用这个插件就能结合自己的知识库来使用 ChatGPT API https://github.com/openai/chatgpt-retrieval-plugin

使用 pinecone 这个向量型数据库存储 embedding 数据，作为 ChatGPT API 的自定义知识库。

https://github.com/pinecone-io/examples/blob/master/generation/chatgpt/plugins/langchain-docs-plugin.ipynb

343 views17:12

摩尔定律之父去世

https://www.intel.com/content/www/us/en/newsroom/news/gordon-moore-obituary.html

Gordon Moore, Intel Co-Founder, Dies at 94

Moore, who set the course for the future of the semiconductor industry, devoted his later years to philanthropy.

324 views02:02

斯坦福开源了一个自行搭建 LLaMA 的架构指南 Alpaca，有人算了算了，大概花 $600 就能训练出一个表现类似 GPT3.5 的大语言模型。 https://crfm.stanford.edu/alpaca/ https://twitter.com/yanndubs/status/1635339256532205568

Databricks 开放了基于 Alpaca 的 Dolly，单个集群 (single-node cluster with node type having 8 A100 GPUs) 三小时可以完成训练

https://github.com/databrickslabs/dolly

GitHub - databrickslabs/dolly: Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform - databrickslabs/dolly

334 views02:05

但是对公司而言，让自己跑得更快远比让竞争对手跑得慢一点更重要。所以大部分情况下保密措施应该是以不伤害效率为前提的。对用户数据的保密除外，但是保护用户数据的措施通常不会影响到大部分人的工作效率。

理论上这些都可以被滥用或者误用，但是滥用往往缺乏动机，误用可以从设计上避免。一个大原则是风险可控或可逆的事情默认是没有流程的，只有实际发生了问题，证明必要时才会靠引入流程来解决。有了流程就需要有人审批有人执行，如果它解决的问题不常发生、有其他方案或者产生的危害不如流程带来的成本，那么设立流程就是不理性的。

有的比较卷的团队为了控制代码的复杂度，还把自己 code base 的行数上限放到了测试里。如果有人增加了 10 行代码，就需要重构其他地方的代码来省出 10 行，或者提供一个好的理由来提高上限。

除了日常的 code review 外，每个新员工会需要学习公司的代码规范，并通过工作中会用到的每个语言的可读性 review。方式是准备一个百行以上的 changelist，提交给一个有资格做 readability review 的工程师，通过之后才有权限提交用在生产环境的代码。

所以 Google 把版本管理完全倒了过来，每个项目/组件都只要维护一个最新版，所有的改动最重要的原则是不能破坏任何测试。所以如果有人在一个共享组件里做了向前不兼容的改动，就会需要在同一个 changelist 里把整个代码库里所有调用到这个接口的地方改过来。

GWS 每周会做一次 binary push，也就是二进制文件的发布。流程是每周一早上负责发布的工程师从当前的代码做一个发布分支编译出一个二进制文件，交给 QA 开始测试，发现 bug 就把修复 cherry pick 到发布分支。

https://1byte.io/google-large-scale-dev/

👍1

346 views02:43

一些朋友们做的技术频道，大家可以按需订阅：

https://t.me/sannsaku

https://t.me/amneumarkt

https://t.me/foreseaz_collection

https://t.me/LinghaoCh

观察花鸟鱼虫。

334 views07:33

微软发布了一整条基于 LLM 的开发链：

Semantic Kernel (SK) is a lightweight SDK enabling integration of AI Large Language Models (LLMs) with conventional programming languages. The SK extensible programming model combines natural language semantic functions, traditional code native functions, and embeddings-based memory unlocking new potential and adding value to applications with AI.

https://github.com/microsoft/semantic-kernel

GitHub - microsoft/semantic-kernel: Integrate cutting-edge LLM technology quickly and easily into your apps

Integrate cutting-edge LLM technology quickly and easily into your apps - microsoft/semantic-kernel

333 views03:25

Apple 官方的 neural engine 推理加速 SDK — 直接让 PyTorch 的推理速度提速十倍 Use ane_transformers as a reference PyTorch implementation if you are considering deploying your Transformer models on Apple devices with an A14 or newer and M1 or newer chip to achieve up to 10 times…

关于 Apple neural engine 的细节

https://github.com/hollance/neural-engine

GitHub - hollance/neural-engine: Everything we actually know about the Apple Neural Engine (ANE)

Everything we actually know about the Apple Neural Engine (ANE) - hollance/neural-engine

322 views04:25

一个极其详尽的 ML 工程资源

https://madewithml.com/

Home - Made With ML by Anyscale

Learn how to responsibly design, develop, deploy and iterate on production ML applications.

363 views05:35

利用 ChatGPT API 来总结 Sam Altman 的访谈

https://reccap.it/recaps/sam-altman-openai-ceo-on-gpt-4-chatgpt-and-the-future-of-ai-lex-fridman-podcast--38c54630577d44a0b5423d623dccc254

Sam Altman: OpenAI CEO on GPT-4, ChatGPT, and the Future of AI | Lex Fridman Podcast #367

Reccap enables you learning from Youtube videos at your own pace. Given a video, Reccap can extract the slides, the high-level summary and all key poinst. Reccap makes your learning and resarching 10x faster and effectiv Reccap makes your learning and resarching…

357 views06:40

ChatGPT 救了一只狗：

In the meantime, it occurred to me that medical diagnostics seemed like the sort of thing GPT4 could potentially be really good at, so I described the situation in great detail.

I gave it the actual transcribed blood test results from multiple days, and asked for a diagnosis

When we reached the second vet, I asked if it's possible it might be IMHA.

The vet agreed that it's a possible diagnosis. They drew blood, where they noticed visible agglutination.

After numerous other tests, the diagnosis was confirmed. GPT4 was right.

https://twitter.com/peakcooper/status/1639716822680236032

361 views08:46

Sparks of Artificial General Intelligence: Early experiments with GPT-4

https://arxiv.org/pdf/2303.12712v1.pdf

347 views10:40

API 届的 IFTTT

https://pipedream.com

Pipedream | Connect APIs, AI, databases and more

Pipedream is the fastest way to build powerful applications that connect all the services in your stack, with code-level control when you need it and no code when you don't.

366 views04:05

Here we show a proof of concept using OpenAI’s chatgpt-retrieval-plugin with Meta’s LLaMA language model.
This is more than just a guide. It is a call-to-action to build an open protocol for foundation model plugins allowing us to share plugins across LLMs, and govern their interactions.

https://medium.com/m/global-identity-2?redirectUrl=https%3A%2F%2Fblog.lastmileai.dev%2Fusing-openais-retrieval-plugin-with-llama-d2e0b6732f14

GitHub - openai/chatgpt-retrieval-plugin: The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking…

The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language. - openai/chatgpt-retrieval-plugin

384 views07:10

一些中外的独立开发者，包括他们的作品和独立开发经历：

https://reorx.com/blog/indie-makers-im-following/

我关注的独立开发者们 | Reorx’s Forge

列举一些我所知的独立开发者们，让更多的人看到他们的经历和作品，获得启发。

👍1

384 views10:10

使用 postgres 的 pgvetor 插件来存储 embedding 数据，作为 LLM 的输入。

https://supabase.com/blog/openai-embeddings-postgres-vector

Storing OpenAI embeddings in Postgres with pgvector

An example of how to build an AI-powered search engine using OpenAI's embeddings and PostgreSQL.

👍1

404 views16:40

围绕着 ChatGPT API 写了两周代码，记录一些想法： 1. ChatGPT API 自 gpt-turbo-3.5 发布以来，做了大大的简化。只需要在请求里写两个参数：model 和 messages，其他参数都被隐藏了。 2. 需要调整输出的话，只需要在 messages 写 prompts，通过自然语言就能控制模型的输出。大大降低了开发难度，又给输出添加了无限可能 3. 不仅 API 的交互得以大大简化，围绕着 ChatGPT API 开发的话，也可以大大简化整个 NLP 项目的开发。它…

最近又被拉着写 prompt。大前提是，隔壁组的数据出了问题，他们期望用 ChatGPT API 来批量清洗数据，他们在 ChatGPT UI 上做了测试，然后丢到我们手上。

我了解了需求之后，没看他们的 prompt，直接凭经验开始做各种尝试，最后试出了一个还不错的 prompt。于是封装成函数之后，交给后端的同事集成到流水线上。

我们跑了一小批数据，结果还不错，但是对照着隔壁组的要求，似乎不完全一致。于是后端同事直接把隔壁组的 prompt 搬进流水线里，又测试了一遍，结果和 ChatGPT UI 上的结果完全不一样。我们猜测是因为 ChatGPT UI 上还有一些后处理的逻辑。又测试了一些 prompts 之后，最后还是我提供的 prompts 效果最好。

Takeaways:

1. ChatGPT UI 和 ChatGPT API 的 prompts 不完全一致，前者有后处理逻辑，后者应该是模型直接的输出结果；
2. prompts 的确需要不断地尝试，所以 prompt engineering 可能真的是一门学问。

👍9

441 viewsedited 01:49

最近又被拉着写 prompt。大前提是，隔壁组的数据出了问题，他们期望用 ChatGPT API 来批量清洗数据，他们在 ChatGPT UI 上做了测试，然后丢到我们手上。我了解了需求之后，没看他们的 prompt，直接凭经验开始做各种尝试，最后试出了一个还不错的 prompt。于是封装成函数之后，交给后端的同事集成到流水线上。我们跑了一小批数据，结果还不错，但是对照着隔壁组的要求，似乎不完全一致。于是后端同事直接把隔壁组的 prompt 搬进流水线里，又测试了一遍，结果和 ChatGPT UI…

论如何写 prompts 的重要性：

https://twitter.com/cydiar404/status/1640399013345214479

👎1

396 views04:50

关于 Apple neural engine 的细节 https://github.com/hollance/neural-engine

读了一遍 repo 里所有的文档，有些很有意思的点：

1. Apple Neural Engine (ANE) 是和 CPU/GPU 不一样的计算核心，专门用来处理神经网络的计算。理想的情况下，最好把所有神经网络的计算任务交给 ANE，而不是 CPU 或者 GPU；

2. 但是呢，我们无法强制指定任务给 ANE，只能告诉任务去尝试使用 ANE。大体的原因是因为，不是所有的 layer 都可以在 ANE 里计算。Core ML 会把合适的 layer 放在 ANE 里计算，把不合适的放在 CPU 或者 GPU 里计算；

3. 也就是说 Core ML 会自行判断具体的 layer 调用哪个计算核心，CPU / GPU / ANE，所以开发者只能不断尝试，将不支持的 layer 替换成支持 ANE 的 layer。

388 views06:43

湾区日报恢复更新了。，只剩这个网站了，iOS app 什么的都没有了。

https://www.wanqu.co/

关注创业，互联网，技术。就像是你远方的老朋友每天推荐几篇优质英文文章，一起每天进步一点点。在 AI
横行的年代，更需要人类的手动推荐，手动点评；人类好的品味不会被 AI 淘汰的。 by W @ San Francisco

396 views08:15

斯坦福开源了一个机械手臂方案，可以做非常精确的操作，比如从钱包里取出证件，用乒乓球拍颠球等等

https://twitter.com/tonyzzhao/status/1640393026341322754

391 views12:10