Continuous Learning_Startup & Investment
We journey together through the captivating realms of entrepreneurship, investment, life, and technology. This is my chronicle of exploration, where I capture and share the lessons that shape our world. Join us and let's never stop learning!
https://www.linkedin.com/posts/andrewchen_the-real-story-of-how-facebook-almost-acquired-activity-7076984976591753218-J8l7?utm_source=share&utm_medium=member_desktop

Excellent essay from Noam Bardin:
“The real story of how Facebook almost acquired Waze, but we ended up with Google”

https://lnkd.in/g76G8u-G

Lots of great lessons, summarized by ChatGPT 😎

1. The co-founders of Waze established a valuation framework before entertaining acquisition offers. They decided to reject offers less than $750M and accept offers above $1B, but would consider proposals in the $750M-$1B range depending on the acquirer.
2. Waze approached potential strategic partners to help accelerate user acquisition, including Microsoft, Amazon, and Facebook, leading to potential acquisition discussions.
3. The founders established relationships with potential acquirers' product teams well in advance, providing a critical foundation for the acquisition process.
4. Initial negotiations with Google ended in a $450M offer which was rejected based on the pre-established valuation framework. This prompted backlash from the board but the founders remained firm.
5. Facebook offered to acquire Waze for $1B swiftly after being informed of a competing offer. Despite initial enthusiasm, the due diligence process revealed gaps and tension between the Waze and Facebook teams, leading to the deal falling through.
6. Following the leak of the Facebook deal, Google presented an unsolicited term sheet of $1.15B. Despite accusations of information leakage, Waze's fiduciary duty led them to consider the offer, leading to a fallout with Facebook.
7. With no counteroffer from Facebook, Waze accepted Google's offer and closed the transaction in eight days.
8. In hindsight, despite the potential financial benefits of a Facebook deal, the Waze co-founder believed Google was the right choice due to cultural fit, their commitment to Waze's independence, and Facebook's subsequent controversies.
9. The lessons learned included: building relationships with potential acquirers early, having a clear valuation framework, recognizing partnership discussions as catalysts for acquisition, understanding the personal nature of acquisitions, and being aware of the divergence in interests between founders and investors during an acquisition.
10. The final key lesson was understanding the power of negotiation, having a red line, and being willing to walk away to secure a better deal.
I like his point of view and learned a lot from the leader of one of the largest travel platforms.

https://youtu.be/aZ-BjJZxNoA

In this section, Airbnb CEO Brian Chesky discusses his company's approach to AI and how they plan to use it for personalization. Chesky explains that there are several large language models, or base models, which he compares to highways. On top of these base models, companies can build more personalized, tuned models based on their own customer data. Chesky's vision for Airbnb's use of AI involves building robust customer profiles to personalize travel recommendations and becoming the ultimate AI concierge for travelers. He explains that this will require designing unique AI interfaces beyond just text inputs and combining art and science to understand human psychology. In the short term, Chesky plans to increase productivity by making his engineers 30% more efficient.

Airbnb CEO Brian Chesky discusses the importance of using productivity tools, such as Copilot and ChatGPT, to maximize productivity and efficiency. Chesky also delves into the need for unique interfaces that are custom-designed to meet the specific needs of each task. He adds that AI will be critical for personalizing the customer experience and improving the matching process in the future, allowing for authentic and unique experiences for each individual customer. However, Chesky acknowledges that there is also a risk of machines becoming so advanced that they are difficult to distinguish from humans, and that identity authentication will be a critical factor going forward.

Airbnb CEO Brian Chesky discusses the importance of brand authenticity and of building a robust personal profile through verifying customers' identities. He also expresses excitement about the possibilities of AI matching users with delightful experiences, even things they didn't know would make them happy. Chesky believes that AI will disrupt traditional business models, but also create millions of new startups as it becomes more accessible. He argues that trying to ban AI is like trying to ban electricity, and encourages people to view AI as a tool to be embraced rather than a threat to be feared.

Chesky also highlights the benefits of AI as a creative tool. He describes how AI helps him uncover the first principles behind interesting ideas, but notes that we cannot yet know what jobs AI will create, because they have not been created yet. Finally, they touch on how marketplaces like Airbnb, Etsy, and Uber let people build new careers for themselves, and how powerful this will be for society.
๐Ÿ‘1
Continuous Learning_Startup & Investment
Hello! AGI Town in Seoul is hosting a meetup this coming Friday at 6:30 PM near Yeoksam Station on the topic of "Applying AI in the game/entertainment industry." 🏄‍♂️ If you have been wondering how AI is applied in the game industry, or you are an AI researcher/developer interested in applying it to games, come join the discussion at this meetup! We plan to cover the topics below. 🌟 HYBE IM's …
Is there a team that could sponsor this Friday's offline meetup (covering sandwiches/coffee)?

If any company or VC can offer a small sponsorship, please send a DM. Sponsor logos will appear at the bottom of the slides used at the event, and the sponsorship will be acknowledged around the start of the event. If you're interested, message @MatthewMinseokKim!
The LOVO (https://lovo.ai/) team will be sponsoring this meetup :❤️
Sponsors are always welcome ❤️


The LOVO team is also actively hiring data scientists and MLOps engineers, so check their careers page for details 🤗
https://orbisailovo.notion.site/LOVO-db490c88a5384f778e913c614b7f6530
๐Ÿ‘1
An article from The Information: OpenAI is preparing a marketplace where fine-tuned models can be bought and sold.
This could turn out to be far more powerful than plugins.

https://www.theinformation.com/articles/openai-considers-creating-an-app-store-for-ai-software?rc=jfxtml
Are we at the beginning of a new era of small models? Here is our newest LLM trained fully in my team at Microsoft Research:

*phi-1 achieves 51% on HumanEval w. only 1.3B parameters & 7B tokens training dataset*

Any other >50% HumanEval model is >1000x bigger (e.g., WizardCoder from last week is 10x in model size and 100x in dataset size).

How did we achieve this? It can be summarized in 5 words:

*Textbooks Are All You Need*

https://lnkd.in/gFUJaafT
Can small, custom LLMs do the job? Another controversial, amazing paper, this time from MSFT Research. What's the secret? Textbook-quality data.

They describe phi-1, a new large language model specifically for Python coding that has only 1.3B parameters, is trained on only 7B tokens, and claims to achieve nearly SOTA accuracy on the HumanEval benchmark. They also claim that it "displays surprising emergent properties" after it is finetuned:

"We hypothesize that such high-quality data dramatically improves the learning efficiency of language models for code as they provide clear, self-contained, instructive, and balanced examples of coding concepts and skills"

Notice that while phi-1 does seem to perform well in evaluations, it is still a research model. It has trouble with variations in its prompts, and does not deal well with longer prompts. It's not going to compete with StarCoder or ChatGPT, so don't expect to make a new Flask app with it.

I could not find the model so I can't evaluate it myself; if anyone knows how or does, please post it in the comments.

It seems that, like the Falcon models, having great data lets you do great things.

"Textbooks Are All You Need:" https://lnkd.in/g8YdiWMP
This time, Ralph Clark and I planned a get-together with our better halves, and got a chance to reminisce about old times and catch up on family and friends. Lots of wine too.

My relationship with him began in 1996, when he ran finance at the first company Altos invested in... It continued as he took the CFO/CEO roles at two more companies we backed. He later moved to Seattle as CEO of a public company (one we did not invest in), so we hadn't seen each other in a long time, but we ran into each other on the street last week and ended up having dinner together.

We talked about a lot of things... What resonated most with me:

It was a blessing not to have had a half-baked success at a young age. People who mistake their success for personal brilliance and go through life expecting to repeat it seem pitiable. Those who consider themselves lucky and keep working diligently are doing well.

(For reference, the first company he did with us went public and reached a trillion-won valuation, then collapsed when the bubble burst... the second was sold at a price too low even to return our investment... the third was sold at a good price. Even when a company fails, even at a loss, if you earn each other's trust, the relationship carries on like this.)
โค1
A small ball sent skyward by open source: SAM.

Meta seems to be on a roll with open source these days; beyond LLaMA, SAM is also spreading explosively.

Professor Jin Sung Kim introduced the paper "Segment Anything Model (SAM) for Radiation Oncology," so I took the opportunity to look around and was genuinely surprised. Since Meta released SAM on April 5, the GitHub repo has already passed 35K stars, and the volume of arXiv papers is staggering.

Medical image segmentation alone accounts for a surprising share of it, survey papers on SAM keep pouring out, and there are plenty of GitHub repos curating lists of them. It seems fair to say an ecosystem has already solidified.

Just 20 minutes of searching turned up this many links. Such is the power of open source..... #SAM

Awesome Segment Anything
https://github.com/Hedlen/awesome-segment-anything

Segment Anything Model (SAM) for Medical Image Segmentation.
https://github.com/YichiZhang98/SAM4MIS

Segment Anything Model (SAM) for Radiation Oncology
https://arxiv.org/abs/2306.11730

Segment Anything
https://arxiv.org/abs/2304.02643

Segment Anything Model for Medical Image Analysis: an Experimental Study
https://arxiv.org/abs/2304.10517

Segment Anything in Medical Images
https://arxiv.org/abs/2304.12306

SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More
https://arxiv.org/abs/2304.09148

SAM.MD: Zero-shot medical image segmentation capabilities of the Segment Anything Model
https://arxiv.org/abs/2304.05396

When SAM Meets Medical Images: An Investigation of Segment Anything Model (SAM) on Multi-phase Liver Tumor Segmentation
https://arxiv.org/abs/2304.08506

Segment Anything Model for Medical Images?
https://arxiv.org/abs/2304.14660

SAMM (Segment Any Medical Model): A 3D Slicer Integration to SAM
https://arxiv.org/abs/2304.05622

SAM on Medical Images: A Comprehensive Study on Three Prompt Modes
https://arxiv.org/abs/2305.00035

Computer-Vision Benchmark Segment-Anything Model (SAM) in Medical Images: Accuracy in 12 Datasets
https://arxiv.org/abs/2304.09324

Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation
https://arxiv.org/abs/2304.12620

Zero-shot performance of the Segment Anything Model (SAM) in 2D medical imaging: A comprehensive evaluation and practical guidelines
https://arxiv.org/abs/2305.00109

Personalize Segment Anything Model with One Shot
https://arxiv.org/abs/2305.03048

How Segment Anything Model (SAM) Boost Medical Image Segmentation?
https://arxiv.org/abs/2305.03678

Customized Segment Anything Model for Medical Image Segmentation
https://arxiv.org/abs/2304.13785

Segment Anything Model (SAM) Enhanced Pseudo Labels for Weakly Supervised Semantic Segmentation
https://arxiv.org/abs/2305.05803

Segment Anything Model (SAM) Meets Glass: Mirror and Transparent Objects Cannot Be Easily Detected
https://arxiv.org/abs/2305.00278

Segment Anything in High Quality
https://arxiv.org/abs/2306.01567

Segment Anything Model (SAM) for Digital Pathology: Assess Zero-shot Segmentation on Whole Slide Imaging
https://arxiv.org/abs/2304.04155

SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model
https://arxiv.org/abs/2306.02245

DeSAM: Decoupling Segment Anything Model for Generalizable Medical Image Segmentation
https://arxiv.org/abs/2306.00499

A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering
https://arxiv.org/abs/2306.06211

A Comprehensive Survey on Segment Anything Model for Vision and Beyond
https://arxiv.org/abs/2305.08196
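If you want to poke at SAM yourself, here is a minimal sketch against the official facebookresearch/segment-anything package. It assumes you have installed the repo and downloaded Meta's ViT-H checkpoint; the image path and point prompt below are placeholders:

# pip install git+https://github.com/facebookresearch/segment-anything.git
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load the ViT-H variant from a locally downloaded checkpoint file.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# SamPredictor expects an RGB image (OpenCV loads BGR, so convert).
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt with a single foreground point; SAM proposes candidate masks.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),  # (x, y) pixel coordinate
    point_labels=np.array([1]),           # 1 = foreground, 0 = background
    multimask_output=True,
)
print(masks.shape, scores)  # (3, H, W) boolean masks with quality scores
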
"Textbooks Are All You Need" is making rounds:
twitter.com/SebastienBubec…
reminding me of my earlier tweet :). TinyStories is also an inspiring read:
twitter.com/EldanRonen/sta…
We'll probably see a lot more creative "scaling down" work: prioritizing data quality and diversity over quantity, a lot more synthetic data generation, and small but highly capable expert models.
ํŒŒ๋ผ๋ฏธํ„ฐ ํšจ์œจ์  ํŒŒ์ธํŠœ๋‹ ๊ธฐ๋ฒ• LoRA(Low-Rank Adaptation of Large Language Models)๋Š” ๊ฐ„๋ช…ํ•œ ๊ตฌ์กฐ, ์šฐ์ˆ˜ํ•œ ํšจ๊ณผ, ์œ ์—ฐํ•œ ํ™•์žฅ์„ฑ์œผ๋กœ ์ธํ•ด ํฌ๊ฒŒ ์ฃผ๋ชฉ๋ฐ›์•˜๋‹ค. ์ด๋Š” ์œ ์ถœ๋œ ๊ตฌ๊ธ€์˜ ๋‚ด๋ถ€ ๋ฌธ๊ฑด 'We Have No Moat, and Neither Does OpenAI'์—์„œ๋„ ์–ธ๊ธ‰๋œ ์‚ฌ์‹ค์ด๋‹ค.

LoRA started with language models, but because it can be applied anywhere matrices (and, naturally, higher-dimensional tensors) are being trained, it soon spread to diffusion models for image generation as well. (Shimo Ryu was the first to graft it onto Stable Diffusion.)

Given how simple LoRA's structure is, I suspected variants would soon emerge, and sure enough, techniques like LoCon and LoHa are now in vogue in the image generation scene. LoCon simply extends LoRA to convolutional layers, whereas LoHa deserves a closer look.

LoHa (LoRA with Hadamard Product Representation) is actually an unofficial name, coined when Kohaku-Blueleaf brought it into the Stable Diffusion web UI ecosystem. The implementation is based on the 2021 POSTECH paper "FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning" (https://arxiv.org/abs/2108.06098).

The paper's core idea, in brief: instead of decomposing a weight update into the product of two low-rank matrices (the original LoRA approach), decompose it into the Hadamard product of two such low-rank matrix products and train that. What does this buy you? Where a matrix reconstructed by LoRA is capped at rank R, LoHa can achieve a rank of R squared or higher with the same parameter count. In other words, you get more expressive power for the same model size, at the cost of a little extra computation.
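A rough sketch of the difference in PyTorch (shapes only; the variable names are mine, and this is not the Kohaku-Blueleaf implementation). With the same parameter budget of 2r(d+k), a LoRA update is capped at rank 2r, while the Hadamard product of two rank-r factorizations can reach rank r squared:

import torch

d, k, r = 768, 768, 8   # adapting a (d x k) weight; r is the per-factor rank

# LoRA: delta_W = B @ A with rank 2r, costing 2r*(d + k) parameters.
B = torch.randn(d, 2 * r)
A = torch.randn(2 * r, k)
delta_w_lora = B @ A                    # rank <= 2r = 16

# LoHa: delta_W = (B1 @ A1) * (B2 @ A2), the elementwise (Hadamard)
# product of two rank-r factorizations. Also 2r*(d + k) parameters,
# but rank(X * Y) can reach rank(X) * rank(Y) = r^2 = 64.
B1, A1 = torch.randn(d, r), torch.randn(r, k)
B2, A2 = torch.randn(d, r), torch.randn(r, k)
delta_w_loha = (B1 @ A1) * (B2 @ A2)

print(torch.linalg.matrix_rank(delta_w_lora))  # tensor(16)
print(torch.linalg.matrix_rank(delta_w_loha))  # typically tensor(64)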

In the image generation scene, many say LoHa works better than LoRA for style training, but that has yet to be verified academically. The fun part is that this FedPara technique, now going by LoHa, was originally devised in the context of federated learning, with no connection to generative AI: it was designed to cut the communication volume between devices (the parameter count) while losing as little model expressiveness as possible.
Hello. A new post finally went up on the Finda tech blog, so I'm sharing it.
Companies typically structure their organizations around either functions or products;
Finda is the latter, a product-centric organization.
Everyone takes ownership of their product (service) and works continuously, as one team, to satisfy the customers of the service they own.
For faster decisions and continuous service improvement, each product team is cross-functional (PO, designer, BE/FE developers). We follow an "empirical process" of quickly and regularly retrospecting on and improving our own work.
Each product team flexibly adapts the Scrum framework common in agile organizations. One product team in particular has a very well-structured way of working; this post on "how we work" was written by Hyeong Rae Kim, a senior developer on that team. Recommended.
Thanks to CEO Seongho Choi, who always reminds us of the "Product Principle."

https://medium.com/finda-tech/%EC%9A%B0%EB%A6%AC%EC%9D%98-%EA%B0%9C%EB%B0%9C%EB%AC%B8%ED%99%94%EB%8A%94-%EC%9D%B4%EB%A0%87%EA%B2%8C-%EC%84%B1%EC%9E%A5%ED%95%A9%EB%8B%88%EB%8B%A4-8f57b06ca549
โค1
https://youtu.be/QWvrCuuFsjg?list=PLlrxD0HtieHjolPmqWVyk446uLMPWo4oP
Nvidia is also preparing infrastructure that lets companies refine their own data to train large models, or to fine-tune existing large models. Beyond big tech like AWS, Google, and MS, startup players like MosaicML must be eyeing this market as well; worth digging into further, haha
This is how we can serve personalized experiences by leveraging AI.

Data, models, and the engineering to combine the two will be important.