----
Navigating "corporate speak" isn't easy.
Here's a helpful guide I put together:
"Let me check with my team" = No
"Possibly" = No
"On my roadmap" = Not happening
"This will be done in Q4" = This will be done in Q2 next year
"Disagree and commit" = I hate you
"Per my last email" = Try reading, for once in your life
"Challenging landscape" = We're going out of business, quickly
"Digital transformation" = We're going out of business, slowly
"Let's circle back" = We'll never speak of this again
"Take it offline" = We'll never speak of this again
"30,000 foot view" = I don't know what I'm saying
"Low hanging fruit" = Easy promotion
"Open up the kimono" = HR violation
"We use AI" = We don't use AI
"We use machine learning" = We don't use machine learning
"All hands on deck" = Let's actually try for once, please
There was a time, for the first four to five years after the iPhone launched, when simply releasing an app was enough to draw a "wow" from users. Merely running on a smartphone, with internet on the go and a touchscreen to use, was wow in itself.
After that, services that changed the paradigm of legacy industries delivered the wow. YouTube and Netflix changed the TV/media industry, Facebook and Google reshaped the media/portal industry, and Amazon had a huge impact on shopping and the server ecosystem, Tesla on the auto industry, Uber on the taxi industry, and Airbnb on the hotel industry.
Through roughly twenty turbulent years since the early 2000s, most people in modern society seem to have grown quite used to mobile- and web-centric services. So even a newly released app stirs noticeably less excitement than it used to.
If anything, a growing number of people now seem to find even the Big Tech services they once used happily getting a bit stale. At the same time, more people say they are disappointed that services they have long relied on have failed to reinvent themselves, and that they keep using them only because they have no alternative. OpenAI's release of ChatGPT is injecting new energy, but it is also true that its impact has not been as overwhelming as the iPhone moment. (ChatGPT seems to be energizing startups more than consumers, which is a worry: if consumers do not respond, the impact will inevitably be limited.) It also feels like a period in which users react to most services more lukewarmly than they did in the past.
Still, users' desire for a better life remains strong. They want to grow more, live happier lives, and spend more enjoyable time. Quite a few hope to meet a service that lands powerfully in their lives and to live a more satisfying life with it.
So I wonder which services will draw a wow from large numbers of users in the future. I believe it will be services that obsess more over user growth and satisfaction, but the bar for eliciting a wow is higher than before, so I doubt we will see as many hypergrowth startups as we once did. Even so, I think the best way to beat this tide is to focus on the user and build a better service day by day, rather than react to the harsher environment. Granted, where a year of effort could once earn user satisfaction, today it takes three to five times as much.
That is why, in times like these, the Why seems to matter more. Why do we work? What are we building this service for? Having a clear, honest answer to those questions matters more than ever, because only when the whole team shares that answer can it quietly push through an era in which the effort and time needed to create a wow have grown.
What is the essential reason we keep running this service? Who does our service exist for, and why must it exist? Late at night, while wrestling with how to keep growing the service, I think through these fundamental questions once more.
"Machine learning costs, talent and chip shortages... any AI and machine learning company faces at least one of these challenges, and most face a few at a time," Pekhimenko told TechCrunch in an email interview. "The highest-end chips are commonly unavailable due to the large demand from enterprises and startups alike. This leads to companies sacrificing on the size of the model they can deploy or results in higher inference latencies for their deployed models."
With spending on AI-focused chips expected to hit $53 billion this year and more than double in the next four years, according to Gartner, Pekhimenko felt the time was right to launch software that could make models run more efficiently on existing hardware.
"Training AI and machine learning models is increasingly expensive," Pekhimenko said. "With CentML's optimization technology, we're able to reduce expenses by up to 80% without compromising speed or accuracy."
"For one of our customers, we optimized their Llama 2 model to run 3x faster by using Nvidia A10 GPU cards."
CentML isn't the first to take a software-based approach to model optimization. It has competitors in MosaicML, which Databricks acquired in June for $1.3 billion, and OctoML, which landed an $85 million cash infusion in November 2021 for its machine learning acceleration platform.
"The CentML platform can run any model," Pekhimenko said. "CentML produces optimized code for a variety of GPUs and reduces the memory needed to deploy models, and, as such, allows teams to deploy on smaller and cheaper GPUs."
A new solution to the high-end chip shortage.
Read in @WSJ about how Together worked with large former crypto mining farms to repurpose their best GPUs and acquire new GPUs to train AI models, all with a specialized training stack for a fraction of the price.
https://www.wsj.com/articles/crypto-miners-seek-a-new-life-in-ai-boom-after-an-implosion-in-mining-92a181fd
WSJ
Crypto Miners Seek a New Life in AI Boom After an Implosion in Mining
Demand for high-end chips allows cryptocurrency companies to repurpose idle equipment.
Points that stuck with me from a recent conversation with a friend who worked on deep learning at Google Brain and DeepMind (shared by Saejong Lee).
These were the parts I found most striking from the last session.
Model optimization such as quantization, and more efficient model architectures, may temporarily reduce training/inference compute (GPU) costs, but from a long-term perspective the absolute demand for compute will grow rapidly.
- DeepMind could pull off AlphaFold with three researchers not because they were smarter, but because the compute available per person was far greater than at other companies and labs.
- As compute becomes commoditized and cheaper, AlphaFold-grade innovation will unfold across every industry and domain.
- Moreover, compute limits have kept us tied to basic text data processing; going forward, demand for processing heavier, more complex data such as video will grow exponentially.
Hedge funds are already generating massive returns by aggressively using models specialized for stock price prediction.
- They estimate what training costs and what the ROI looks like given the durability of a model's competitive edge and its expected revenue, and train several models in parallel accordingly.
- Every other company sits on vast amounts of code and data, and the way it is stored and structured incurs enormous fixed costs every year; AI-assisted code updates and migrations often improve company-wide profitability.
The "hire more people" blitzscaling playbook may no longer hold in the AI era.
- Labor has traditionally been a company's biggest cost, but at Google compute cost has recently overtaken developer cost.
- OpenAI is a company of only about 400 people, yet its compute spend is four times its per-head developer cost.
- Going forward, small but solid all-round teams, spanning product through sales and operations, may be the competitive ones.
Generalist models large and small will coexist with specialized SOTA models.
- A "Model of Experts" pattern, in which the model matching a user's request is recommended automatically and the task is routed to it.
- Models will broadly sit at the intersection of two axes: one axis is intelligence (e.g., 80 IQ to 150 IQ), the other is the vertical (e.g., coding, law, healthcare).
- An orchestration layer that searches for, recommends, and connects models will emerge as a core technology.
Source: Saejong Lee's Facebook
https://www.facebook.com/Saejlee
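The "Model of Experts" and orchestration-layer ideas above can be sketched as a toy router: an orchestration layer that picks a model along the two axes (capability tier and vertical). This is only an illustration of the pattern; all model names, tiers, and costs below are invented.

```python
# Toy sketch of a "Model of Experts" router: pick a model along two axes
# (capability tier, vertical). All names, tiers, and costs are invented.
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelSpec:
    name: str
    iq_tier: int        # rough capability axis, e.g. 80..150
    vertical: str       # "general", "coding", "legal", "medical", ...
    cost_per_1k: float  # relative cost per 1K tokens

CATALOG = [
    ModelSpec("small-general", 90, "general", 0.1),
    ModelSpec("big-general", 140, "general", 1.0),
    ModelSpec("code-expert", 120, "coding", 0.4),
    ModelSpec("legal-expert", 120, "legal", 0.5),
]

def route(task_vertical: str, min_iq: int) -> ModelSpec:
    """Pick the cheapest model meeting the capability floor, preferring an
    exact vertical match and falling back to a general model otherwise."""
    candidates = [m for m in CATALOG
                  if m.iq_tier >= min_iq and m.vertical in (task_vertical, "general")]
    return min(candidates, key=lambda m: (m.vertical != task_vertical, m.cost_per_1k))

print(route("coding", 100).name)   # code-expert
print(route("medical", 100).name)  # big-general (no medical specialist in catalog)
```

A real orchestration layer would add model discovery and learned routing, but the core decision is this kind of constrained lookup.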
Continuous Learning_Startup & Investment
Today, Satoshi Spain's machines are handling AI workloads for startups, universities and individual developers in Europe.
"You can still make money from your mining rig," Satoshi Spain founder Alejandro Ibáñez de Pedro said. "It's mining 2.0."
Around 20% of the freed-up chips can be repurposed to train AI models, said Vipul Ved Prakash, the CEO of alternative cloud provider Together.
Together is using former crypto miners for GPUs and other hardware to build virtual server farms. It has leased thousands of GPUs to help power Together's cloud service for AI developers. In May, it raised $20 million in seed funding.
Demi Guo, the founder of an AI applications startup called Mellis AI, couldn't get the data processing pricing or capacity she needed at the leading cloud providers, so she is using repurposed hardware available through Together's cloud.
Saurabh Vij has built an AI business around chips formerly used for mining. His startup, Monster, has access to more than 30,000 chips, most of them once used for mining, that its customers can lease.
Here's some AI "math" from today's baseline:
- Chips improve by 2x next year
- Models improve by 2x next year
- Capex scales capacity by 10x next year
This means that whatever you see today will be ~40x better in a year and another ~20-40x better a year after that.
So we can expect an ~800-1,600x improvement by 2026.
What *won't* be possible is probably the better question.
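The compounding above is easy to check mechanically; note that the 2x/2x/10x factors are the post's assumptions, not measurements.

```python
# The post's assumptions (2x chips, 2x models, 10x capex), compounded.
chip_gain, model_gain, capex_gain = 2, 2, 10

year1 = chip_gain * model_gain * capex_gain  # 40x after one year
year2_range = (20, 40)                       # the post's looser second-year range
total_range = (year1 * year2_range[0], year1 * year2_range[1])
print(year1, total_range)  # 40 (800, 1600)
```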
ChatGPT has some pushback for you (see below). It's quite good!
The scenario you're describing is an incredibly optimistic one that assumes exponential growth across multiple dimensions of computing technology: chip performance, AI model capabilities, and capital expenditure for capacity scaling. In reality, while rapid advancements are certainly occurring, there are a number of physical, economic, and social constraints that could inhibit such a straightforward, exponential trajectory.
Here's a breakdown of challenges and limitations in each of the areas you've mentioned:
1. Chip Performance: The idea that chip performance would continue to double every year is a significant acceleration of Moore's Law, which historically observed that the number of transistors on a microchip doubles about every two years. However, we're reaching physical limitations of silicon technology, with features approaching the scale of a few nanometers. Quantum tunneling, heat dissipation, and energy efficiency are all significant challenges. While new materials and paradigms (like quantum computing or 3D stacking) are in development, they also come with their own sets of challenges that might not allow for a simple 2x improvement year over year.
2. Model Improvements: AI models are indeed improving rapidly, but there's more to consider than raw performance. Larger models require more data, more power, and better algorithms to be effective. Improvements in algorithms could potentially hit a point of diminishing returns, and there are concerns about the environmental impact of training extremely large models. Additionally, large models can become increasingly difficult to manage, debug, and ensure they do not perpetuate biases or make erroneous decisions.
3. Capital Expenditure: A 10x scale in capital expenditure assumes not only that companies have the economic capacity and willingness to invest at such levels but also that the supply chains and raw materials necessary for such a scale-up are available. This also presumes that there will be sufficient market demand to justify this level of investment.
Now, let's address your main question: what won't be possible, even with such improvements? Here are some potential limitations:
- Physical Laws: Computing is still bound by the laws of physics. Even if computers become vastly more powerful, there will still be things that are simply impossible, like violating the speed of light or the second law of thermodynamics.
- Complexity and Emergent Phenomena: Many systems, like weather, human brains, or economies, are incredibly complex. Even with vastly improved models, the sheer number of variables and interactions can make prediction and understanding difficult.
- Social and Ethical Constraints: Just because something can be done technically does not mean it should be. Ethical considerations regarding surveillance, privacy, and the use of AI in decision-making will likely place limits on how technology is applied.
- Economic Considerations: Not every advancement will be economically feasible or provide a return on investment. Additionally, there could be resistance from industries that are disrupted by these advancements, potentially slowing adoption.
- Human Adaptability: The rate at which society can adapt to new technologies is limited. There are educational, cultural, and institutional barriers that often delay the implementation of even the most promising technologies.
In summary, while your scenario posits a world in which the capabilities of computing technology grow at an unprecedented rate, it's important to consider the multifaceted challenges that accompany such growth. The future is likely to be one of incredible technological advancement, but the path there will probably be more uneven and complex than the simple multiplication of current capabilities.
https://www.theinformation.com/articles/metas-free-ai-isnt-cheap-to-use-companies-say?rc=3jxaoo (paywalled article)
An article arguing that, far more often than not, open-source models are no cheaper than commercial ones. Looking at the chart referenced in the piece, even running a 7B LLaMA model costs more than GPT-3.5 Turbo.
If you have been following the recent LLM compression/optimization trend, this may be unsurprising: the more traffic you aggregate, the more ways there are to push cost down (like a wholesale business). The more customized an open-source model you run, the higher your server costs get, whereas serving tuned variants on top of a shared backbone model (for example, batching LoRA weights of the same shape together and processing them in one pass) is bound to be cheaper.
Since LLMs are inevitably tied ever more closely to cloud services, this trend was bound to emerge; having NAVER Cloud lead the AI effort was a choice well suited to it.
NAVER Cloud has put in place a range of optimization/compression techniques so as not to lose the cost race against OpenAI, Google, or Microsoft, and over the mid to long term it is preparing AI semiconductors with service competitiveness beyond these big tech players.
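The batching point can be made concrete with a toy utilization model: a provider that aggregates huge traffic keeps its GPUs near-saturated, while a lone tenant hosting a customized open-source model pays for idle time too. All prices and throughput figures below are hypothetical placeholders, not real benchmarks.

```python
# Toy utilization model: cost per 1K tokens on a dedicated GPU depends heavily
# on how busy you can keep it. All numbers are hypothetical placeholders.

def cost_per_1k_tokens(gpu_hourly_usd: float, tokens_per_sec: float,
                       utilization: float) -> float:
    """Effective cost per 1K tokens on a dedicated GPU; low utilization
    means you also pay for the hours the GPU sits idle."""
    tokens_per_hour = tokens_per_sec * 3600 * utilization
    return gpu_hourly_usd / tokens_per_hour * 1000

batched_provider = cost_per_1k_tokens(2.0, 1000, 0.90)  # heavily batched traffic
lone_tenant = cost_per_1k_tokens(2.0, 1000, 0.05)       # mostly idle GPU
print(f"{batched_provider:.4f} vs {lone_tenant:.4f} USD per 1K tokens")
```

With identical hardware, the lone tenant here pays roughly 18x more per token, which is the wholesale effect the post describes.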
The Information
Metaโs Free AI Isnโt Cheap to Use, Companies Say
Some companies that pay for OpenAI's artificial intelligence have been looking to cut costs with free, open-source alternatives. But these AI customers are realizing that oftentimes open-source tech can actually be more expensive than buying from OpenAI. ...
https://gpus.llm-utils.org/
A pretty thorough comparison of the various services, heh.
Which GPU cloud should I use?
If you need a huge number of A100s/H100s - talk to Oracle, FluidStack, Lambda Labs, maybe a few others. Capacity is very low though for large quantities, especially of H100s, based on a couple of cloud founders/execs I've talked with.
If you need a couple A100s or H100s: Runpod, perhaps Tensordock or Latitude.
If you need 1x H100: Runpod (Fluidstack and Lambda have been out of on-demand capacity for quite a while).
If you need cheap 3090s, 4090s, or A6000s: Tensordock.
If you need Stable Diffusion inference only: Salad.
If you need a wide variety of GPUs: Runpod or Tensordock.
If you want to play around with templates / general hobbyist: Runpod.
The large clouds generally have worse pricing and more complicated setups than the above.
If you're tied to one of the big clouds (AWS, Azure, GCP), then you don't have a choice, so use that.
More info here, here, here, here, here and here.
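The guide above is essentially a lookup table, and can be sketched directly as one. The provider picks mirror the post as of its writing; capacity and pricing in this market change quickly, so treat the mapping as a snapshot.

```python
# The decision guide above, condensed into a lookup sketch.
# Provider picks mirror the post as of its writing; they go stale quickly.

RECOMMENDATIONS = {
    "many_a100_h100": ["Oracle", "FluidStack", "Lambda Labs"],
    "few_a100_h100": ["Runpod", "Tensordock", "Latitude"],
    "single_h100": ["Runpod"],
    "cheap_3090_4090_a6000": ["Tensordock"],
    "stable_diffusion_inference": ["Salad"],
    "wide_gpu_variety": ["Runpod", "Tensordock"],
    "hobbyist_templates": ["Runpod"],
}

def recommend(need: str) -> list:
    # Fall back to the big clouds when the need isn't covered
    # (or when you're already tied to one of them).
    return RECOMMENDATIONS.get(need, ["AWS", "Azure", "GCP"])

print(recommend("single_h100"))      # ['Runpod']
print(recommend("locked_in_azure"))  # ['AWS', 'Azure', 'GCP']
```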
https://gpus.llm-utils.org/nvidia-h100-gpus-supply-and-demand/
How Do Nvidia Allocations Work?
They have an allocation they give per customer. But for example, Azure saying "hey we would like 10,000 H100s all to be used by Inflection" is different from Azure saying "hey we would like 10,000 H100s for Azure's cloud" - Nvidia cares about who the end customer is, and so clouds might be able to get an extra allocation for a specific end customer if Nvidia is excited about the end customer. Nvidia also wants to know who that end customer is, as much as possible. And they prefer customers with nice brand names or startups with strong pedigrees.
Yes, this seems to be the case. NVIDIA likes to guarantee GPU access to rising AI companies (many of which they have a close relationship with). See Inflection, an AI company they invested in, testing a huge H100 cluster on CoreWeave, which they also invested in.
- Private cloud exec
It's a unique situation in that Nvidia is giving large allocations to private clouds: CoreWeave has more H100s than GCP.
Nvidia would prefer not to give large allocations to companies that are attempting to compete directly with them (AWS Inferentia and Trainium, Google TPUs, Azure Project Athena).