Programming
Discussion and news about computer programming.
"We modified llama.cpp to load weights using mmap() instead of C++ standard I/O. That enabled us to load LLaMA 100x faster using half as much memory."
https://justine.lol/mmap/
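
For readers who have not used mmap() before, here is a rough sketch of the idea in Python. It is not llama.cpp's code: the flat float32 file layout and the helper names (`write_weights`, `read_all_copy`, `read_slice_mmap`) are made up for illustration. The first reader copies the whole file into process memory; the second maps the file and touches only the pages backing the requested slice, which the OS serves from its page cache.

```python
import mmap
import os
import struct
import tempfile


def write_weights(path, values):
    """Write float32 values to a flat binary file (hypothetical weights format)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"={len(values)}f", *values))


def read_all_copy(path):
    """Conventional I/O: read() copies the entire file into a new buffer."""
    with open(path, "rb") as f:
        data = f.read()
    return list(struct.unpack(f"={len(data) // 4}f", data))


def read_slice_mmap(path, start, count):
    """mmap: map the file read-only and decode only floats[start:start+count].

    Pages outside the slice are never read by this process.
    """
    with open(path, "rb") as f, mmap.mmap(
        f.fileno(), 0, access=mmap.ACCESS_READ
    ) as mm:
        # View the mapping as native float32 values without copying it.
        with memoryview(mm) as raw, raw.cast("f") as floats:
            return list(floats[start:start + count])


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "weights.bin")
    write_weights(path, [0.5, 1.5, -2.0, 3.25])
    print(read_all_copy(path))
    print(read_slice_mmap(path, 1, 2))
```

With a four-value file the difference is invisible; the point is only that the mapped version never needs a private copy of the whole file, which is the mechanism the quoted post credits for its speed and memory savings.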

https://redd.it/12ek8dw
@programmingreddit