MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU

204 points | by chrsw 7 hours ago

41 comments