
Conclusions and Further Reading

📋 Chapter Overview

Part: Conclusions and Extensions · Original title: Conclusions and Further Reading · Source: https://jax-ml.github.io/scaling-book/conclusion · Translated: March 30, 2026


# Conclusions and Further Reading

Part 11 of [How To Scale Your Model](/scaling-book) ([Part 10: JAX](../jax-stuff) | [Part 12: GPUs](../gpus))

Thank you for reading! Here we'll include a few more references for further study.

### Contents

  • [Acknowledgments](#acknowledgments)
  • [Further Reading](#further-reading)
  • [Feedback](#feedback)

Thank you for reading the whole thing and congratulations on making it all the way to the end. Before we conclude, a few acknowledgments:

## Acknowledgments

This document represents a significant collective investment from many people at Google DeepMind, whom we'd like to briefly acknowledge!

  • James Bradbury, Reiner Pope, and Blake Hechtman originally derived many of the ideas in this manuscript, and were early to understanding the systems view of the Transformer.
  • Sholto Douglas wrote the first version of this doc and is responsible for kicking off the project. More than anyone, he is responsible for the overall narrative of this doc.
  • Jacob Austin led the work of transforming this first version from rough notes into a more polished and comprehensive artifact. He did much of the work of editing, formatting, and releasing this document, and coordinated contributions from other authors.
  • Most of the figures and animations were made by Anselm Levskaya and Charlie Chen.
  • Charlie Chen wrote the inference section and drew many of the inference figures.
  • Roy Frostig helped with publication, editing, and many other steps of the journey.

We’d also like to thank many others who gave critical feedback throughout the process, in particular Zak Stone, Nikhil Sethi, Caitlin Stanton, Alek Dimitriev, Sridhar Lakshmanamurthy, Albert Magyar, Diwakar Gupta, Jeff Dean, Corry Wang, Matt Johnson, Peter Hawkins, and many others. Thanks to Ruiqi Gao for help with the HTML formatting.

Thank you all!

Before you go, you might also enjoy reading the new Part 12 on NVIDIA GPUs!

## Further Reading

There is a bunch of related writing in this area.

There remains a lot of room for comprehensive writing in this area, so we hope this manuscript encourages more of it! We also believe that this is a fruitful area to study and research. In many cases, it can be done even without having many hardware accelerators on hand.
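To make that last point concrete: much of the roofline-style analysis in this book needs nothing but arithmetic. Below is a minimal sketch in plain Python that estimates whether a bf16 matmul is compute- or memory-bound; the peak-FLOPs and bandwidth numbers are illustrative assumptions (roughly TPU v5e-class), not measurements, so substitute the specs of whatever chip interests you.

```
# Roofline estimate for a bf16 matmul A[B, D] @ W[D, F] -- no accelerator needed.
# Hardware numbers are illustrative assumptions (roughly TPU v5e-class).
PEAK_FLOPS = 1.97e14  # assumed peak bf16 FLOP/s
HBM_BW = 8.2e11       # assumed HBM bandwidth, bytes/s

def arithmetic_intensity(B, D, F, bytes_per_elem=2):
    """FLOPs per byte of HBM traffic for a single matmul."""
    flops = 2 * B * D * F                           # one multiply-add per (b, d, f)
    mem = bytes_per_elem * (B * D + D * F + B * F)  # read A and W, write the output
    return flops / mem

# Intensity above this threshold means compute, not memory, is the bottleneck.
critical = PEAK_FLOPS / HBM_BW

for B in (1, 64, 1024, 8192):
    ai = arithmetic_intensity(B, 8192, 8192)
    verdict = "compute-bound" if ai > critical else "memory-bound"
    print(f"batch {B:5d}: intensity {ai:8.1f} FLOPs/byte vs critical {critical:.0f} -> {verdict}")
```

Under these assumed numbers the crossover sits at a batch size of a few hundred, which is one way to see why small-batch decoding tends to be memory-bound while large-batch training can saturate the matrix units.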

## Feedback

Please leave comments or questions so that we can improve this further. You can reach our corresponding author, Jacob Austin, at jacobaustin123 [at] gmail [dot] com, or suggest edits by posting issues, pull requests, or discussions on GitHub.

### Miscellaneous

*Work done at Google DeepMind, now at MatX.

### Citation

For attribution in academic contexts, please cite this work as:

```
Austin et al., "How to Scale Your Model", Google DeepMind, online, 2025.
```

or as a BibTeX entry:

```
@article{scaling-book,
  title = {How to Scale Your Model},
  author = {Austin, Jacob and Douglas, Sholto and Frostig, Roy and Levskaya, Anselm and Chen, Charlie and Vikram, Sharad and Lebron, Federico and Choy, Peter and Ramasesh, Vinay and Webson, Albert and Pope, Reiner},
  publisher = {Google DeepMind},
  howpublished = {Online},
  note = {Retrieved from https://jax-ml.github.io/scaling-book/},
  year = {2025}
}
```


🔗 Related Resources

  1. Official documentation
     • JAX official documentation
     • XLA compiler optimization
     • TPU technical guides

  2. Reference papers
     • The original Transformer paper, "Attention Is All You Need"
     • "Scaling Laws for Neural Language Models"

  3. Practice projects
     • JAX example repositories
     • Transformer implementation examples
     • TPU usage tutorials

💡 Study Suggestions

Theory

  1. Read the whole text first to get the overall framework
  2. Focus on the core concepts and technical principles
  3. Use the figures and formulas to deepen your understanding

Practice

  1. Run the code examples from the book (a starting point is sketched below)
  2. Try modifying parameters and observe the effects
  3. Apply the techniques to your own projects
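As one hedged starting point for that kind of experimentation, here is a small self-contained JAX sketch that times a jitted bf16 matmul and reports achieved FLOP/s. The matrix sizes and iteration count are arbitrary assumptions, not values from the book, so adjust them to your hardware.

```
import time
import jax
import jax.numpy as jnp

# Assumed sizes; shrink them if you are on a small CPU/GPU.
B, D, F = 4096, 8192, 8192

x = jnp.ones((B, D), dtype=jnp.bfloat16)
w = jnp.ones((D, F), dtype=jnp.bfloat16)

matmul = jax.jit(lambda a, b: a @ b)

# Warm up so compilation time is excluded from the measurement.
matmul(x, w).block_until_ready()

n_iters = 10
t0 = time.perf_counter()
for _ in range(n_iters):
    y = matmul(x, w)
y.block_until_ready()  # wait for async dispatch to finish before stopping the clock
dt = (time.perf_counter() - t0) / n_iters

flops = 2 * B * D * F  # a matmul performs ~2*B*D*F floating-point operations
print(f"~{flops / dt / 1e12:.1f} TFLOP/s achieved")
```

Comparing the printed number against your chip's advertised peak, and then varying B, is a quick way to watch the memory-bound/compute-bound transition discussed in the roofline chapters.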

Going Deeper

  1. Read the references and further-reading material
  2. Join discussions in the relevant technical communities
  3. Follow the latest developments in the field

This translation was generated automatically by OpenClaw and is still being improved. To report translation issues, please use the blog's feedback channel.