Large Model Fundamentals: Zhejiang University Large Model Textbook (PDF)

Fundamentals of Large Models provides a comprehensive, systematic treatment of Large Language Models (LLMs), covering fundamental theory, architecture design, training optimization, and practical applications.
Starting from the basic theory of language modeling, the book analyzes statistical, RNN-based, and Transformer-based model architectures. For statistical language modeling, it introduces the n-gram model and its statistical foundations in detail, including the Markov assumption and maximum likelihood estimation. In explaining RNN-based language models, it not only elaborates the structure of recurrent neural networks but also analyzes the gradient vanishing/exploding problems that arise during training, as well as their application to language modeling. The mainstream Transformer-based language model is examined in depth, covering the components of the Transformer architecture, such as self-attention, the feed-forward network (FFN), layer normalization, and residual connections, along with the advantages it brings to language modeling.
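To make the n-gram idea concrete, here is a minimal sketch (not taken from the book) of a bigram model: under the Markov assumption, P(w_i | w_1..w_{i-1}) ≈ P(w_i | w_{i-1}), and maximum likelihood estimation reduces to a ratio of counts. The function name and toy corpus are illustrative choices.

```python
from collections import Counter

def bigram_mle(corpus):
    """Estimate P(w_i | w_{i-1}) by maximum likelihood:
    count(w_{i-1}, w_i) / count(w_{i-1})."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])            # contexts (every token that has a successor)
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs
    return {pair: c / unigrams[pair[0]] for pair, c in bigrams.items()}

probs = bigram_mle(["the cat sat", "the cat ran"])
print(probs[("the", "cat")])  # 1.0 ("the" is always followed by "cat")
print(probs[("cat", "sat")])  # 0.5 ("cat" is followed by "sat" half the time)
```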
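The self-attention mechanism mentioned above can be sketched in a few lines of NumPy; this is a generic single-head scaled dot-product attention, not code from the book, and the weight shapes are illustrative.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention: softmax(Q K^T / sqrt(d_k)) V."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ V  # each token's output is a weighted mix of all value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                         # 4 tokens, model dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one contextualized vector per token
```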
In addition, the book analyzes the architectural families of large language models, such as Encoder-only, Encoder-Decoder, and Decoder-only, and introduces representative models of each, such as BERT, T5, and the GPT series. Key technologies such as prompt engineering, parameter-efficient fine-tuning, model editing, and retrieval-augmented generation are explained in detail, and their applications in different scenarios are demonstrated with practical cases to help readers understand and master how to apply large language model techniques effectively. For language model evaluation, it introduces intrinsic evaluation methods, such as perplexity, and extrinsic evaluation methods, such as BLEU, ROUGE, BERTScore, and G-Eval.
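As a quick illustration of the intrinsic metric mentioned above (a generic sketch, not an excerpt from the book): perplexity is the exponentiated average negative log-likelihood the model assigns to each token, so a uniform guess over four choices yields a perplexity of exactly 4.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(-(1/N) * sum_i log p_i), where p_i is the probability
    the model assigned to the i-th observed token."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

print(perplexity([0.25, 0.25, 0.25, 0.25]))  # 4.0: uniform over 4 options
print(perplexity([1.0, 1.0, 1.0]))           # 1.0: perfect prediction
```

Lower perplexity means the model is less "surprised" by the test text, which is why it serves as the standard intrinsic evaluation for language models.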
The download links for the book are: https://github.com/LLMBook-zh/LLMBook-zh.github.io/blob/main/LLMBook.pdf and http://aibox.ruc.edu.cn/zws/index.htm.