# Deep Learning for NLP

## 📜 Course Description
This course aims to cover cutting-edge deep learning methods for natural language processing.
Topics include word embeddings and contextualized word embeddings; language models; transformers; pre-training and fine-tuning; sequence tagging (e.g., named entity recognition, extractive question answering); sequence generation (e.g., summarization, machine translation); and zero-shot learning.
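For a taste of what contextualized word embeddings look like in practice, here is a minimal sketch using the Hugging Face `transformers` library; the model choice (`bert-base-uncased`) and the library itself are illustrative assumptions, not course requirements:

```python
# A minimal sketch: extracting contextualized word embeddings with BERT.
# Assumes `transformers` and `torch` are installed (pip install transformers torch).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# The same word ("bank") receives a different vector in each context.
sentences = ["She sat by the river bank.", "He deposited cash at the bank."]
inputs = tokenizer(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, sequence_length, hidden_size).
embeddings = outputs.last_hidden_state
print(embeddings.shape)  # e.g., torch.Size([2, 9, 768])
```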
## ♾️ Learning Goals
By the end of this course, students will be able to:
- Understand fundamental concepts in natural language processing, including text representation, sequence modeling, and neural machine translation.
- Understand state-of-the-art deep learning methods for natural language processing, covering the topics listed in the course description above.
- Implement key algorithms in natural language processing using deep learning frameworks such as PyTorch or TensorFlow (a minimal sketch follows this list).
- Train and tune state-of-the-art models on large-scale datasets.
- Read and understand recent research papers in natural language processing.
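As a concrete illustration of the implementation goal above, the following minimal PyTorch sketch builds a bag-of-embeddings text classifier; the architecture, vocabulary size, and hyperparameters are all illustrative choices, not prescribed by the course:

```python
# A minimal sketch of the kind of model students will implement: a
# bag-of-embeddings text classifier in PyTorch. All sizes are illustrative.
import torch
import torch.nn as nn

class BagOfEmbeddingsClassifier(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int, num_classes: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, sequence_length) -> average embeddings over the sequence.
        embedded = self.embedding(token_ids).mean(dim=1)
        return self.classifier(embedded)

model = BagOfEmbeddingsClassifier(vocab_size=10_000, embed_dim=128, num_classes=2)
token_ids = torch.randint(0, 10_000, (4, 16))  # a fake batch of 4 sequences
logits = model(token_ids)                      # shape: (4, 2)
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 1, 0]))
loss.backward()                                # one step's worth of backprop
```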
## 🏆 Grading
- Participation: 10%
- Midterm: 30%
- Term Project: 60%
## 🧠 Term Project
Students will be required to complete a term project as part of this course.
At the midterm, you must submit a proposal for your project.
The proposal should identify the dataset you plan to use and include an exploratory data analysis (EDA) of that data; a minimal EDA sketch follows at the end of this section.
The term project can take the form of either a research paper or a practical implementation.
For the term project, you need to submit a report and a codebase.
The report should be around 10 pages and should describe your methodology, results, and discussion.
The project will be graded on correctness, readability, and efficiency.
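To make the proposal's EDA requirement concrete, here is a minimal sketch using pandas; the file name (`corpus.csv`) and the `text`/`label` column names are placeholders you should replace with your own dataset:

```python
# A minimal EDA sketch for the project proposal. "corpus.csv" and the
# "text"/"label" columns are placeholders; adapt them to your own data.
import pandas as pd

df = pd.read_csv("corpus.csv")

print(df.shape)                     # number of examples and columns
print(df.head())                    # a few raw examples
print(df.isna().sum())              # missing values per column
print(df["label"].value_counts())   # class balance

# Basic text statistics: whitespace-token counts per example.
df["num_tokens"] = df["text"].str.split().str.len()
print(df["num_tokens"].describe())  # length distribution (min/mean/max, quartiles)
```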
## 📒 Lecture Notes

You can find the lecture notes for this course at the following link:
https://entelecheia.github.io/ekorpkit-book/docs/lectures/deep_nlp
## 🎲 The Whole Game
Harvard Professor David Perkins’s book, *Making Learning Whole*, popularized the idea of “teaching the whole game.”
We don’t require kids to memorize all the rules of baseball and understand all the technical details before we let them play the game.
Rather, they start playing with just a general sense of it, and then gradually learn more rules and details as time goes on.
This course takes this approach to deep learning.
Most courses on deep learning focus only on what the network “is” and how it works.
This course is different: instead of teaching just the network, we show how to use it to solve problems.
We start by teaching a complete, working, very usable deep learning network using simple, expressive tools. Then we show how to use it to solve real-world problems.
This approach has several advantages:
- It makes deep learning more accessible and understandable. Students can see how deep learning can be used in practice, and they can immediately start using it to solve their own problems.
- It helps students learn the whole game of machine learning, not just deep learning. In addition to showing how to use a state-of-the-art deep learning network, we also teach important concepts such as data preprocessing, model evaluation, and deployment.
- It gives students a strong foundation for further study. Because the course covers both the theory and practice of deep learning, students will be well prepared for more advanced courses on the subject.
## 🗓️ Table of Contents
- Introduction
- Getting started with ekorpkit
- Zero Shot, Prompt, and Search Strategies
  - What is BLOOM?
  - BLOOM Examples
- Transformers
  - BERT: Bidirectional Encoder Representations from Transformers
  - T5: Text-To-Text Transfer Transformer
- Tokenization
  - SentencePiece Tokenizer
  - ByT5: Towards a token-free future with pre-trained byte-to-byte models
- Pretrained Language Models
  - Lab 1: Preparing Wikipedia Corpora
  - Lab 2: EDA on Corpora
  - Lab 3: Training Tokenizers
  - Lab 4: Pretraining Language Models