108.413A: Computational Linguistics / Natural Language Processing


Hyopil Shin (Dept. of Linguistics, Seoul National University)

hpshin@snu.ac.kr
http://knlp.snu.ac.kr/

Mon/Wed 11:00 to 12:15, Building 7, Room 210

T.A.: 서진 (Seemdog@snu.ac.kr)

ChatGPT

(http://www.theverge.com/2016/3/11/11208078/lee-se-dol-go-google-kasparov-jennings-ai)

Course Description

This course covers Natural Language Processing (NLP)/Computational Linguistics from its theoretical foundations up to recent methods based on the Transformer and the pre-trained models built on it. Several approaches are treated from the language-model perspective: the first half of the course covers N-grams, entropy, embeddings, and text classification, while the second half focuses on sequence-to-sequence models, attention, and the Transformer. Students study the Transformer and pre-trained models built on it such as BERT and GPT, learn how to use Huggingface's Transformers library, and develop the language-processing skills to apply these models to a variety of tasks such as classification, summarization, generation, question answering, and chatbots.

Useful Sites

  • Lectures


Textbook and Sites

What is Natural Language Processing

딥러닝을 위한 자연어처리 입문 (Introduction to Natural Language Processing for Deep Learning)


Huggingface Transformers


DL Wizard: Deep Learning Tutorials based on PyTorch

 

Syllabus


Each week below lists its dates, topics, related materials and resources, and PyTorch references.
Week 1 (3/4-3/9)
Topics: Introduction to Natural Language Processing; Language Modeling I: Statistical Language Modeling: N-Grams
Materials: Natural Language Processing is Fun!; Language Modeling with N-Grams
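
As a preview of the N-gram unit, here is a minimal sketch of a bigram language model with maximum-likelihood estimates; the toy corpus is an illustrative assumption, not course data.

```python
from collections import Counter

# Toy corpus; a real experiment would use a tokenized training set.
corpus = "the cat sat on the mat . the dog sat on the log .".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(w1, w2):
    """Maximum-likelihood estimate P(w2 | w1) = count(w1 w2) / count(w1)."""
    return bigrams[(w1, w2)] / unigrams[w1]

print(bigram_prob("the", "cat"))  # 0.25: "the" occurs 4 times, "the cat" once
```
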
Week 2 (3/11-3/16)
Topics: Language Modeling I: Statistical Language Modeling: Entropy and Maximum Entropy Models
Materials: Entropy is a Measure of Uncertainty
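
A small sketch of this unit's central quantity: the Shannon entropy of a made-up unigram distribution, and the perplexity it implies.

```python
import math

# A made-up unigram distribution over a four-word vocabulary.
probs = {"the": 0.5, "cat": 0.25, "sat": 0.125, "mat": 0.125}

# Shannon entropy H(p) = -sum_x p(x) log2 p(x), measured in bits.
entropy = -sum(p * math.log2(p) for p in probs.values())

# Perplexity is 2^H: the effective branching factor of the model.
print(entropy, 2 ** entropy)  # 1.75 bits, perplexity ~3.36
```
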
Week 3 (3/18-3/23)
Topics: Text Classification
Materials: Text Classification
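
A standard baseline for this unit is Naive Bayes; below is a minimal sketch with add-one smoothing over a made-up four-document corpus (all data illustrative).

```python
from collections import Counter, defaultdict
import math

# Tiny labeled corpus, purely illustrative.
train = [("good great fun", "pos"), ("boring bad slow", "neg"),
         ("great acting", "pos"), ("bad plot", "neg")]

class_docs = defaultdict(list)
for text, label in train:
    class_docs[label] += text.split()

vocab = {w for words in class_docs.values() for w in words}

def score(text, label):
    """log P(label) + sum_w log P(w | label), with add-one smoothing."""
    counts = Counter(class_docs[label])
    total = len(class_docs[label])
    logp = math.log(sum(1 for _, l in train if l == label) / len(train))
    for w in text.split():
        logp += math.log((counts[w] + 1) / (total + len(vocab)))
    return logp

print(max(("pos", "neg"), key=lambda l: score("great fun", l)))  # pos
```
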
Week 4 (3/25-3/30)
Topics: Vector Semantics; Language Modeling II: Static Word Embedding
Materials: Vector Semantics and Embeddings
PyTorch: Linear Regression With PyTorch; Logistic Regression With PyTorch
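
In the spirit of the Logistic Regression With PyTorch tutorial linked above, a self-contained sketch on synthetic data; the features, labels, and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Random 2-D features with a linearly separable rule, for illustration only.
X = torch.randn(100, 2)
y = (X[:, 0] + X[:, 1] > 0).float().unsqueeze(1)

model = nn.Linear(2, 1)                 # logits = Wx + b
loss_fn = nn.BCEWithLogitsLoss()        # sigmoid + binary cross-entropy
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

acc = ((model(X) > 0).float() == y).float().mean()
print(f"train accuracy: {acc.item():.2f}")
```
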
Week 5 (4/1-4/6)
Topics: Language Modeling II: Static Word Embedding
Materials: Vector Semantics and Embeddings
PyTorch: Word Embeddings: Encoding Lexical Semantics
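
A minimal sketch of the static-embedding idea using torch.nn.Embedding; the five-word vocabulary is hypothetical, and the vectors here are untrained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical vocabulary; indices would come from a real tokenizer.
vocab = {"the": 0, "cat": 1, "dog": 2, "sat": 3, "mat": 4}

embed = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

cat = embed(torch.tensor(vocab["cat"]))
dog = embed(torch.tensor(vocab["dog"]))

# Before training the vectors are random; training (e.g. a skip-gram or
# n-gram LM objective) is what pulls similar words close together.
print(F.cosine_similarity(cat, dog, dim=0).item())
```
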

Week 6 (4/8-4/13)
Topics: Sequence-to-Sequence Model: Encoder-Decoder
PyTorch:
  • pytorch-seq2seq
    • Sequence to Sequence Learning with Neural Networks
    • Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
    • Neural Machine Translation by Jointly Learning to Align and Translate
    • Packed Padded Sequences, Masking, Inference and BLEU
    • Convolutional Sequence to Sequence Learning
    • Attention is All You Need
Materials: A Comprehensive Introduction to Torchtext; Torchtext Github
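
A compact sketch of the encoder-decoder pattern that the pytorch-seq2seq tutorials build up; the dimensions and the GRU choice are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative sizes: source/target vocab, embedding, and hidden dims.
SRC_VOCAB, TRG_VOCAB, EMB, HID = 100, 120, 32, 64

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SRC_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
    def forward(self, src):                      # src: [batch, src_len]
        _, hidden = self.rnn(self.embed(src))    # hidden: [1, batch, HID]
        return hidden                            # the "context" vector

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TRG_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TRG_VOCAB)
    def forward(self, trg_token, hidden):        # one decoding step
        output, hidden = self.rnn(self.embed(trg_token), hidden)
        return self.out(output), hidden          # logits over target vocab

enc, dec = Encoder(), Decoder()
src = torch.randint(0, SRC_VOCAB, (2, 7))        # batch of 2 source sentences
hidden = enc(src)
logits, hidden = dec(torch.zeros(2, 1, dtype=torch.long), hidden)  # <sos>=0
print(logits.shape)                              # torch.Size([2, 1, 120])
```
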

Week 7 (4/15-4/20)
Topics: Attention Model
Materials: Neural Machine Translation by Jointly Learning to Align and Translate; Attention: Illustrated Attention
PyTorch: pytorch-seq2seq (the same tutorial series listed under Week 6)
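
A minimal sketch of the attention computation at the heart of this unit: alignment scores between a decoder state and the encoder states, softmaxed into weights, then a weighted-sum context vector. Dot-product scoring is used here for brevity; Bahdanau-style attention replaces it with a small MLP. All shapes are illustrative.

```python
import torch
import torch.nn.functional as F

# Toy shapes: 1 sentence, 5 encoder states, hidden size 8.
encoder_outputs = torch.randn(1, 5, 8)   # [batch, src_len, hid]
decoder_hidden = torch.randn(1, 8)       # current decoder state

# Alignment scores between the decoder state and each encoder state.
scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2)).squeeze(2)
weights = F.softmax(scores, dim=1)       # attention distribution over source

# Context vector: attention-weighted sum of encoder states.
context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
print(weights.shape, context.shape)      # [1, 5] and [1, 8]
```
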
Week 8 (4/22-4/27)
Topics: Transformer
Materials: Self-Attention: Attention is All You Need; The Illustrated Transformer
PyTorch: The Annotated Transformer
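
A sketch of single-head scaled dot-product self-attention as defined in Attention is All You Need; the sizes and random weights are illustrative.

```python
import math
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention: softmax(QK^T / sqrt(d)) V."""
    Q, K, V = x @ w_q, x @ w_k, x @ w_v
    scores = Q @ K.transpose(-2, -1) / math.sqrt(K.size(-1))
    return F.softmax(scores, dim=-1) @ V

# Toy input: a sequence of 4 tokens with model dimension 16.
d = 16
x = torch.randn(4, d)
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([4, 16])
```
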

Week 9 (4/22-4/27)
Topics: Language Modeling III: Dynamic Word Embedding: BERT (Bidirectional Encoder Representations from Transformers)
Materials: BERT Fine Tuning; BERT Fine-Tuning Tutorial with PyTorch; BERT Word Embeddings
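
A minimal fine-tuning sketch with Huggingface Transformers, in the spirit of the BERT fine-tuning tutorial above; the checkpoint, labels, and single gradient step are illustrative stand-ins for a real training loop.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# bert-base-uncased and num_labels=2 are illustrative choices.
name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

batch = tokenizer(["a great movie", "a dull movie"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

# One gradient step; a real run would loop over a DataLoader with a scheduler.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
outputs = model(**batch, labels=labels)   # loss is computed internally
outputs.loss.backward()
optimizer.step()
print(outputs.loss.item())
```
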


Week 10 (4/29-5/4)
Topics: Pre-trained Models and Transfer Learning
Materials:
  • XLM-R: Unsupervised Cross-lingual Representation Learning at Scale
  • XLNet: Generalized Autoregressive Pretraining for Language Understanding
  • MASS: Masked Sequence to Sequence Pre-training for Language Generation
  • BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
  • GLM: All NLP Tasks Are Generation Tasks: A General Pretraining Framework
  • SpanBERT: Improving Pre-training by Representing and Predicting Spans
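
One quick way to probe what a pre-trained masked LM has learned is the Transformers fill-mask pipeline; the checkpoint and sentence below are illustrative.

```python
from transformers import pipeline

# fill-mask shows the masked-LM pre-training objective in action.
unmasker = pipeline("fill-mask", model="bert-base-uncased")
for pred in unmasker("Natural language processing is [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
```
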
Week 11 (5/6-5/11)
Topics: Transformers by Huggingface: Quick Tour; Summary of Tasks: Sequence Classification, Extractive Question Answering, Language Modeling, Text Generation, Named Entity Recognition, Summarization, and Translation
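
A taste of the Quick Tour: each task in the Summary of Tasks has a one-line pipeline entry point. The default checkpoints these calls download are whatever the installed library version ships; the example sentences are illustrative.

```python
from transformers import pipeline

# Sequence classification via the sentiment-analysis pipeline.
classifier = pipeline("sentiment-analysis")
print(classifier("I love computational linguistics."))

# Named entity recognition, with subword pieces grouped into entities.
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Hyopil Shin teaches at Seoul National University."))
```
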



Week 12 (5/13-5/18)
Topics: Transformers by Huggingface: Quick Tour; Summary of Tasks: Sequence Classification, Extractive Question Answering, Language Modeling, Text Generation, Named Entity Recognition, Summarization, and Translation
Materials: Various Korean text processing with Huggingface Transformers and Korean pre-trained models
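
A sketch of the Korean-model workflow; klue/bert-base is one example checkpoint from the Huggingface Hub, and any Korean masked-LM model can be substituted.

```python
from transformers import pipeline

# Korean fill-mask with an example Korean pre-trained checkpoint.
unmasker = pipeline("fill-mask", model="klue/bert-base")
for pred in unmasker("서울대학교는 [MASK]에 있다."):
    print(pred["token_str"], round(pred["score"], 3))
```
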
Week 13 (5/20-5/25)
Topics: Language Modeling IV: Large Language Models (LLMs)
  • Background for LLMs
  • Technical Evolution of GPT-series Models
  • Resources of LLMs
  • Pre-Training
  • Adaptation of LLMs: Instruction Tuning/Alignment Tuning
  • Utilization: In-Context Learning/Chain-of-Thought Prompting (see the prompt sketch after this list)
  • Capacity Evaluation
  • Practical Guidebook of Prompt Design
  • Applications
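
The prompt sketch referenced above: in-context learning conditions a frozen LM on a few demonstrations followed by the new input, with no gradient updates. gpt2 is only a small stand-in to keep the snippet runnable; actual in-context learning behavior requires far larger models.

```python
from transformers import pipeline

# A few-shot prompt: two demonstrations, then the query to complete.
prompt = (
    "Review: The plot was dull.\nSentiment: negative\n\n"
    "Review: A wonderful, moving film.\nSentiment: positive\n\n"
    "Review: I fell asleep halfway through.\nSentiment:"
)
generator = pipeline("text-generation", model="gpt2")
out = generator(prompt, max_new_tokens=2, do_sample=False)
print(out[0]["generated_text"][len(prompt):])  # the model's continuation
```
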

 





Week 14 (5/27-6/1)
Topics: Large Language Models for Korean
Materials: DaG (David and Goliath Large Language Model)
Week 15 (6/3-6/8)
Topics: Final Test and Project Presentations