eKonomic Research Python Toolkit
Segmenters
Section table of contents