Introduction
When processing sentences, some NLP models extract semantic-level features, as word2vec and n-gram models do. This work instead encodes sentences as character-level features, which perform better than the former.
ConvNet
The main component of the ConvNet is the convolutional module, which computes a 1-D convolution between the input and a kernel to produce the output.
The idea is briefly illustrated in this figure.
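As a rough illustration of what a 1-D convolution computes, here is a minimal sketch in plain Python. The function name conv1d, the stride parameter, and the toy input are assumptions made for this example and are not taken from the original work. It slides a flipped kernel over the input without padding.

```python
def conv1d(signal, kernel, stride=1):
    """Valid (no padding) 1-D convolution of `signal` with `kernel`.

    The kernel is flipped, as in the textbook definition of convolution.
    Many deep learning libraries skip the flip (cross-correlation).
    """
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    flipped = kernel[::-1]
    out = []
    # Each output value is the dot product of one input window with the flipped kernel.
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        out.append(sum(w * f for w, f in zip(window, flipped)))
    return out


result = conv1d([1, 2, 3, 4, 5], [1, 0, -1])  # three windows, so three outputs
```

In the ConvNet each input position holds a whole character vector rather than a single number, so one such sum is taken per input feature and the results are added together.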
Character Quantization
Given a sentence, we quantize each character using 1-of-m encoding, where m is the size of the alphabet. The encoding is very simple, but it works much like Braille, which helps blind people read.
If a sentence is longer than L characters, we discard the characters beyond the first L.
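The quantization and truncation steps can be sketched as follows. The alphabet shown here, the lowercasing, the all-zero vector for characters outside the alphabet, and the zero padding of short inputs are all assumptions made for the sketch, as are the names ALPHABET, quantize, and max_len (which plays the role of L).

```python
# Hypothetical alphabet for illustration: 26 letters, 10 digits, and space (m = 37).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "


def quantize(text, alphabet=ALPHABET, max_len=8):
    """Encode `text` as a list of max_len one-hot vectors of length m."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    m = len(alphabet)
    frames = []
    # Truncate: characters beyond the first max_len are dropped.
    for ch in text.lower()[:max_len]:
        vec = [0] * m
        if ch in index:  # assumption: unknown characters stay all-zero
            vec[index[ch]] = 1
        frames.append(vec)
    # Assumption: shorter inputs are padded with all-zero vectors.
    while len(frames) < max_len:
        frames.append([0] * m)
    return frames


encoded = quantize("ab", max_len=4)  # 4 frames, each of length 37
```

The output always has the same shape (max_len by m), which is what lets a fixed-size ConvNet consume sentences of any length.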
We use
Model Design
We design two ConvNets, both of which have 6 convolutional layers and 3 fully connected layers.
The difference between them is the frame size. Here is an illustration of the model.
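One way to keep the two variants in sync is to describe them with a shared specification that differs only in frame size. This is a sketch only: the class ConvNetSpec, the variant names, and the frame sizes 1024 and 256 are hypothetical placeholders, since these notes do not state the actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvNetSpec:
    """Shape summary of one ConvNet variant (illustrative, not a trainable model)."""
    name: str
    frame_size: int          # the only field that differs between the two variants
    n_conv_layers: int = 6   # from the text: 6 convolutional layers
    n_fc_layers: int = 3     # from the text: 3 fully connected layers

    @property
    def depth(self):
        return self.n_conv_layers + self.n_fc_layers


# Hypothetical frame sizes, chosen only to make the example concrete.
LARGE = ConvNetSpec("large", frame_size=1024)
SMALL = ConvNetSpec("small", frame_size=256)
```

Both variants have the same depth of 9 layers, so any comparison between them isolates the effect of the frame size.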
Data Augmentation
Text datasets are often too small. When there is not enough data, we need data augmentation. Here we replace some words in a sentence with their synonyms.
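Synonym replacement can be sketched as below. The tiny THESAURUS dictionary, the replace_prob parameter, and the uniform choice among synonyms are assumptions made for this example. The notes do not say which thesaurus is used or how words and synonyms are selected.

```python
import random

# Hypothetical toy thesaurus, used only for illustration.
THESAURUS = {"good": ["fine", "great"], "movie": ["film"]}


def augment(sentence, thesaurus=THESAURUS, replace_prob=0.5, seed=None):
    """Return a copy of `sentence` with some words swapped for synonyms."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        synonyms = thesaurus.get(word.lower())
        # Replace only words that have synonyms, each with probability replace_prob.
        if synonyms and rng.random() < replace_prob:
            out.append(rng.choice(synonyms))
        else:
            out.append(word)
    return " ".join(out)


augmented = augment("a good movie", replace_prob=1.0, seed=0)
```

Each call with a different seed yields another variant of the same sentence with the same label, which is how the training set grows.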
Dataset
Here we use 5 datasets to evaluate our method.
(1) DBpedia, which has 14 classes, 560K training, 70K testing.
(2) Amazon reviews, which has 5 classes, 3M training, 650K testing.
(3) Yahoo! Answers, which has 10 classes, 1.4M training, 60K testing.
(4) AG's news corpus, which has 4 classes, 120K training, 7.6K testing.
(5) Sogou News, which has 5 classes, 360K training, 60K testing.
Here are the results for (1):
Here are the results for (2):
Here are the results for (3):
Here are the results for (4):
Here are the results for (5):