Question Classification with Deep Contextualized Transformer






Abstract

Recent work on Question and Answer (QA) problems relies on the Stanford parse tree. We build on prior work and develop a new method that handles the QA problem with a Deep Contextualized Transformer in order to manage aberrant expressions. We conduct extensive evaluations on the SQuAD and SwDA datasets and show significant improvement over existing QA classification approaches used in industry. We also investigate the impact of different models on the accuracy and efficiency of the answers. The results show that our new method is more effective for QA problems and achieves higher accuracy.

Keywords: QA Classification, NLP, Self-learning, Self-attention

1. Introduction

The Question and Answer (QA) system is widespread in current industry needs. Every week, a company may face hundreds or thousands of questionnaires about the products it publishes. QA is a major problem in Natural Language Processing (NLP), with applications such as question answering and sentence recognition. There are several types of sentences, such as wh-questions, statement questions, and statements, and each type has a corresponding label such as question or statement.

Early work in this field mainly used bag-of-words (BoW) features to classify sentence types. Many recent works propose supervised and deep-learning methods that perform question classification with promising results (Lee and Dernoncourt, 2016). However, most of these approaches treat the task as plain text classification and consider each sentence in isolation, so they cannot capture the contextual dependencies among the words of a sentence. Because meaning follows from context, the same words in a different order can often carry a different meaning.

This work draws on recent advances in NLP research, such as BERT (Devlin et al., 2018) and ELMo (Peters et al., 2018), to produce a sentence classification model that quickly and correctly picks out question sentences from a target text. Compared with the usual algorithms for QA problems, the self-learning algorithm produces contextualized word representations that capture the contextual meaning of words in a sentence. Specifically, we use a hierarchical deep neural network with the self-learning algorithm to model different types of question text, including statement questions, which are a specific type of question in questionnaires. This work aims to achieve state-of-the-art performance in classifying the QA problem.

We demonstrate how performance can improve with a combination of models at different levels: a hierarchical deep neural network for classification, self-learning and self-attention models such as BERT for word embedding, and the previous labels of the training data from the SQuAD dataset. Finally, we explore different methods to find an effective way to classify the QA problem.

2. Related Work

We focus on two primary lines of recent research. One treats the task as text classification, in which each utterance is classified in isolation, while the other processes text with contextualized word representation algorithms, such as BERT with self-attention or ELMo.

Text Classification: Lee and Dernoncourt (2016) build a vector representation for each utterance and use either an RNN or a CNN to classify the sentence type.

Self-learning: Devlin et al. (2018) used BERT, and Peters et al. (2018) used ELMo, to embed text into vectors that capture the contextual relationships within each utterance. RNN-based or CNN-based hierarchical neural networks are then used to learn and model multiple levels of utterances.

3. Model

The task of QA classification takes a sentence S as input, given as a variable-length sequence of utterances U = {u1, u2, u3, ..., uN}. Each utterance ui ∈ U has a length li ∈ L and a corresponding target label yi ∈ Y, which represents the QA result associated with the corresponding sentence.
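To make the input and output concrete, the following minimal Python sketch shows how one labeled example of this task might be represented; the field names and example labels are illustrative assumptions rather than structures defined in the paper.

    # Illustrative representation of labeled utterances for QA classification.
    # Field names and label strings are assumptions, not taken from the paper.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class LabeledUtterance:
        tokens: List[str]   # the utterance u_i
        length: int         # its length l_i
        label: str          # its target label y_i

    dialogue = [
        LabeledUtterance(tokens=["what", "time", "is", "it"], length=4, label="wh-question"),
        LabeledUtterance(tokens=["it", "is", "noon"], length=3, label="statement"),
    ]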

 

Figure 1. The model architecture.

Figure 1 shows the overall architecture of the model, which involves two main components: (1) a self-learning algorithm that encodes each sentence with self-attention, and (2) a combination-level RNN that takes the encoder output and classifies the label of each sentence. We describe the details below.
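As a rough illustration of these two components, the sketch below wires an utterance encoder into a combination-level RNN, assuming PyTorch; the module names, dimensions, mean pooling in place of the self-attentive encoder, and a plain linear output layer instead of a CRF are simplifying assumptions, not the configuration used in the paper.

    # Minimal sketch of the Figure 1 pipeline (assumed PyTorch implementation).
    import torch
    import torch.nn as nn

    class UtteranceEncoder(nn.Module):
        """Encode one utterance (a sequence of token ids) into a fixed-size vector."""
        def __init__(self, vocab_size, emb_dim=128, hidden=64):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)

        def forward(self, token_ids):              # (batch, seq_len)
            states, _ = self.gru(self.emb(token_ids))
            return states.mean(dim=1)              # mean pooling stands in for self-attention

    class CombinationLevelRNN(nn.Module):
        """Label each utterance given the sequence of utterance vectors."""
        def __init__(self, in_dim, hidden, num_labels):
            super().__init__()
            self.rnn = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden, num_labels)   # linear layer stands in for the CRF

        def forward(self, utt_vectors):            # (batch, num_utts, in_dim)
            states, _ = self.rnn(utt_vectors)
            return self.out(states)                 # (batch, num_utts, num_labels)

    # Usage on dummy data: 2 documents, 5 utterances each, 12 tokens per utterance.
    enc = UtteranceEncoder(vocab_size=1000)
    clf = CombinationLevelRNN(in_dim=128, hidden=64, num_labels=43)
    tokens = torch.randint(0, 1000, (2, 5, 12))
    utt_vecs = torch.stack([enc(tokens[:, i]) for i in range(5)], dim=1)
    logits = clf(utt_vecs)                          # per-utterance label scores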

3.1 Context-aware Self-learning

The self-learning algorithm encodes a variable-length sentence into a fixed-size representation. There are two variants of the algorithm: one based on self-attention and another based on the deep contextualized word representation.

3.1.1 Deep contextualized word representation

The model uses a biLM to account for the different positions of utterances within the sequence. Inspired by Peters et al. (2018), we encode a variable-length sequence using an attention mechanism that considers the position, token, and segment within the sequence. Inspired by Devlin et al. (2018) and Tran et al. (2017), we feed the combination-level RNN (Section 3.2) into a self-attentive encoder (Lin et al., 2017). The encoder produces a 2-D vector for each sentence. We follow Vipul Raheja and Joel Tetreault (2019) and Liu et al. (2019) to explain the modification below.

The utterance ti is also mapped into the embedding layer, resulting in an s-dimensional embedding for each word in the sequence, based on the Transformer (Vaswani et al., 2017). The embeddings are then fed into a bidirectional GRU layer.

As described by Vipul Raheja and Joel Tetreault (2019), the contextual self-attention score can be computed as:
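One plausible form of this score, written here as a reconstruction that extends the structured self-attention of Lin et al. (2017) with an added context term, is given below; H denotes the matrix of GRU hidden states for the utterance and C a context representation, and this is an assumption rather than the exact equation of Raheja and Tetreault (2019).

    A = \mathrm{softmax}\left( W_{s2} \tanh\left( W_{s1} H^{\top} + W_{s3} C^{\top} + b \right) \right)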

(2)

Here WS1 is a weight matrix, WS2 and WS3 are matrices of parameters, and b is a bias vector. Equation 2 can be treated as a two-layer MLP with bias and da hidden units.

3.2 Combination-level RNN

The utterance representations hi from the two models above are passed into the combination-level RNN. As shown in Figure 1, all of the hidden layers are concatenated into a final representation Ri for each utterance, which is then passed into a CRF layer to capture the relationship between the labels and the context of the utterances. The labels of the utterances are not decoded independently; the model considers the relationships among all of the sentences and then uses PCA and t-SNE to reduce the dimensionality of the representations. The combination-level RNN also provides the previous hidden state of the utterance encoding, which supplies the contextual relationships within the sentences and combines all hidden states of the words in a sentence. After that, the deep contextualized word representation encoder encodes the combination into the 2-D vectors for each sentence. We follow Peters et al. (2018) to explain our modification below.
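A minimal sketch of the PCA and t-SNE dimensionality-reduction step mentioned above, assuming scikit-learn; the input size, the 50-component PCA stage, and the 2-D t-SNE output are illustrative choices rather than values specified in the paper.

    # Reduce high-dimensional utterance representations with PCA, then t-SNE.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.manifold import TSNE

    hidden_states = np.random.rand(1000, 768)    # stand-in for utterance encodings
    reduced = PCA(n_components=50).fit_transform(hidden_states)
    embedded_2d = TSNE(n_components=2).fit_transform(reduced)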

An utterance ti, which is a sequence of tokens, is mapped into the embedding layer. The deep contextualized representation uses a biLM to combine the forward and backward LMs. The process is formulated as follows:
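For reference, the biLM of Peters et al. (2018) jointly maximizes the log likelihood of the forward and backward language models; in their notation the objective is:

    \sum_{k=1}^{N} \Big( \log p(t_k \mid t_1, \ldots, t_{k-1}; \Theta_x, \overrightarrow{\Theta}_{LSTM}, \Theta_s) + \log p(t_k \mid t_{k+1}, \ldots, t_N; \Theta_x, \overleftarrow{\Theta}_{LSTM}, \Theta_s) \Big)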

Moreover, we weight the representations produced by the model by computing the following:
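This weighting follows the task-specific combination of biLM layers introduced by Peters et al. (2018), which in their notation reads:

    \mathrm{ELMo}_k^{task} = E(R_k; \Theta^{task}) = \gamma^{task} \sum_{j=0}^{L} s_j^{task} \, \mathbf{h}_{k,j}^{LM}    (1)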

In (1), the weights sj^task are softmax-normalized, and the scalar parameter γ^task allows the task model to scale the entire vector. In the simplest case, the representation selects only the top layer, so that E(Rk) = h_{k,L}^{LM}.

3.1.2 Self-Attention

For each word in an utterance, we use a self-attention model for encoding; the most popular self-attention models are based on BERT (Devlin et al., 2018). The model uses self-attention for this task.

3.3 Super-attractive

The model we use combines all of the final representations of the hidden-layer combination produced by self-learning and self-attention, which helps determine the labels of the utterances and produce the result. The score we compute for the algorithm is the accuracy of the correct labels in the classification, as Hossin M. and Sulaiman M.N. (2015) suggest. We also apply an additional check for the question and answer problem: sentences that the classifier is unsure about are passed to a parse tree for another classification. The parse tree we use is based on Huang et al. (2018); we use its Tensor Product Representation to rebuild the parse tree for our model. In our model, we use a bi-LSTM with an attention algorithm to rebuild the parse tree and obtain a tree graph with POS tags, which is useful for classifying the structure of a sentence. After that, we use this graph to analyze the structure of the utterances and classify the unsure sentences in the document. Finally, we give the combined result to the users to check the question and answer problems.
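A small sketch of the two-stage decision just described, assuming the classifier exposes a confidence score; the helper names classify_with_model and parse_and_classify and the 0.7 threshold are hypothetical stand-ins rather than components defined in the paper.

    # Route low-confidence ("unsure") sentences to a parse-tree-based classifier.
    def classify_utterance(utterance, classify_with_model, parse_and_classify, threshold=0.7):
        label, confidence = classify_with_model(utterance)
        if confidence >= threshold:
            return label
        return parse_and_classify(utterance)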

4. Data

We evaluate the accuracy of the classification model on one standard dataset, the Switchboard Dialogue Act Corpus (SwDA) (Jurafsky et al., 1997), which consists of 43 classes, extended with the Stanford Question Answering Dataset (SQuAD) (Rajpurkar et al., 2016). We also use the Natural Language Toolkit (NLTK) dataset (Steven Bird and Edward Loper, 2002) as another significant resource for the test cases. We use the training, validation, and test splits as defined in Lee and Dernoncourt (2016).

Dataset        Train   Validation   Test   |T|   |N|
SwDA+SQuAD     87k     10k          3k     43    100k
NLTK           8.7k    1k           0.3k   15    10k

Table 1. Number of sentences in each dataset. |T| represents the number of classes and |N| represents the dataset size in sentences.

Table 1 shows the statistics for both datasets. Each dataset contains many class labels for categorizing the kinds of sentences it includes, and there are some special DA classes in both, such as Tag-Question in SwDA and Statement-Question in NLTK. Question-type labels make up over 25% of each dataset.

5. Result

We compare the classification accuracy of our model with several other models (Table 2). These methods use attention and deep contextualized word representations in some form to model the sentences of questionnaire documents, and some of them also provide a dedicated decoder that maps the representations to the related labels.

Model                          SwDA+SQuAD   NLTK
TF-IDF GloVe (2014)            67.5         60.3
Li and Wu (2016)               79.2         -
Elmo                           -            -
RoBERTa                        -            -
Lee and Dernoncourt (2016)     75.9         69.4
Our Method                     -            -

Table 2. QA classification accuracy of the different approaches.



 

6. Conclusion

We developed a new model that performs QA classification with attention and compared it with commonly used algorithms by testing on the SwDA dataset. We experimented with different utterance representation methods and showed that classification performance depends heavily on contextual details. Applying attention and a combination level to the classification, which had not previously been done for this kind of task, enables the model to learn more from the context and to capture more of the real meaning of the words in an utterance than before. This helps improve classification performance for these kinds of tasks.

As future work, we would like to try more attention mechanisms, such as block self-attention (Shen et al., 2018b), hierarchical attention (Yang et al., 2016), and hypergraph attention (Bai et al., 2019), because they can incorporate information from different representations at various positions and can capture both local and long-range context dependencies.


References

Ji Young Lee and Franck Dernoncourt. 2016. Sequential short-text classification with recurrent and convolutional neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 515–520. Association for Computational Linguistics.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805.

Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations.

Dan Jurafsky, Liz Shriberg, and Debra Biasca. 1997. Switchboard SWBD-DAMSL shallow-discoursefunction annotation coders manual, draft 13. Technical report, University of Colorado at Boulder Technical Report 97-02.

Rajpurkar P, Zhang J, Lopyrev K, Liang P. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. arXiv [cs.CL].

Steven Bird and Edward Loper. July 2002. NLTK: The Natural Language Toolkit. arXiv [cs.CL]

Vipul Raheja and Joel Tetreault. May 2019. Dialogue Act Classification with Context-Aware Self-Attention. arXiv [cs.CL]

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. July 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv [cs.CL]

Quan Hung Tran, Ingrid Zukerman, and Gholamreza Haffari. 2017. A hierarchical neural model for learning sequences of dialogue acts. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 428–437. Association for Computational Linguistics.

Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A structured self-attentive sentence embedding. In International Conference on Learning Representations 2017 (Conference Track).

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In NIPS 2017, 4-9 December 2017, Long Beach, CA, USA, pages 6000–6010.

Wei Li and Yunfang Wu. 2016. Multi-level gated recurrent neural network for dialog act classification. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1970–1979. The COLING 2016 Organizing Committee.

Hossin M. and Sulaiman M.N. March 2015. A Review on Evaluation Metrics for Data Classification Evaluations. International Journal of Data Mining & Knowledge Management Process (IJDKP), Vol. 5, No. 2.

Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543. Association for Computational Linguistics.

Qiuyuan Huang, Li Deng, Dapeng Wu, Chang Liu, and Xiaodong He. Feb 2018. Attentive Tensor Product Learning. arXiv [cs.CL].

Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, and Chengqi Zhang. 2018b. Bi-directional block selfattention for fast and memory-efficient sequence modeling. In International Conference on Learning Representations.

Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. 2016. Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1480–1489. Association for Computational Linguistics.

Song Bai, Feihu Zhang, and Philip H.S. Torr. Jan 2019. Hypergraph Convolution and Hypergraph Attention. arXiv [cs.CL].

 
 
