Computación y Sistemas

On-line version ISSN 2007-9737; Print version ISSN 1405-5546

Comp. y Sist. vol.22 no.3 Ciudad de México Jul./Set. 2018

https://doi.org/10.13053/cys-22-3-3023 

Articles of the Thematic Issue

Using BiLSTM in Dependency Parsing for Vietnamese

Luong Nguyen Thi1 

Linh Ha My2 

Huyen Nguyen Thi Minh2 

Phuong Le-Hong2 

1 Dalat University, Lamdong, Vietnam

2 VNU University of Science, Hanoi, Vietnam


Abstract:

Recently, deep learning methods have achieved good results in dependency parsing for many natural languages. In this paper, we investigate the use of bidirectional long short-term memory (BiLSTM) network models for both transition-based and graph-based dependency parsing of Vietnamese. We also report our contribution to building a Vietnamese dependency treebank whose tagset conforms to the Universal Dependencies schema. Various experiments demonstrate the efficiency of this method, which achieves the best parsing accuracy among existing approaches on the same corpus, with an unlabeled attachment score of 84.45% and a labeled attachment score of 78.56%.

Keywords: Deep learning; BiLSTM; dependency parsing; Vietnamese

1 Introduction

Dependency parsing approaches fall into two families: graph-based and transition-based parsing (Kübler et al., 2009). Given a sentence s, a graph-based algorithm finds the highest-scoring parse tree among all possible outputs, while a transition-based algorithm builds a parse through a sequence of actions. In recent years, many researchers have developed deep learning approaches with high accuracy for English, Chinese, and other languages. Chen and Manning proposed a novel way of learning a neural network classifier in a greedy, transition-based dependency parser, which achieved UAS = 92.2% and LAS = 89.7% on the English Penn Treebank [1].

Dyer et al. (2015) [3] presented stack LSTMs, recurrent neural networks for sequences with push and pop operations, and used them to implement a state-of-the-art transition-based dependency parser with UAS = 93.2% and LAS = 90.9% for English. Kiperwasser and Goldberg (2016) [5] presented a simple and effective scheme for dependency parsing based on bidirectional LSTMs (BiLSTMs), which achieved UAS = 93.8% and LAS = 91.5% for English. Building on this work, Dozat and Manning (2016) [2] used neural attention in a simple graph-based dependency parser. Their parser achieved state-of-the-art performance on standard treebanks in six different languages, reaching 95.7% UAS and 94.1% LAS on the widely used English PTB dataset.

Regarding Vietnamese, there have been several contributions to dependency parsing. In 2008, Nguyễn Lê Minh et al. [12] used the MST parser on a corpus of 450 sentences. Then, in 2012, Le-Hong et al. [6] applied a lexicalized tree-adjoining grammar parser trained on a subset of the Vietnamese treebank. In 2013, Thi-Luong et al. [18] used MaltParser on a Vietnamese dependency treebank converted automatically from the Vietnamese constituency treebank. One year later, Dat et al. [14] presented a new conversion method to automatically transform the constituency-based Vietnamese treebank into dependency trees.

In 2015, Le-Hong et al. [8] improved the accuracy of Vietnamese dependency parsing by using distributed word representations produced by the Skip-gram and GloVe models for transition-based dependency parsing. In 2016, Thi-Luong et al. [16] also used Skip-gram distributed word representations in graph-based dependency parsing for Vietnamese, and Dat et al. [13] presented an empirical study of Vietnamese dependency parsing. In 2017, Kiem-Hieu [15] presented work on building BKTreebank, a dependency treebank for Vietnamese.

1.1 Transition-Based Dependency Parsing

A transition system has a set of configurations and a set of transitions that are applied to configurations. When parsing a sentence, the system is initialized to an initial configuration based on the input sentence, and transitions are repeatedly applied to this configuration. After a finite number of transitions, the system arrives at a terminal configuration, and a parse tree is read off it. In a greedy parser, a classifier is used to choose the transition to take in each configuration, based on features extracted from the configuration itself. The parsing algorithm is presented in Algorithm 1 below.

Algorithm 1 Greedy transition-based parsing 
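A minimal Python sketch of this greedy loop is given below; the names initial, legal, apply, is_terminal and score are illustrative stand-ins for the transition system and the classifier, not functions of any particular library.

def greedy_parse(sentence, transition_system, score):
    # Start from the initial configuration for the input sentence.
    config = transition_system.initial(sentence)
    while not transition_system.is_terminal(config):
        # The classifier scores every legal transition in the current
        # configuration; the highest-scoring one is applied greedily.
        best = max(transition_system.legal(config), key=lambda t: score(config, t))
        config = transition_system.apply(config, best)
    # The parse tree is read off the arc set of the terminal configuration.
    return config.arcs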

Many transition-based systems [7] are popular, such as the arc-eager and arc-standard algorithms. In this work, however, we employ the arc-hybrid system, which is similar to these. In the arc-hybrid system, a configuration c = (α, β, T) consists of a stack α, a buffer β, and a set T of dependency arcs.

Both the stack and the buffer hold integer indices pointing to sentence elements. Given a sentence s = w1,..., wn, the system is initialized with an empty stack, an empty arc set, and β = 1,..., n, ROOT, where ROOT is the special root index. Any configuration c with an empty stack and a buffer containing only ROOT is terminal, and the parse tree is given by the arc set Tc of c. The arc-hybrid system allows three possible transitions, SHIFT, LEFT and RIGHT, defined as follows (a small illustrative sketch is given after the list):

  • SHIFT[(α, b0|β, T)] = (α|b0, β, T),

  • LEFTl[(α|s1|s0, b0|β, T)] = (α|s1, b0|β, T ∪ {(b0, s0, l)}),

  • RIGHTl[(α|s1|s0, β, T)] = (α|s1, β, T ∪ {(s1, s0, l)}).
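As an illustration, the following is a minimal Python sketch of these three operations, with a configuration represented as a triple (stack, buffer, arcs) of token indices and arcs stored as (head, dependent, label) triples; the function names are illustrative.

def shift(stack, buffer, arcs):
    # SHIFT: move the first buffer item b0 onto the stack.
    return stack + [buffer[0]], buffer[1:], arcs

def left_arc(stack, buffer, arcs, label):
    # LEFT_l: pop s0 from the stack and attach it to b0 with label l.
    s0 = stack[-1]
    return stack[:-1], buffer, arcs | {(buffer[0], s0, label)}

def right_arc(stack, buffer, arcs, label):
    # RIGHT_l: pop s0 from the stack and attach it to s1 with label l.
    s1, s0 = stack[-2], stack[-1]
    return stack[:-1], buffer, arcs | {(s1, s0, label)}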

1.2 Graph-Based Dependency Parsing

The second approach is the graph-based dependency parsing algorithm introduced by McDonald et al. [11]. In this algorithm, the weights of the edges are calculated for building the dependency graph of a sentence as follows:

s(i, j) = w · f(i, j),

where w is a weight vector and f(i, j) is the feature vector of edge (i, j). The weight of edge (i, j) represents the plausibility of a dependency between the head wi and the dependent wj. If the arc score function is known, then the weight of the graph is:

S(G = (V, E)) = Σ(i,j)∈E s(i, j).

Then, based on the weights of all edges in the graph, McDonald et al. [10] showed that this problem is equivalent to finding the highest-scoring directed spanning tree of the graph G originating from the root node 0.
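Concretely, the score of a candidate tree is just the sum of its arc scores; a small sketch, where f is a stand-in for the arc feature function:

import numpy as np

def arc_score(w, f, i, j):
    # s(i, j) = w · f(i, j)
    return float(np.dot(w, f(i, j)))

def graph_score(w, f, arcs):
    # S(G) = sum of s(i, j) over the arcs (i, j) of the candidate tree
    return sum(arc_score(w, f, i, j) for (i, j) in arcs)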

1.3 Long Short-Term Memory

Recurrent Neural Network. The recurrent neural network (RNN) is a class of artificial neural networks designed for sequence labeling tasks. It takes a sequence of vectors as input and returns another sequence. The simple RNN architecture has an input layer x, a hidden layer h and an output layer y. At each time step t, the values of each layer are computed as follows:

ht = fh(Wih xt + Whh ht−1), yt = fo(Who ht),

where Wih, Whh and Who are the three connection weight matrices, and fh and fo are the hidden and output unit activation functions (the sigmoid and the softmax, respectively).
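A direct NumPy transcription of this recurrence (a sketch; the matrix dimensions and the initial state h0 are left to the caller):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_forward(xs, W_ih, W_hh, W_ho, h0):
    # h_t = sigmoid(W_ih x_t + W_hh h_{t-1}); y_t = softmax(W_ho h_t)
    h, ys = h0, []
    for x in xs:
        h = sigmoid(W_ih @ x + W_hh @ h)
        ys.append(softmax(W_ho @ h))
    return ys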

Long Short-Term Memory. Long Short-Term Memory (LSTM) was first proposed in 1997 by Hochreiter and Schmidhuber [4]. LSTM is an extension of the RNN designed to combat the vanishing and exploding gradient problems that arise when learning from long-range sequences. LSTM networks are the same as RNNs, except that the hidden layer updates are replaced by memory cells. Figure 1 shows an LSTM cell, where i, f and o are the input, forget and output gates, respectively, and c and c̃ denote the memory cell content. The cell computes its hidden state ht through the following gated updates:

Fig. 1 Long Short-Term Memory cell 
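In a standard formulation consistent with the gate and weight notation used here (with bias vectors b added), the cell updates are:

it = σ(Ui xt + Wi ht−1 + bi),
ft = σ(Uf xt + Wf ht−1 + bf),
ot = σ(Uo xt + Wo ht−1 + bo),
c̃t = tanh(Uc xt + Wc ht−1 + bc),
ct = ft ⊙ ct−1 + it ⊙ c̃t,
ht = ot ⊙ tanh(ct),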

where σ is the element-wise sigmoid function and ⊙ is the element-wise product; i, f, o and c are the input gate, forget gate, output gate and cell vector, respectively. Ui, Uf, Uc, Uo are the connection weight matrices between the input x and the gates, and Wi, Wf, Wc, Wo are the connection weight matrices between the gates and the hidden state h.

Bidirectional Long Short-Term Memory. The original LSTM uses only previous context for prediction. For many sequence labeling tasks, it is advisable to take context from both directions. A bidirectional LSTM utilizes both the previous and the future context by processing the sequence in two directions, generating two independent sequences of LSTM output vectors.

2 Approach

2.1 Universal Dependency Parsing in Vietnamese

2.1.1 Universal Dependency

The dependency label represents the dependency relation between two words in a sentence. Each pair of words, in different positions, may receive a different dependency label. A general annotation scheme defines the dependency labels uniformly across a language; in practice, several different sets of relation labels exist for the same language.

Universal Dependencies (UD) was developed by a team at Stanford University (Marneffe et al. [9]). It is a project of cross-linguistically consistent treebank annotation, whose goal is to facilitate multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The project builds on Stanford Dependencies (SD), also developed by the Stanford team (Marneffe et al.), on the universal part-of-speech tagset (Petrov et al., 2012), and on the Interset interlingua of morphosyntactic features (Zeman, 2008).

The general objective of Universal Dependencies is to provide a label set and guidelines that facilitate the construction of similar resources for other languages and allow extension to new languages. The labels in SD are organized into groups such as subjects, objects, clauses, modifiers and noun dependents. Stanford offers nearly 50 types of English dependencies based on the Penn Treebank corpus. All of these dependencies are binary relations between a head word and its dependent word. Each relation is given by three components: the dependency label, the head word and the dependent word.

Universal Dependencies can be applied to many different languages and can be used to suggest improvements in dependency parsing, even for English. The UD team has developed a core label set that has been extensively tested on a variety of languages, so this core set can be applied to many different languages. It is also possible to add new labels as needed, to capture special linguistic relations or individual cases in one or more groups of languages. The label set covers many different languages, such as English, French, German and Chinese, and it is useful because it can express the dependencies of the same sentence consistently across languages.

Universal Dependencies contains 40 labels organized according to the principles of the UD taxonomy: rows correspond to functional categories in relation to the head (core arguments of clausal predicates, non-core dependents of clausal predicates, and dependents of nominals), while columns correspond to structural categories of the dependent (nominals, clauses, modifier words, function words), as in Table 1. All Universal Dependencies are defined with specific examples that can be used to develop and build a complete label set for other languages.

Table 1 Dependencies in universal Stanford Dependencies 

Core arguments: nsubj, obj, iobj (nominals); csubj, ccomp, xcomp (clauses)
Non-core arguments: obl, vocative, expl, dislocated (nominals); advcl (clauses); advmod, discourse (modifier words); aux (function words)
Coordination: conj, cc
MWE: fixed, flat, compound
Loose: list, parataxis
Special: orphan, goeswith, reparandum
Other: punct, root, dep

2.1.2 Vietnamese Dependencies

Based on Universal Dependencies and the VietTreebank, we have built a set of Vietnamese dependencies. This set contains labels that coincide with the UD labels as well as several new labels, for a total of 46 labels. Some of the dependency labels we designed specifically for Vietnamese are:

  • csubj:asubj (adjectival subject): An adjectival subject is an adjective phrase that is the syntactic subject of a clause. In Vietnamese, the subject is usually a noun (or a noun phrase), but in some cases an adjective can be the subject:

    • - Xa_xa là hố bom.

  • csubj:vsubj (verbal subject): This is used when a verb is the subject of a sentence. In Vietnamese, the subject is usually a noun, but in some cases an adjective, a verb or a clause can act as the subject:

    • - Học tập là nhiệm vụ chính → csubj:vsubj(là, học tập)

  • nc (classifier noun): This relation holds between a classifier noun and a common noun. The classifier noun always stands before the common noun, for example “cái”, “con”,...

    • - Hai con mèo đen đang ăn cá. → nc(mèo, con)

  • vnom (verb nominal): This is used for the relation between a nominalized verb and a classifier noun. The classifier noun always stands before the verb, for example “cái”, “sự”, “việc”,...

    • - Cái ăn khan hiếm quá! → vnom(ăn, cái)

A comparison between the two label sets is given in Tables 2 and 3.

Table 2 Comparison between Vietnamese dependencies (VD) and Universal dependencies (UD), part 1 

VD (2016) UD (2015) Meaning
csubj csubj Clausal subject
csubj:asubj
csubj:vsubj
acomp xcomp Adjectival complement
amod amod Adjectival modifier
apredmod advmod Adjectival modifier of a predicate
advmod advmod Adverbial modifier
advcl advcl Adverbial clause modifier
aux aux Auxiliary
auxpass auxpass Passive auxiliary
appos appos Appositional modifier
cc cc Coordination
ccomp ccomp Clausal complement
conj conj Conjunct
cop cop Copula
dep dep Dependent
det det Determiner
discourse discourse Discourse element
dislocated dislocated Dislocated elements
dobj dobj Direct object
foreign foreign Foreign words
iobj iobj Indirect object
list list List
mark mark Marker
neg neg Negation modifier

Table 3 Comparison between Vietnamese dependencies (VD) and Universal dependencies (UD), part 2. 

VD (2016) UD (2015) Meaning
nn compound Noun compound modifier
nsubj nsubj Nominal subject
num nummod Numeric modifier
number compound Element of compound number
parataxis parataxis Parataxis
pcomp mark Prepositional complement
pobj case Object of a preposition
prep nmod Prepositional modifier
punct punct Punctuation
remnant remnant Remnant in ellipsis
reparandum reparandum Overridden disfluency
rcmod acl:relcl Relative clause modifier
ref ref Referent
root root root
tmod nmod:tmod Temporal modifier
vcomp ccomp Verb complement of a verb
vmod amod:vmod Verb modifier of an NP
vocative vocative Vocative
xcomp xcomp Open clausal complement
nsubjpass nsubjpass Passive nominal subject
csubjpass csubjpass Clausal passive subject
- expl Expletive
- goeswith Goes with
nc - Classifier noun
vnom - Verb nominal

2.2 BiLSTM in Dependency Parsing

2.2.1 Using BiLSTM Feature Representation

Instead of using direct feature vectors in dependency parsing, we use the same method as in [5]: each feature vector is replaced by its BiLSTM encoding, and a concatenation of a minimal set of such BiLSTM encodings is used as the feature function, which is then passed to a non-linear scoring function (a multi-layer perceptron).

Given an input sentence s with n words w1,..., wn and the corresponding POS tags p1,..., pn, each word wi and POS tag pi is associated with embedding vectors e(wi) and e(pi), and we denote by x1:n the sequence of input vectors with:

xi = e(wi) ◦ e(pi).

The embeddings are trained together with the model. We denote by vi the BiLSTM output vector at position i, which is computed as follows:

vi = BiLSTM(x1:n, i).

A bidirectional LSTM is composed of two LSTMs: LSTMf and LSTMb. LSTMf reads the sequence in its regular order and LSTMb reads it in reverse. Concretely, given a sequence of vectors x1:n and an index i, the function BiLSTMθ(x1:n, i) is defined as:

BiLSTMθ(x1:n, i) = LSTMf(x1:i) ◦ LSTMb(xn:i), vi = BiLSTMθ(x1:n, i).

The feature function φ is then the concatenation of a small number of BiLSTM vectors. The resulting feature vectors are then scored using a non-linear function, namely a multi-layer perceptron with one hidden layer (MLP):

MLPθ(x) = W2 tanh(W1 x + b1) + b2,

where θ = {W1, W2, b1, b2} are the model parameters.
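As a minimal PyTorch sketch of this encoder and scorer (all dimensions, vocabulary sizes and names below are illustrative placeholders, not the configuration used in the paper):

import torch
import torch.nn as nn

class BiLSTMScorer(nn.Module):
    def __init__(self, n_words, n_pos, d_word=100, d_pos=25,
                 d_lstm=125, d_hidden=100, n_items=4, n_out=3):
        super().__init__()
        # e(w_i) and e(p_i): word and POS embeddings, trained with the model.
        self.word_emb = nn.Embedding(n_words, d_word)
        self.pos_emb = nn.Embedding(n_pos, d_pos)
        # x_i = e(w_i) ◦ e(p_i) is fed to a bidirectional LSTM.
        self.bilstm = nn.LSTM(d_word + d_pos, d_lstm,
                              bidirectional=True, batch_first=True)
        # MLP_θ(x) = W2 tanh(W1 x + b1) + b2 over n_items concatenated BiLSTM vectors.
        self.mlp = nn.Sequential(
            nn.Linear(n_items * 2 * d_lstm, d_hidden),
            nn.Tanh(),
            nn.Linear(d_hidden, n_out))

    def encode(self, word_ids, pos_ids):
        # word_ids, pos_ids: LongTensors of shape (n,) for one sentence.
        x = torch.cat([self.word_emb(word_ids), self.pos_emb(pos_ids)], dim=-1)
        v, _ = self.bilstm(x.unsqueeze(0))  # v[0, i] corresponds to BiLSTM(x_1:n, i)
        return v.squeeze(0)

    def score(self, v, indices):
        # φ = concatenation of the BiLSTM vectors of the selected items.
        phi = torch.cat([v[i] for i in indices], dim=-1)
        return self.mlp(phi)

For the transition-based parser, indices would be (s2, s1, s0, b0) with one output per transition; for the graph-based parser, indices would be (h, m) with n_items = 2 and n_out = 1.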

2.2.2 Transition-Based Dependency Parsing uses BiLSTM Feature Representation

Given a sentence s, the transition-based parser is initialized with a configuration c. A feature function φ(c) represents the configuration c as a vector, namely the concatenation of the BiLSTM vectors of a few items on the stack and the buffer. For example, for a configuration c = (...|s2|s1|s0, b0|..., T), the feature extractor uses the top 3 items on the stack and the first item on the buffer; it is defined as:

φ(c) = vs2 ◦ vs1 ◦ vs0 ◦ vb0, vi = BiLSTM(x1:n, i).

Each transition is scored using an MLP that is fed the BiLSTM encodings produced by the feature extractor, where each xi is the concatenation of a word vector and a POS vector. SCORE assigns scores to (configuration, transition) pairs: it scores the possible transitions t ∈ {Shift, Left_Arc, Right_Arc}, and the highest-scoring transition t̂ is chosen. Applying t̂ to the configuration yields a new configuration.

2.2.3 Graph-Based Dependency Parsing uses BiLSTM Feature Representation

In graph-based parsing, the weights of the edges are calculated for building the dependency graph of a sentence s = x1:n as follows:

predict(s) = argmaxy∈Y(s) scoreglobal(s, y), scoreglobal(s, y) = Σpart∈y scorelocal(s, part),

where Y(s) is the space of valid dependency trees over s.

Arc-factored parsing decomposes the score of a tree into the sum of the scores of its head-modifier arcs (h, m):

parse(s) = argmaxy∈Y(s) Σ(h,m)∈y score(φ(s, h, m)),

where φ(s, h, m) is the feature extractor which uses the BiLSTM encoding of the head word and the modifier word: φ(s, h, m) = BiLSTM(x1:n, h) ◦ BiLSTM(x1:n, m).

The final model is:

parse(s) = argmaxy∈Y(s) Σ(h,m)∈y score(φ(s, h, m)) = argmaxy∈Y(s) Σ(h,m)∈y MLP(vh ◦ vm), vi = BiLSTM(x1:n, i).
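A small sketch of this scoring over a candidate head assignment is given below; score_arc is a stand-in for MLP(vh ◦ vm), and a real parser replaces the per-modifier argmax with a maximum-spanning-tree (or projective) decoder so that the output is a well-formed tree.

def tree_score(score_arc, heads):
    # heads[m-1] = h gives the head of modifier m (tokens are 1..n, 0 is ROOT).
    return sum(score_arc(h, m) for m, h in enumerate(heads, start=1))

def greedy_heads(score_arc, n):
    # Simplification: pick the best head for every modifier independently.
    return [max((h for h in range(n + 1) if h != m), key=lambda h: score_arc(h, m))
            for m in range(1, n + 1)]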

3 Experiments

3.1 Datasets

We use the same data as in our previous work [8, 16, 18]. Text corpus for distributed word representations: to create distributed word representations, we use a dataset consisting of 7.3 GB of text from 2 million articles collected from Vietnamese news portals. The text is first normalized to lower case. All special characters are removed except these common symbols: the comma, the semicolon, the colon, the full stop and the percentage sign. All numeral sequences are replaced with the special token <number>, so that correlations between a certain word and a number are correctly recognized by the neural network or the log-bilinear regression model.

Each word in Vietnamese may consist of more than one syllable separated by spaces, which could be regarded as multiple words by the unsupervised models. Hence it is necessary to replace the spaces within each word with underscores to create full word tokens. The tokenization process follows the method described in [17]. After removal of special characters and tokenization, the articles add up to 969 million word tokens, spanning a vocabulary of 1.5 million unique tokens. We train the unsupervised models with the full vocabulary to obtain the representation vectors, and then prune the collection of word vectors to the 5,000 most frequent words, excluding special symbols and the token <number> representing numeral sequences.
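A rough Python sketch of this normalization (the regular expressions are illustrative; word segmentation itself follows the method of [17] and is not shown):

import re

def normalize(text):
    text = text.lower()
    # Replace numeral sequences with the special token <number>.
    text = re.sub(r"\d+(?:[.,]\d+)*", "<number>", text)
    # Keep word characters (including underscores), whitespace and the
    # symbols , ; : . % (plus <> so the <number> token survives).
    text = re.sub(r"[^\w\s,;:.%<>]", " ", text)
    return text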

Dependency treebank. We conduct our experiments on the Vietnamese dependency treebank dataset. This treebank is derived automatically from the constituency-based annotation of the VTB [18] and contains 10,471 sentences (225,085 tokens). We manually checked the correctness of the conversion on a subset of the converted corpus to obtain 3,000 universal-dependency sentences, split into a training set of 2,200 sentences, a test set of 400 sentences and a development set of 400 sentences.

3.2 Feature Sets

Feature sets for transition-based parsing: For each parser configuration c = (...|s2|s1|s0, b0|..., T), f(c) is the transition chosen in the gold parse and φ(c) is the feature vector representation of the parser configuration c. We denote the part-of-speech tag of token w by p(w), and use tk(w) and e(w) to denote the word form and the distributed representation of token w. rm(w) and lm(w) denote the right-most and left-most modifiers of token w. We use the feature templates for the classifier listed in Table 4. Each feature vtk(w) = p(w)◦tk(w) or ve(w) = p(w)◦e(w) is a feature template of token w.

Table 4 Feature sets for use in the transition classifier 

Feature set Feature templates
φ0 vtk(s0), vtk(s1), vtk(s2), vtk(b0)
φ1 ve(s0), ve(s1), ve(s2), ve(b0)
φ2 φ0, vtk(rm(s0)), vtk(lm(s0)), vtk(rm(s1)), vtk(lm(s1)), vtk(rm(s2)), vtk(lm(s2)), vtk(lm(b0))
φ3 φ1, ve(rm(s0)), ve(lm(s0)), ve(rm(s1)), ve(lm(s1)), ve(rm(s2)), ve(lm(s2)), ve(lm(b0))

Feature sets for graph-based parsing: The feature set proposed by McDonald et al. (2005) contains 18 templates for a first-order parser, while the first-order feature extractor in the actual implementation (MSTParser) includes roughly a hundred feature templates. In our case, the feature extractor merely uses the encodings of the head word and the modifier word, each built from the POS tag and the word form.

3.3 Vietnamese Dependency Parsing Based-on Bist-Parser

The Bist-parser is a tool that uses BiLSTM feature extractors with graph-based and transition-based dependency parsers. It was developed by Kiperwasser and Goldberg, using the BiLSTM feature representation described in Section 2.2.

We use two attachment scores, the labeled attachment score (LAS) and the unlabeled attachment score (UAS), to evaluate the accuracy of the dependency parsing system. Attachment scores are defined as the percentage of correct dependency relations recovered by the parser. A dependency relation is considered correct if both the head word and the dependent word are correct (UAS) and, additionally, the dependency type is correct (LAS).
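For clarity, a small sketch of how the two scores are computed from per-token (head, label) pairs:

def attachment_scores(gold, pred):
    # gold and pred are lists of (head, label) pairs, one per token.
    total = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / total
    las = sum(g == p for g, p in zip(gold, pred)) / total
    return uas, las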

We also evaluate on the Vietnamese dependency treebank [18]; the result, presented in Table 6, is the highest accuracy reported for Vietnamese dependency parsing.

Table 5 Accuracy of Bist-parser with feature sets on the Vietnamese universal dependency treebank 

Feature set System Test
UAS LAS
φ2 Transition-based 76.86% 72.38%
Graph-based 77.79% 74.08%
φ3 Transition-based 75.75% 71.13%
Graph-based 78.17% 74.84%
Phuong et al. [8] Transition-based 73.21% 63.06%
Luong et al. [16] Graph-based 73.09% 68.32%

Table 6 Accuracy of Bist-parser with feature sets on the Vietnamese dependency treebank [18] 

Feature set System Test
UAS LAS
φ2 Transition-based 82.77% 76.02%
Graph-based 84.05% 78.35%
φ3 Transition-based 83.17% 76.70%
Graph-based 84.45% 78.56%
Luong et al. [18] Transition-based 73.03% 66.35%
Some results on other dependency treebanks for Vietnamese
Kiem-Hieu [15] Graph-based 84.4% 81.4%
Dat Quoc et al. [14] Graph-based (MSTParser) 79.08% 71.66%
Dat Quoc et al. [13] Graph-based (Neural network) 80.66% 73.53%

4 Conclusion

In this paper, we presented in detail our contribution to a Vietnamese universal dependency treebank. We also used this data with the Bist-parser system, which is based on bidirectional LSTMs for dependency parsing. We evaluated the accuracy of the system for Vietnamese parsing in two cases: with and without the distributed word representation feature in the Bist-parser system.

The accuracy of our system is UAS = 78.17% and LAS = 74.84% when we use the GloVe model to produce distributed word representations on the Vietnamese universal dependency treebank. This is the highest accuracy in comparison with previous research: an improvement of about 5 points, from 73.21% to 78.17% UAS and from 68.32% to 74.84% LAS. The system also achieves state-of-the-art performance on the Vietnamese dependency treebank [18], with UAS = 84.45% and LAS = 78.56%.

In the future, we will integrate a CRF into this system. We will also investigate applying this model to constituency-based parsing of Vietnamese.

References

1.  Chen, D., & Manning, C. D. (2014). A fast and accurate dependency parser using neural networks. Moschitti, A., Pang, B., & Daelemans, W., editors, EMNLP, ACL, pp. 740-750. [ Links ]

2.  Dozat, T., & Manning, C. D. (2016). Deep biaffine attention for neural dependency parsing. CoRR, Vol. abs/1611.01734. [ Links ]

3.  Dyer, C., Ballesteros, M., Ling, W., Matthews, A., & Smith, N. A. (2015). Transition-based dependency parsing with stack long short-term memory. CoRR, Vol. abs/1505.08075. [ Links ]

4.  Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Comput., Vol. 9, No. 8, pp. 1735-1780. [ Links ]

5.  Kiperwasser, E., & Goldberg, Y. (2016). Simple and accurate dependency parsing using bidirectional lstm feature representations. CoRR, Vol. abs/1603.04351. [ Links ]

6.  Le-Hong, P., Nguyen, T. M. H., & Azim, R. (2012). Vietnamese parsing with an automatically extracted tree-adjoining grammar. Proceedings of the IEEE International Conference in Computer Science: Research, Innovation and Vision of the Future, RIVF, HCMC, Vietnam. [ Links ]

7.  Le-Hong, P., Nguyen, T. M. H., Nguyen, P. T., & Roussanaly, A. (2010). Automated extraction of tree adjoining grammars from a treebank for Vietnamese. Proceedings of The Tenth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+10), Yale University, New Haven, CT, USA. [ Links ]

8.  Le-Hong, P., Nguyen, T.-M.-H., Nguyen, T.-L., & Ha, M.-L. (2015). Fast Dependency Parsing Using Distributed Word Representations. Springer International Publishing, Cham, pp. 261-272. [ Links ]

9.  Marneffe, M.-C. D., Dozat, T., Silveira, N., Haverinen, K., Ginter, F., Nivre, J., & Manning, C. D. (2014). Universal Stanford dependencies: a cross-linguistic typology. In Calzolari, N. (Conference Chair), Choukri, K., Declerck, T., Loftsson, H., Maegaard, B., Mariani, J., Moreno, A., Odijk, J., & Piperidis, S., editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), European Language Resources Association (ELRA), Reykjavik, Iceland. [ Links ]

10.  McDonald, R., Crammer, K., & Pereira, F. (2005). Online large-margin training of dependency parsers. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05), pp. 91-98. [ Links ]

11.  McDonald, R. T., & Nivre, J. (2011). Analyzing and integrating dependency parsers. Computational Linguistics, Vol. 37, No. 1, pp. 197-230. [ Links ]

12.  Minh, N. L., Điệp, H. T., & Kế, T. M. (2008). Nghiên cứu luật hiệu chỉnh kết quả dùng phương pháp MST phân tích cú pháp phụ thuộc tiếng Việt. ICT-rda 8, Hanoi, Vietnam, pp. 258-267. [ Links ]

13.  Nguyen, D. Q., Dras, M., & Johnson, M. (2016). An empirical study for vietnamese dependency parsing. Proceedings of the Australasian Language Technology Association Workshop 2016, Melbourne, Australia, pp. 143-149. [ Links ]

14.  Nguyen, D. Q., Nguyen, D. Q., Pham, S. B., Nguyen, P.-T., & Nguyen, M. L. (2014). From Treebank Conversion to Automatic Dependency Parsing for Vietnamese. Proceedings of 19th International Conference on Application of Natural Language to Information Systems, pp. 196-207. [ Links ]

15.  Nguyen, K.-H. (2017). Bktreebank: Building a vietnamese dependency treebank. CoRR, Vol. abs/1710.05519. [ Links ]

16.  Nguyen, T.-L., Ha, M.-L., Le-Hong, P., & Nguyen, T.-M.-H. (2016). Using distributed word representations in graph-based dependency parsing for Vietnamese. pp. 804-810. [ Links ]

17.  Le-Hong, P., Nguyen, T. M. H., Roussanaly, A., & Ho, T. V. (2008). A Hybrid Approach to Word Segmentation of Vietnamese Texts. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 240-249. [ Links ]

18.  T.L., N., M.L., H., V.H., N., T.M.H., N., & P, L.-H. (2013). Building a treebank for vietnamese dependency parsing. International Conference on Computing and Communication Technologies, Research, Innovation, and Vision for the Future, RIVF 2013, Hanoi, Vietnam, November 10-13, 2013, IEEE, pp. 147-151. [ Links ]

Received: January 20, 2018; Accepted: March 05, 2018

This is an open-access article distributed under the terms of the Creative Commons Attribution License.