Identification of the Linguistic Functions of English Prepositions and Its Application to Translation

Paper ID: lw202313340 Date: 2023-07-16 Source: 论文网

1 Introduction


1.1 Research Background
English prepositions, the heads of prepositional phrases (PPs), are small in number, but they play an important role in English. A survey of preposition usage in several major dictionaries shows that there are fewer than one hundred frequently used prepositions, yet they are statistically frequent, functionally versatile and positionally flexible. Roughly one word in every eight in written text is a preposition. Carter and McCarthy[1] listed the 50 most commonly used words in the written and spoken data of the CIC (Cambridge International Corpus) and found that prepositions dominated the list. John Taylor[2] pointed out that English is a preposition-focused language because prepositions can express plentiful information. Thus, PPs are worth studying.

A PP can be formed by words from different word classes in a complicated structure. Besides, PPs can serve many functions in a sentence, such as post-modifier and adjunct. Therefore, PPs are very likely to cause ambiguity in a sentence, which can lower the quality of machine translation (MT). For example:

We appreciate your patience for us.
Google translation: 对我们来说,我们非常感谢您的耐心等待。 ("For us, we greatly appreciate your patient waiting.")
Correct translation: 我们非常感谢您对我们的耐心等待。 ("We greatly appreciate your patience with us.")

This translation error arises from misreading the PP's syntactic function. In this example, the Google translation system treats "for us" as an adjunct, which causes the error. In fact, "for us" is the post-modifier of "your patience". The structure "nominal group 1 + preposition + nominal group 2" can easily cause this kind of ambiguity.

Therefore, recognition of PPs' syntactic functions, that is, identifying PPs and the functions they serve, is a basic research topic in natural language processing, with applications in numerous fields such as MT and information retrieval. However, current MT research mainly addresses binary PP attachment, that is, whether a PP is attached to the verb or to the noun.
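As a rough illustration of why the "nominal group 1 + preposition + nominal group 2" pattern is a natural place to look for ambiguity, the sketch below flags prepositions that sit between two nominals after a verb. It is only a toy heuristic under stated assumptions: the Penn Treebank-style tags are assigned by hand, the function name `ambiguous_pp_sites` is hypothetical, and nothing here reflects the identification system described in the thesis.

```python
# Toy heuristic: flag "NG1 + preposition + NG2" sites that follow a verb,
# where the PP could be read either as a post-modifier or as an adjunct.
# Tags are hand-assigned Penn Treebank-style tags, not real tagger output.

NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP"}
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}


def ambiguous_pp_sites(tagged):
    """Return (index, preposition) for each preposition that directly follows
    a noun, has another nominal within the next three tokens, and has a verb
    somewhere to its left."""
    sites = []
    seen_verb = False
    for i, (word, tag) in enumerate(tagged):
        if tag in VERB_TAGS:
            seen_verb = True
        if tag != "IN" or i == 0 or not seen_verb:
            continue
        prev_is_noun = tagged[i - 1][1] in NOUN_TAGS
        next_has_noun = any(t in NOUN_TAGS for _, t in tagged[i + 1:i + 4])
        if prev_is_noun and next_has_noun:
            sites.append((i, word))
    return sites


sentence = [
    ("We", "PRP"), ("appreciate", "VBP"), ("your", "PRP$"),
    ("patience", "NN"), ("for", "IN"), ("us", "PRP"), (".", "."),
]
print(ambiguous_pp_sites(sentence))  # [(4, 'for')]
```

The heuristic only locates candidate sites; deciding whether "for us" is a post-modifier or an adjunct still requires the kind of function identification the thesis sets out to study.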
………………


1.2 Research Objective
This study aims to explore the identification of PPs' syntactic functions within the business domain for the purpose of MT, under the guidance of systemic functional grammar (SFG), and to apply this recognition to PP translation. Halliday's SFG is a semantics-based grammar, in contrast to the traditional formal grammar commonly used in current work on identifying PPs' syntactic functions. We chose the business domain for two reasons. One is that demand for business English translation is increasing as the economy booms; the other is that business English is suitable for MT because of its logical structures and accurate diction. Identifying PPs with their appropriate syntactic functions more accurately is significant for PP translation.
……………


2 Literature Review


2.1 Review of English PP Study in Linguistics
In linguistics, extensive research has been done on English PPs, mainly from five perspectives: the philological perspective, the structural perspective, the cognitive perspective, the generative perspective and the functional perspective.

Most philologists assign prepositions to the closed class, or function words, because prepositions do not express any specific meaning and cannot complete a sentence on their own. These preliminary opinions on prepositions have given rise to intense arguments among different linguistic schools.

Structuralists share the philologists' view of the part of speech (POS) of prepositions: they take prepositions to be function words without any meaning. The difference between the two schools is that structuralists regard prepositions as arbitrary signs[3]. Quirk[4] defined prepositions as expressing a relation between two entities, especially in space and time. To be more exact, prepositions are even regarded as "connecting words", a temporal relation between trajector and landmark[5], or "relative words" that connect elements in the sentence. There is no doubt that English prepositions display various spatial relations between objects and entities. George Yule[8] also states that prepositions are words used along with nouns in phrases that furnish information about location, time and other relations involving actions and objects. In a way, prepositions demonstrate some sort of relation in which different objects or entities are involved.
…………


2.2 Review of English PP Study in Machine Translation


2.2.1 Machine Translation
As the first computer-based application related to natural language, MT refers to any computer-based process that transforms, or helps a user transform, written text from one human language into another. It includes fully automated MT, human-assisted MT and machine-aided translation. MT first appeared in the late 1940s, when Warren Weaver put forward various proposals that set off the trend of research into MT. It was not until the 1970s that MT began to flourish, represented by SYSTRAN, after which a wide variety of MT systems emerged. With the development of personal computers and the internet, online translation services such as Google and Youdao experienced rapid growth. With the arrival of the information age, knowledge is increasing explosively and information exchange produces numerous documents in various languages. Thus, MT study is of great importance in modern society.
……………


3 The Definition of PPs and Their Syntactic Functions
3.1 The Definition of PPs
3.2 Syntactic Functions of PPs
4 The Corpus
4.1 A Small Annotated English-Chinese Bilingual Corpus
4.2 POS Tagging
4.2.1 The Tagging Method
4.2.2 The Tag Set
4.2.3 The Tagging Process
5 Identification of English PPs' Syntactic Functions
5.1 An Automatic Identification System
5.2 Construction of the Training Corpus
5.3 Evaluation Metrics
5.4 Experimental Results
5.4.1 Results of the Close Test
5.4.2 Results of Open Tests
5.5 Error Analysis


6 Translation of Two English PPs


6.1 Selection of OF-PPs and IN-PPs
Since the translation is performed manually, it is hard to translate all PPs in the corpus. Therefore, we chose those with high frequency. The distribution of PPs in the corpus is shown in Tab. 6.1. OF-PPs occur most often in the corpus, followed by IN-PPs. The corpus contains 22,240 PPs in total, and these two types account for 37.05%, more than one third of all PPs. Thus, this study explores the translation of the top two PPs. First, all the PPs headed by these two prepositions are found. Then, they are categorized by syntactic function. Finally, the translation templates under each function are summarized. In this way, we obtain the translation templates of OF-PPs and IN-PPs and compare how the templates differ across functions.
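The selection and grouping steps described above can be sketched in a few lines. This is a minimal illustration, assuming each PP is stored as a (head preposition, function label, phrase) record; the record layout, the helper names `top_k_share` and `group_by_function`, and every row of `TOY_PPS` are invented for the example and are not taken from the thesis corpus or from Tab. 6.1. The labels simply reuse chunk names (AD, C, PM) that appear in the thesis's conclusion.

```python
from collections import Counter, defaultdict

# Invented toy records: (head preposition, function label, phrase).
TOY_PPS = [
    ("of", "PM", "of the contract"),
    ("in", "AD", "in March"),
    ("of", "PM", "of the goods"),
    ("for", "AD", "for two weeks"),
    ("in", "PM", "in the market"),
    ("of", "C", "of the delay"),
    ("to", "AD", "to the buyer"),
    ("with", "AD", "with care"),
]


def top_k_share(records, k=2):
    """Return the k most frequent head prepositions and the percentage
    of all PPs that they account for."""
    counts = Counter(prep for prep, _, _ in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no PPs to count")
    top = counts.most_common(k)
    share = 100.0 * sum(n for _, n in top) / total
    return top, share


def group_by_function(records, preposition):
    """Collect the phrases headed by one preposition under each function label."""
    groups = defaultdict(list)
    for prep, function, phrase in records:
        if prep == preposition:
            groups[function].append(phrase)
    return dict(groups)


top, share = top_k_share(TOY_PPS)
print(top, share)                        # [('of', 3), ('in', 2)] 62.5
print(group_by_function(TOY_PPS, "of"))
```

Grouping by function before summarizing templates keeps PPs with the same preposition but different functions apart, which is what allows the templates to be compared across functions.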
…………


Conclusion


This thesis studied the identification of English PPs' syntactic functions for MT in the business domain under the guidance of Halliday's SFG. Unlike previous MT studies, which focused on binary PP attachment, this research refined the function chunks of PPs into four types, AD, C, PM and POP, for the purpose of MT. Meanwhile, this study drew on POS knowledge to improve the identification of these chunk types. To this end, four major changes were made in our new POS tag set on the basis of the Penn Treebank tag set. Based on this new tag set, a small annotated English-Chinese corpus in the business domain was built semi-automatically. An identification system using conditional random fields (CRFs) was presented to recognize PPs' syntactic functions automatically. Experiments were carried out to test the effectiveness of our POS tag set. In addition, the identification of PPs was applied to PP translation.

Experiments showed that our system using the new tag set achieved a precision of 88.45%, a recall of 88.60% and an F-score of 88.50%, which significantly outperformed the experiment using the Penn Treebank tag set, with increases of 5.06%, 5.27% and 5.14% in precision, recall and F-score respectively. Our new function chunk POP achieved the best performance, with a precision of 94.90%, followed by the function chunk C with a precision of 91.99%. These results indicate that our system, combined with the knowledge encoded in our new tag set, is a better approach to PP chunking for the purpose of English-Chinese MT.
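For readers unfamiliar with how chunking results of this kind are usually scored, the sketch below computes chunk-level precision, recall and F-score from gold and predicted tag sequences. It rests on assumptions the excerpt does not confirm: that function chunks are encoded with BIO tags (for example B-PM, I-PM, O), that a predicted chunk counts as correct only when its label and boundaries both match, and that the F-score is the harmonic mean of precision and recall. The function names and the toy sequences are hypothetical.

```python
def extract_chunks(tags):
    """Collect (label, start, end) spans from a BIO tag sequence; end is exclusive."""
    chunks = set()
    label, start = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last chunk
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            chunks.add((label, start, i))
            label, start = None, None
        # A stray I- tag with no open chunk is treated leniently as a chunk start.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return chunks


def precision_recall_f(gold_tags, pred_tags):
    """Chunk-level precision, recall and F-score (harmonic mean)."""
    gold = extract_chunks(gold_tags)
    pred = extract_chunks(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy sequences: the predicted AD chunk ends one token too early.
gold = ["O", "O", "B-PM", "I-PM", "O", "B-AD", "I-AD", "I-AD"]
pred = ["O", "O", "B-PM", "I-PM", "O", "B-AD", "I-AD", "O"]
print(precision_recall_f(gold, pred))  # (0.5, 0.5, 0.5)
```

Under exact-match scoring, a chunk with the right label but a wrong boundary earns no credit, so boundary errors lower both precision and recall.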
……………
References (omitted)

