Row | Title | Type |
---|---|---|
1 |
Language models and fusion for authorship attribution
(2019) We deal with the task of authorship attribution, i.e., identifying the author of an unknown
document, proposing the use of Part Of Speech (POS) tags as features for language modeling. The
experimentation is carried out on corpora untypical for the task, i.e., with documents edited by
non-professional writers, such as movie reviews or tweets. The former corpus is homogeneous
with respect to topic, making the task more challenging. The latter corpus puts language
models into a framework of a continuously and fast evolving language, unique and noisy writing
style, and limited length of social media messages. While we find that language models based on
POS tags are competitive in only one of the corpora (movie reviews), they generally provide
efficiency benefits and robustness against data sparsity. Furthermore, we experiment with model
fusion, where language models based on different modalities are combined. By linearly combining
three language models, based on characters, words, and POS trigrams, respectively, we
achieve the best generalization accuracy of 96% on movie reviews, while the combination of
language models based on characters and POS trigrams provides 54% accuracy on the Twitter
corpus. In fusion, POS language models prove to be essential and effective components.
Keywords: Authorship attribution | Language models | Computational linguistics | Text classification | Machine learning |
English article |
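
The linear fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the add-one smoothing, the use of average log-probability as the fused score, the equal weights, and the names `NgramLM`, `train`, `attribute`, `VIEWS`, and `WEIGHTS` are all assumptions made for illustration. Only character and word views appear in the toy data; a POS view would be one more entry in `VIEWS` that maps a text to its tag sequence, produced by a tagger of your choice.

```python
import math
from collections import Counter


def ngrams(seq, n=3):
    """Return all n-grams of a sequence as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


class NgramLM:
    """Add-one smoothed conditional n-gram model (illustrative, not the paper's)."""

    def __init__(self, n=3):
        self.n = n
        self.gram_counts = Counter()
        self.ctx_counts = Counter()

    def fit(self, sequences):
        for seq in sequences:
            for g in ngrams(seq, self.n):
                self.gram_counts[g] += 1
                self.ctx_counts[g[:-1]] += 1
        return self

    def avg_logprob(self, seq, vocab_size):
        """Mean log P(last symbol | preceding n-1 symbols) over the sequence."""
        grams = ngrams(seq, self.n)
        if not grams:
            return 0.0
        total = sum(
            math.log((self.gram_counts[g] + 1) / (self.ctx_counts[g[:-1]] + vocab_size))
            for g in grams
        )
        return total / len(grams)


def train(corpus, views, n=3):
    """corpus: {author: [text, ...]}; views: {modality: text -> symbol list}."""
    models, vocab_sizes = {}, {}
    for m, view in views.items():
        seqs = {a: [view(t) for t in texts] for a, texts in corpus.items()}
        vocab = {tok for ss in seqs.values() for s in ss for tok in s}
        vocab_sizes[m] = len(vocab) + 1  # +1 reserves mass for unseen symbols
        models[m] = {a: NgramLM(n).fit(ss) for a, ss in seqs.items()}
    return models, vocab_sizes


def attribute(text, models, vocab_sizes, views, weights):
    """Score each author by a weighted sum of per-modality scores."""
    authors = next(iter(models.values())).keys()
    fused = {
        a: sum(
            weights[m] * models[m][a].avg_logprob(views[m](text), vocab_sizes[m])
            for m in models
        )
        for a in authors
    }
    return max(fused, key=fused.get), fused


# Hypothetical modalities and weights; the paper's actual weights are not given here.
VIEWS = {"char": list, "word": str.split}
WEIGHTS = {"char": 0.5, "word": 0.5}

# Toy corpus, invented purely for illustration.
CORPUS = {
    "alice": ["the cat sat on the mat", "the cat ate the rat"],
    "bob": ["stocks fell sharply today", "markets fell again today"],
}

models, vocab_sizes = train(CORPUS, VIEWS)
author, fused = attribute("the cat sat on the rat", models, vocab_sizes, VIEWS, WEIGHTS)
print(author, fused)
```

In practice the weights would be tuned on held-out data, and each modality could use a different n-gram order or smoothing scheme.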