A Glimpse of the Content of the Corpus of Contemporary Irish
Gaois.ie, Corpus, Linguistics, Language Resources
March

This blog post is both a description and a tour of Corpas na Gaeilge Comhaimseartha (CGC), the Corpus of Contemporary Irish. It gives an overview of the contents of the corpus and of the ways in which it can be searched. CGC currently contains […] million words, and it continues to grow as new content is added from the publications and publishers who have generously shared their work with us. CGC has been under compilation since […]; don't miss the first blog post about it here, written when the corpus held only […] million words.
It was a bare text corpus until a year ago, when the Gaois research team started tagging the material. That tagging is described further below. In […], Seán Ó Ríordáin published his poetry collection Brosna, and in his poem ‘Múscail do Mhisneach’ he said: “Wealth I promise you if you are an earl, The movement of the sea and the stop of the hills.” This quote came to mind because riches are promised to anyone who searches the […] million words in CGC, which contain almost […] unique tokens. (A token is the word form as it appears in the corpus, but the term also covers non-words.)
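To make the difference between running tokens and unique tokens concrete, here is a minimal sketch in Python. The regular expression and the sample sentence are assumptions made for illustration only; this is not the tokenizer used for CGC.

```python
import re
from collections import Counter

# Hypothetical tokenizer: a run of word characters, or one non-space symbol.
# Punctuation is kept so that non-words are counted as tokens too.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> list[str]:
    """Split text into tokens, keeping each form exactly as it appears."""
    return TOKEN_RE.findall(text)


# Invented sample sentence, not taken from the corpus.
sample = "Tá an fharraige mór, agus tá an cnoc mór."
tokens = tokenize(sample)
counts = Counter(tokens)

print(len(tokens))  # number of running tokens: 11
print(len(counts))  # number of unique tokens: 9
```

Because forms are counted as they appear, "Tá" and "tá" are two different unique tokens here, and the comma and full stop are counted as well.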
The corpus currently contains literature, news, academic journals, popular magazines, and columns. The movement of the sea will be found in the newly coined words, and the stop of the hills in the rich, traditional Irish that has been handed down through the generations. Each word is tagged with its part of speech, its lemma, and its morphological details. This processing and tagging enables the linguistic analysis carried out by the Gaois research team, and it is also essential for lexical research.
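As a rough illustration of what such tagging makes possible, the sketch below stores each token with its lemma, part of speech, and morphological details, then groups the word forms under their lemma. The field names, the tag labels, and the sample records are all hypothetical; they do not reflect the actual tagset or data format used in CGC.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedToken:
    form: str   # word form as it appears in the text
    lemma: str  # dictionary headword
    pos: str    # part of speech (hypothetical label set)
    morph: str  # morphological details (hypothetical notation)


# Hand-made records for illustration, not real CGC output.
tagged = [
    TaggedToken("cnoic", "cnoc", "NOUN", "Number=Plur"),
    TaggedToken("cnoc", "cnoc", "NOUN", "Number=Sing"),
    TaggedToken("fharraige", "farraige", "NOUN", "Form=Len"),
]


def forms_by_lemma(tokens):
    """Map each lemma to the set of word forms attested for it."""
    index = defaultdict(set)
    for token in tokens:
        index[token.lemma].add(token.form)
    return index


index = forms_by_lemma(tagged)
print(sorted(index["cnoc"]))  # ['cnoc', 'cnoic']
```

With an index like this, a search for a lemma can return every inflected or mutated form of the word, which is the kind of lookup that plain, untagged text cannot support.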