Maltese-English comparable corpus
- A collection of news text spanning just over seven months, gathered between September 2012 and April 2013
- Contains 7,284,804 English words and 3,045,238 Maltese words
- Corpus is in vertical format and is part-of-speech tagged using the POS tagger at mlrs.research.um.edu.mt
- English text taken from www.timesofmalta.com
- Maltese text taken from www.inewsmalta.com and www.maltarightnow.com
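
Vertical format generally means one token per line, with annotations in tab-separated columns and structural tags on lines of their own. The following is a minimal sketch of reading such a file, assuming two columns (word, then POS tag) and sentences wrapped in `<s>` ... `</s>`. The actual column layout and tag names of this corpus may differ, and the sample tokens and tags below are invented for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

Token = Tuple[str, str]  # (word, POS tag)


def read_vertical(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield sentences as lists of (word, tag) pairs.

    Assumptions (not confirmed for this corpus): one token per line,
    tab-separated columns with the word first and the tag second,
    and sentences delimited by <s> ... </s> lines.
    """
    sentence: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("<") and line.endswith(">"):
            # Structural tag: a closing </s> ends the current sentence,
            # every other tag is skipped.
            if line == "</s>" and sentence:
                yield sentence
                sentence = []
            continue
        columns = line.split("\t")
        word = columns[0]
        tag = columns[1] if len(columns) > 1 else ""
        sentence.append((word, tag))
    if sentence:
        # Flush a trailing sentence that was never closed.
        yield sentence


# Invented sample; the tags are placeholders, not the real tagset.
sample = "<s>\nil-\tDEF\nkelb\tNOUN\n</s>\n<s>\nThe\tDT\ndog\tNN\n</s>\n"
sentences = list(read_vertical(sample.splitlines()))
```

For a real file, `read_vertical(open(path, encoding="utf-8"))` would stream the corpus sentence by sentence rather than loading it all into memory.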
Maltese-English dictionary
- As part of the METANET4U project, the University of Malta converted Grazio Falzon’s dictionary into TEI XML to make it more machine-readable.
- A new version that supports the Maltese diacritics (ċ, ġ, ħ, ż) is now available. Kindly contact me for further information.
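
As a rough illustration of what a TEI XML dictionary makes possible, the sketch below builds a headword-to-translations lookup. It assumes the common TEI dictionary layout (`entry`, `form/orth`, and `cit type="translation"` with a `quote`) in the standard TEI namespace. The element structure of the actual file is not documented here and may differ, and the sample entry is invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Standard TEI P5 namespace; assumed, not confirmed for this dictionary.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def load_entries(xml_text: str) -> Dict[str, List[str]]:
    """Map each headword to its list of English translations.

    Assumes entries shaped like:
      <entry><form><orth>...</orth></form>
             <sense><cit type="translation"><quote>...</quote></cit></sense>
      </entry>
    """
    root = ET.fromstring(xml_text)
    entries: Dict[str, List[str]] = {}
    for entry in root.iterfind(".//tei:entry", TEI_NS):
        orth = entry.find("tei:form/tei:orth", TEI_NS)
        if orth is None or not orth.text:
            continue  # skip entries without a headword
        glosses = [
            quote.text.strip()
            for quote in entry.iterfind(
                ".//tei:cit[@type='translation']/tei:quote", TEI_NS
            )
            if quote.text
        ]
        entries.setdefault(orth.text.strip(), []).extend(glosses)
    return entries


# Invented sample entry; note that the diacritics survive as ordinary Unicode.
sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>'
    "<entry><form><orth>ħobż</orth></form>"
    '<sense><cit type="translation"><quote>bread</quote></cit></sense>'
    "</entry></body></text></TEI>"
)
entries = load_entries(sample)
```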
Basic Stemmer for Maltese
- Written using Snowball
- Download Snowball source (see here for compilation instructions)
- Download Java version. To run the stemmer, use java -jar MalteseStemmer.jar <word to stem>; for example, java -jar MalteseStemmer.jar darbtejn returns darba
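
To stem many words from a script, the command line given above can be wrapped in a small helper. This is a sketch only: the function names and the default jar path are hypothetical, and it assumes the jar prints the stem to standard output and that `java` is on the PATH.

```python
import subprocess
from typing import List


def stemmer_command(word: str, jar_path: str = "MalteseStemmer.jar") -> List[str]:
    """Build the documented command line: java -jar MalteseStemmer.jar <word>."""
    return ["java", "-jar", jar_path, word]


def stem(word: str, jar_path: str = "MalteseStemmer.jar") -> str:
    """Run the stemmer jar on one word and return its output.

    Assumption: the jar writes the stem to standard output.
    Raises CalledProcessError if the Java process exits with an error.
    """
    result = subprocess.run(
        stemmer_command(word, jar_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With the jar in the working directory, `stem("darbtejn")` runs the same command as the example above. Starting one JVM per word is slow, so this suits small batches rather than whole corpora.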