Term Extraction: Pros and Cons

By Bekir Diri

Translation is like a 100-meter dash: you race against time to reach the finish line. But how can we efficiently manage a 30K-word project on a short deadline? Every text has its own specific terminology, so a term extraction (term mining) tool can make a drastic difference in the delivery time and accuracy of a translation. You may also have to deal with terminology changes while a translation is in progress, and those updates make term extraction vital for accuracy and precision. Currently there are two options for extracting terminology: free or commercial tools and APIs. These tools are machine-aided, but the final decision is always made by a human.

As you might imagine, no translation software is perfect, and term extraction tools are no exception. Let’s take a look at some of them, with their pros and cons.

Pros

Cost-free term extraction tools! No installation!

Online term extraction tools have very simple interfaces, but do not doubt their efficiency. You get your terms with a single click! Moreover, the service is free of charge. Last but not least, nothing needs to be installed on your computer: the tools are web-based (in the cloud), so you can use them on any platform.

Free online tools like TerMine have great capabilities. For example, they can produce lists of compound nouns, term candidates, etc. APIs like Translated.net Labs Terminology Extraction also use colored links to highlight your terms!

Happy customers, long-lived companies

As a Language Service Provider with term extraction capabilities, you can serve your customers more efficiently. For a high-volume project with many customer-specific terms, you can create a comprehensive multilingual glossary before translation starts, have translators use the same terminology during translation, and review and correct the glossary afterwards.

With free term extraction tools, you can provide tailor-made solutions for your customers. Once your translation company has established customer-specific terminology, you can expect even more customers who are looking for consistency and precision.

Cons

“Noise” and “Silence” factors

Some term extraction tools take a statistical approach to the text: they simply look for textual repetitions. The disadvantages of this approach are invalid term candidates, known as “noise”, and valid terms that go undetected, known as “silence”. Translators, editors, or LSPs who use these statistical tools therefore have to clean up their terminology after the extraction process.

So keep “noise” and “silence” down!
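
To make “noise” and “silence” concrete, here is a minimal sketch of a purely repetition-based extractor. It is not how any particular tool works; the function name repeated_ngrams, the sample text, and the threshold of two occurrences are all assumptions chosen for illustration.

```python
import re
from collections import Counter

def repeated_ngrams(text, n=2, min_count=2):
    """Return word n-grams that occur at least min_count times.

    Hypothetical helper: it counts repetitions only and knows nothing
    about grammar or meaning.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    # Slide a window of n words over the token list.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in grams)
    return {gram: count for gram, count in counts.items() if count >= min_count}

# Illustrative sample text (an assumption, not taken from a real project).
sample = (
    "Open the control panel. In the control panel, check the fuel pump. "
    "Close the control panel after the inspection of the relief valve."
)

candidates = repeated_ngrams(sample)
print(candidates)
```

On this sample the extractor proposes both “control panel” and “the control”. The second is noise, because it repeats just as often but is not a term. “Fuel pump” and “relief valve” each occur only once, so they are never proposed: that is silence. A human still has to clean up the list.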

As the Italians say: Concordanza è importante (Concordance is important)!

Term extraction tools that use a linguistic approach look for structures such as “noun + noun” or “adjective + noun”. These tools seem to be designed for languages in the same family because, at their core, they work monolingually: they analyze a text or corpus in one language to identify candidate terms to add as terminology. Therefore, if you try to extract terminology for languages from two different language families, you might mismatch specific words and end up with erroneous terms in your translation.
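
The pattern matching described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the tiny TOY_POS lexicon, the PATTERNS set, and the function pattern_candidates are hypothetical, and a real tool would rely on a trained part-of-speech tagger rather than a hand-written word list.

```python
import re

# Hypothetical toy lexicon mapping words to part-of-speech tags.
TOY_POS = {
    "replace": "VERB", "check": "VERB", "the": "DET", "and": "CONJ",
    "pressure": "NOUN", "valve": "NOUN", "pump": "NOUN", "hydraulic": "ADJ",
    # A few French words, tagged the same way.
    "la": "DET", "pompe": "NOUN", "hydraulique": "ADJ",
}

# The two structures mentioned in the article.
PATTERNS = {("NOUN", "NOUN"), ("ADJ", "NOUN")}

def pattern_candidates(text):
    """Return adjacent word pairs whose tags match one of PATTERNS."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tags = [TOY_POS.get(token, "OTHER") for token in tokens]
    return [
        f"{tokens[i]} {tokens[i + 1]}"
        for i in range(len(tokens) - 1)
        if (tags[i], tags[i + 1]) in PATTERNS
    ]

english = pattern_candidates("Replace the pressure valve and check the hydraulic pump.")
french = pattern_candidates("la pompe hydraulique")
print(english)
print(french)
```

The English sentence yields “pressure valve” and “hydraulic pump”, while the French phrase yields nothing, because French places this adjective after the noun and the “adjective + noun” pattern never matches. Patterns tuned for one language do not automatically carry over to another, which is one way erroneous or missing terms can creep in.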