The Challenge: The disruptive power of the World Wide Web is built on the simple ability of organisations worldwide to independently publish content that uses URL-based hyperlinks to reference and attribute content published by others. Recent web standards apply the same principle directly to the publication of data on the web, using extensible data schemas, URLs for individual data items and open query APIs.
Localisation is a Big Data industry that is poised to be transformed by open data on the web, enabled by these new standards. In localisation, words are our data. Collections of terms and translations are commercially traded along value chains as part of localisation projects. The reuse of terms and translations across projects is now being amplified by the uptake of data-driven language technologies for machine translation and automated term extraction.
Goals: The FALCON project combines the power of open data on the web with data-driven language technologies to construct the Localization Web. This consists of a network of terms and translations interlinked with each other and with source and target documents via URLs. FALCON will integrate the resulting web of linked localisation and language data into localisation tool chains using existing data query and access control standards. Metadata from these tools will add value to these data assets, enabling seamless quality monitoring across the value chain and on-demand reuse of the assets in training machine translation and text analytics engines.
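
To make the idea of terms, translations and documents linked by URLs more concrete, the following is a minimal sketch in Python. It is not FALCON's implementation: the `example.org` URLs, the predicate names (`translationOf`, `occursIn`, `label`), the `LinkedStore` class and the `translations_with_documents` helper are all hypothetical, chosen only for illustration. Each fact is held as a (subject, predicate, object) triple, so that every term and document is identified by its own URL and can be followed from one to the other.

```python
# Hypothetical base URL and predicate names; none of these come from FALCON.
BASE = "http://example.org/l10n/"
TRANSLATION_OF = BASE + "vocab/translationOf"
OCCURS_IN = BASE + "vocab/occursIn"
LABEL = BASE + "vocab/label"


class LinkedStore:
    """A tiny in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._by_subject = {}
        self._by_object = {}

    def add(self, subject, predicate, obj):
        # Index each triple in both directions so links can be followed either way.
        self._by_subject.setdefault(subject, []).append((predicate, obj))
        self._by_object.setdefault(obj, []).append((predicate, subject))

    def objects(self, subject, predicate):
        """Everything that `subject` points to via `predicate`."""
        return [o for p, o in self._by_subject.get(subject, []) if p == predicate]

    def subjects(self, predicate, obj):
        """Everything that points to `obj` via `predicate`."""
        return [s for p, s in self._by_object.get(obj, []) if p == predicate]


def translations_with_documents(store, source_term):
    """For a source term URL, list each translation with its labels and documents."""
    results = []
    for target in store.subjects(TRANSLATION_OF, source_term):
        labels = store.objects(target, LABEL)
        docs = store.objects(target, OCCURS_IN)
        results.append((target, labels, docs))
    return results


# Illustrative data: one English term, its German translation, and the
# source and target documents in which they occur.
store = LinkedStore()
term_en = BASE + "term/en/firewall"
term_de = BASE + "term/de/firewall"
doc_en = BASE + "doc/manual-en"
doc_de = BASE + "doc/manual-de"

store.add(term_en, LABEL, "firewall")
store.add(term_de, LABEL, "Firewall")
store.add(term_de, TRANSLATION_OF, term_en)
store.add(term_en, OCCURS_IN, doc_en)
store.add(term_de, OCCURS_IN, doc_de)

for target, labels, docs in translations_with_documents(store, term_en):
    print(target, labels, docs)
```

In a real deployment the in-memory store would presumably be replaced by published data reachable through a standard query interface, with access control applied at that layer. The sketch only shows the shape of the links.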