Linked data (LD) has been identified as a key technology for realizing the vision of a truly multilingual digital single market in Europe. The technology is now mature and has been increasingly adopted by industry and public institutions worldwide (e.g., national libraries, museums, media companies, and public administrations). In this context, linguistic linked data (LLD) has emerged as a trend within the LD field for sharing and interlinking linguistically relevant data sources. The benefits of sharing linguistic data on the Web in a semantically interoperable manner have been recognized by the language technologies community, which has shown increasing interest in publishing linguistic data and metadata as LLD on the Web. However, using such technologies comes at a price (in terms of learning curve, need for technical support, etc.) that has prevented their adoption and use on a larger scale.
This talk briefly reviews the current status of the LLD field and analyses some of its current challenges (related to sustainability issues, entry barriers to the technology, etc.). A roadmap is then proposed to address these challenges in order to attain an ecosystem of truly interoperable linguistic data on the Web, multilingual in nature, across different linguistic levels. The potential role of LLD as a complementary technique to current large language models (LLMs) will also be briefly discussed. This roadmap is one of the outcomes of NexusLinguarum, the "European network for Web-centred linguistic data science" COST Action.