Enterprise Ireland/Commercially Sponsored Internship in Semantic Uplift
A critical source of real-world business intelligence is the product data sheets published on the web in HTML or PDF format. The sponsor company needs new techniques to harvest these specifications (as semi-structured data) and convert them into a structured knowledge model that can, if necessary, be republished on the web, either as directed advertising or for consumption by linked data mash-up applications.
The work will involve scripting or programming tools to aid the conversion of HTML or PDF to RDF, creating a domain knowledge model in RDF, and building an example website that fuses the harvested data for browsing and presents reports via SPARQL queries.
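As a rough illustration of the HTML-to-RDF step, the sketch below reads a two-column specification table and emits one N-Triples statement per row. It is only a minimal example under stated assumptions, not the sponsor's method: the sample table, the `example.org` IRIs, and the names `SpecTableParser` and `to_ntriples` are all hypothetical, and it uses only the Python standard library.

```python
from html.parser import HTMLParser


class SpecTableParser(HTMLParser):
    """Collects the text of each table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so the literal stays on one line.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def to_ntriples(html, subject_iri, vocab_iri):
    """Turn (property, value) table rows into N-Triples lines.

    Assumes property names contain only letters, digits and spaces;
    real data sheets would need more careful IRI and literal handling.
    """
    parser = SpecTableParser()
    parser.feed(html)
    parser.close()
    triples = []
    for row in parser.rows:
        if len(row) != 2:
            continue  # skip rows that are not simple property/value pairs
        name, value = row
        predicate = vocab_iri + name.lower().replace(" ", "_")
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'<{subject_iri}> <{predicate}> "{literal}" .')
    return triples


# Hypothetical data-sheet fragment, invented for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Input Voltage</th><td>12 V</td></tr>
  <tr><th>Weight</th><td>350 g</td></tr>
</table>
"""

for line in to_ntriples(SAMPLE_HTML,
                        "http://example.org/product/1",
                        "http://example.org/spec#"):
    print(line)
```

Triples in this form could then be loaded into any RDF store and queried with SPARQL; a dedicated RDF library would be the more likely choice once the domain model grows beyond plain string literals.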
A self-starting attitude, strong programming skills and familiarity with web technology are required; RDF or linked data experience is a bonus.