Making Your Data Lake More Smart and Usable – or “Is Your Data Lake Full of Italian Dressing?”
Dan Carey
Ontologist
Semantic Arts
 


 

Tuesday, September 20, 2016
03:00 PM - 03:45 PM

Level:  Intermediate


Data Lakes attempt to avoid the high costs and brittle results of building Data Warehouses by delaying the work of making the data interoperable. But their lack of structure makes it harder both to find the correct data and to analyze it correctly. We describe how Semantic Technology standards and tools support simple (but not simplistic) relationships that document and reflect the meaning of the data, making the Lake smarter. At the same time, these standards and tools support accessing and querying the various data sources in the Lake as a coherent, interoperable whole. Together, these capabilities improve the turnaround time and cost of analytics product development. Participants will learn:
  • The basic data structure (RDF triples) behind all Semantic Technology
  • How triples may be derived from existing structured & unstructured data
  • How triples can be self-assembled into rich, complex graphs (and what “graphs” means)
  • How semantic schemas (ontologies) add still more “smartness” to the data
  • How tools using the R2RML standard transform a Data Lake’s parts into a whole (without necessarily changing the original datastore)
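
As a rough illustration of the first three points, the sketch below shows how rows from two separate sources might be expressed as subject-predicate-object triples that join into one graph simply by sharing identifiers. It is a minimal sketch in plain Python, not the tooling presented in the session: the sample rows, the `http://example.org/` namespace, and every function name are hypothetical, and a real deployment would more likely use an RDF library and an R2RML mapping.

```python
# Hypothetical rows from two separate sources in a lake (illustrative data only).
crm_rows = [{"id": "c1", "name": "Acme Corp"}]
order_rows = [{"order_id": "o9", "customer": "c1", "total": 250}]

EX = "http://example.org/"  # assumed namespace, not taken from the session


def crm_triples(rows):
    """Yield one (subject, predicate, object) triple per customer row."""
    for r in rows:
        yield (EX + "customer/" + r["id"], EX + "name", r["name"])


def order_triples(rows):
    """Yield triples for each order, linking it to its customer by identifier."""
    for r in rows:
        subject = EX + "order/" + r["order_id"]
        yield (subject, EX + "placedBy", EX + "customer/" + r["customer"])
        yield (subject, EX + "total", r["total"])


# The "graph" is just the union of all triples; shared identifiers connect them.
graph = set(crm_triples(crm_rows)) | set(order_triples(order_rows))


def objects(graph, subject, predicate):
    """Return every object attached to a subject by a given predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def customer_name_for_order(graph, order_id):
    """Follow order -> placedBy -> customer -> name across both sources."""
    names = []
    for customer in objects(graph, EX + "order/" + order_id, EX + "placedBy"):
        names.extend(objects(graph, customer, EX + "name"))
    return names


print(customer_name_for_order(graph, "o9"))
```

Neither source was altered, and no join logic was written in advance; the link between the order and the customer name comes from both sets of triples using the same customer identifier.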


  • Dan Carey is an ontologist and data architect with 30 years of consulting experience, 25 of it designing databases, data models, and data strategies with major IT service firms. His current work involves using semantic technology to make clients' systems interoperable in a data-centric manner. Previously, he primarily supported government clients at the federal, state, and local levels, and designed semantic technology products and data exchange standards to assist in military human resources management.


       