
Proceedings Paper

Human-machine interaction to disambiguate entities in unstructured text and structured datasets
Author(s): Kevin Ward; Jack Davenport

Paper Abstract

Creating entity network graphs is a manual, time-consuming process for an intelligence analyst. Beyond the traditional big data problem of information overload, individuals are often referred to by multiple names and by titles that shift as they advance in their organizations, which quickly makes simple string or phonetic alignment methods insufficient for matching entities. Conversely, automated methods for relationship extraction and entity disambiguation typically produce questionable results and give users no way to vet results, correct mistakes, or influence the algorithm’s future results. We present an entity disambiguation tool, DRADIS, which aims to bridge the gap between human-centric and machine-centric methods. DRADIS automatically extracts entities from multi-source datasets and models them as a complex set of attributes and relationships. Entities are disambiguated across the corpus using a hierarchical model executed in Spark, allowing it to scale to operationally sized data. Resolution results are presented to the analyst complete with sourcing information for each mention and relationship, allowing analysts to quickly vet the correctness of results and correct mistakes. Corrected results are used by the system to refine the underlying model, allowing analysts to optimize the general model to better deal with their operational data. Giving analysts the ability to validate and correct the model, and so produce a system they can trust, enables them to focus their time on producing higher-quality analysis products.
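
The workflow described in the abstract can be illustrated with a toy sketch. This is not the DRADIS implementation: it does not use Spark or a hierarchical model, and it applies analyst corrections as simple pairwise constraints rather than refining a learned model. The `Mention` class, the `similarity` weighting, the 0.6 threshold, the source file names, and the example people are all hypothetical, chosen only to show why name matching alone can mislead and how attributes and analyst feedback can change a resolution.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Mention:
    """One extracted mention, kept with its source so an analyst can vet it."""
    mention_id: str
    name: str
    source: str  # hypothetical document identifier
    attributes: frozenset = field(default_factory=frozenset)


def name_similarity(a, b):
    """Plain string similarity between two mention names (0.0 to 1.0)."""
    return SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()


def similarity(a, b):
    """Assumed scoring: equal weight on name similarity and attribute overlap."""
    union = a.attributes | b.attributes
    attr_score = len(a.attributes & b.attributes) / len(union) if union else 0.0
    return 0.5 * name_similarity(a, b) + 0.5 * attr_score


def resolve(mentions, threshold=0.6, must_link=(), cannot_link=()):
    """Group mentions into entities with union-find.

    must_link / cannot_link stand in for analyst corrections. They are
    checked pairwise only, so a blocked pair can still end up together
    through a third mention; a real system would need to handle that.
    """
    parent = {m.mention_id: m.mention_id for m in mentions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    forced = {frozenset(pair) for pair in must_link}
    blocked = {frozenset(pair) for pair in cannot_link}

    for i, a in enumerate(mentions):
        for b in mentions[i + 1:]:
            pair = frozenset((a.mention_id, b.mention_id))
            if pair in blocked:
                continue
            if pair in forced or similarity(a, b) >= threshold:
                parent[find(a.mention_id)] = find(b.mention_id)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m.mention_id), set()).add(m.mention_id)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical mentions: one officer whose title changes, plus an unrelated namesake.
m1 = Mention("m1", "Col. John Smith", "report_017.txt",
             frozenset({"org:3rd Brigade", "city:Kabul"}))
m2 = Mention("m2", "Gen. J. Smith", "report_142.txt",
             frozenset({"org:3rd Brigade", "city:Kabul"}))
m3 = Mention("m3", "John Smith", "ledger.csv",
             frozenset({"org:Acme Bank", "city:London"}))
mentions = [m1, m2, m3]

automatic = resolve(mentions)
after_correction = resolve(mentions, cannot_link=[("m1", "m2")])
```

In this sketch, string similarity alone rates the namesake `m3` as a closer match to `m1` than the retitled `m2`, while the combined score groups `m1` with `m2`. Passing `cannot_link` or `must_link` pairs mimics an analyst overriding a result after checking each mention's `source`.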

Paper Details

Date Published: 3 May 2017
PDF: 7 pages
Proc. SPIE 10207, Next-Generation Analyst V, 102070I (3 May 2017); doi: 10.1117/12.2265825
Author Affiliations
Kevin Ward, Decisive Analytics Corp. (United States)
Jack Davenport, Decisive Analytics Corp. (United States)

Published in SPIE Proceedings Vol. 10207:
Next-Generation Analyst V
Timothy P. Hanratty; James Llinas, Editor(s)

© SPIE.