Press release: Dstl adds to open source software

Discussion in 'MoD News' started by MoD_RSS, Sep 28, 2015.

  1. The entity extraction framework – known as Baleen – can automatically extract information from unstructured and semi-structured text. It tries to identify and extract entities from the text, such as people, locations, organisations and dates.

    Baleen has been under development for a number of years; Dstl is now seeking community contributions that can feed back into the software, improving the quality of the code and extending its capability.

    Similar projects are already in the public domain, but Baleen provides an end-to-end entity extraction capability based on Apache UIMA (Unstructured Information Management Architecture), which is becoming a widely used and accepted framework.

    Dstl’s James Baker says he hopes the text analytics developer community will help develop the software further:


    We are releasing the core framework and a number of components of Baleen onto Github.com for the community to use, adapt and improve. We hope suppliers, members of academia and individuals will help take this further and develop capabilities which we have not yet uncovered, as well as find a use for it in their own work.

    Dstl’s code can be found on the GitHub site. For further information on the technical side, email: [email protected].

    Dstl Media Enquiries


    Email [email protected]

    Media enquiries 01980 658666

    Out of hours 07901 892660
