<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>IntuView Ltd.</title> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /> <link href="css/style.css" rel="stylesheet" type="text/css" /> <link href="css/dropdown/dropdown.css" media="all" rel="stylesheet" type="text/css" /> <link href="css/dropdown/dropdown.vertical.css" media="all" rel="stylesheet" type="text/css" /> <link href="css/dropdown/themes/default/default.ultimate.css" media="all" rel="stylesheet" type="text/css" /> </head> <body> <div id="wrapper"> <div id="wrapperi"> <div id="wrapperj"> <h1 id="header"><a href="default.aspx"><img src="images/int.jpg" alt="IntuView Ltd." /></a></h1> <div id="left1"> <br /><br /> <ul id="nav" class="dropdown dropdown-vertical"> <li><span class="dir">About Us</span> <ul> <li><a href="profile.html"><b>Company Overview</b></a></li> <li><a href="team.html"><b>Management Team</b></a></li> <li><a href="partners.html"><b>Our Partners</b></a></li> </ul> </li> <li><span class="dir">Products &amp; Solutions</span> <ul> <li><a href="products.html"><b>Overview</b></a></li> <li><a href="intuscan-platform.html"><b>IntuScan&trade; Platform</b></a></li> <li><a href="ivstation.html"><b>IntuScan&trade; Station</b></a></li> <li><a href="bloginspector.html" class="blue"><b>IntuScan&trade; Blog Inspector</b></a></li> <li><a href="crawler.html" class="blue"><b>IntuScan&trade; Smart Crawler</b></a></li> <li><a href="intuscan-Name Matcher.html"><b>IntuScan&trade; Entity Matcher</b></a></li> <li><a href="IED.html"><b>IntuScan&trade;
IED Recipe Identifier</b></a></li> <li><a href="restricted/content.aspx"><b>Knowledge Base Packages</b></a></li> </ul> </li> <li><span class="dir">Services</span> <ul> <li><a href="restricted/services.aspx"><b>Customizations</b></a></li> <li><a href="restricted/training.aspx"><b>Training</b></a></li> <li><a href="download.html"><b>Download Center</b></a></li> </ul> </li> <li><a href="news.html"><b>News</b></a></li> <li><a href="careers.html"><b>Careers</b></a></li> <li><a href="contact.html"><b>Contact Us</b></a></li> </ul> <div class="clear"></div> <div class="clear"></div> <br /><br /> <font color="#006400" size="2"><b>Related Information</b></font> <div id="extensions"> <br /> <div id="bullets"> <ul> <li><a href="restricted/technology.aspx"> The IntuScan&trade; Technology</a></li> <li><a href="eval.aspx"> Request Evaluation</a></li> </ul> </div> <br /> <b>Technical Information</b> <table id="sidetable" > <tr> <td > <table> <tr> <td><b>Supported Formats:</b></td> </tr> <tr> <td>textual pdf, htm, html, mht, doc, txt, rtf, xls, pst</td> </tr> <tr> <td><b>Supported Languages:</b></td> </tr> <tr> <td>Arabic, Malay/Indonesian. <b>Planned to be supported soon - </b> Farsi, Urdu, Pashtu</td> </tr> <tr> <td><b>Environment:</b></td> </tr> <tr> <td><u>Standalone version:</u> Windows XP, minimum 4 GB RAM, no external communication required.</td> </tr> <tr> <td><u>Server version:</u> Windows Server 2003, minimum 8 GB RAM, multiple cores (as many as needed), no external communication required.</td> </tr> <tr> <td>API is available for Java, C++ and .NET</td> </tr> </table> </td> </tr> </table> </div> <br /> </div> <div id="right1"> <div id="topnavigation"> <a href="default.aspx">Home </a> &gt; IntuScan&trade; Platform </div> <div id="maincontenttitle"> IntuScan&trade; Entity Matcher </div> <div id="maincontent"> IntuView has developed unique technology for Named Entity Recognition and Entity Matching.
This technology extracts the entities mentioned in a document and classifies each one by type: <br /><b>people, places, institutions or organizations, URLs, dates (Gregorian, Jewish, Farsi, Hijri), etc.</b> <br /> <br /><br /> <b>IntuScan&trade; Entity Matcher </b> identifies, disambiguates and matches names of persons, places, institutions, etc. from data sources, and links them to contextual information for further identification. Using algorithms based on knowledge of the patterns and conventions of the pertinent source languages and cultures, it recognizes possible name variants, matches names from different inputs and sources, and extracts information from the contexts in which names appear, creating a virtual "identity card" of the person behind that name: ethnic origin, gender, family/tribal links, etc. The information gleaned from different occurrences of the entity, even under variant names, is aggregated into a comprehensive database entry for the entity, including possible affiliations, name variants, ethnic origin, gender, family/tribal links, etc. This "ad-hoc" entity is added to the database for further reference. <br /><br />For example, suppose a person is mentioned in a set of documents in the following forms: <br /> <div id="box"> <div id="bullets-narrow"> <ul> <li> Mujahid Sheikh Ahmad Yousuf Muhammad</li> <li> Sheikh Abu Yousuf, Leader of the Islamic Army </li> <li> Brother Mujahid Abu Yousuf Ahmad, MAY ALLAH PRESERVE HIM <br /> <font size="1">(document date 1.2.2009)</font> </li> <li> Our Leader, Sheikh Ahmad Yousuf Muhammad, MAY ALLAH HAVE COMPASSION ON HIM <br /><font size="1">(document date 1.4.2009, document identified as belonging to the Islamic Army)</font> </li> </ul> </div> All of these mentions create the single entity <b>Ahmad Yousuf Muhammad</b>, described in the texts as Mujahid, Sheikh, Brother (in reference to the Islamic Army), Leader of the Islamic Army. His ideological orientation is jihadi-salafi and he is associated with the Islamic Army.
Judging by the honorifics used in the dated documents, he appears to have been alive on 1.2.2009 but deceased before 1.4.2009. <br /><br /><br /> </div> <b>Automatic Transliteration</b><br /> Automatic transliteration of named entities in Arabic and other relevant languages is another challenging task. The fact that Arabic names are written without diacritics makes this task even more difficult. We use a sophisticated rule-based algorithm to deal with the various complicated cases. IntuScan&trade; supports standard transliteration systems:<br /> <b>IC, BGN, SATTS</b>. <br /><br /><br />The IntuScan&trade; Entity Matcher includes the following modules: <div id="bullets"> <ul> <li><b>Recognition of named entities </b> of persons in unstructured text - Named Entity Recognizer (NER).</li> <li><b>Culturally and linguistically sensitive analysis </b>of the names and identification of their components by the Named Entity Analyzer (NEA). This component validates the name through information implicit in the name and through identification of anomalies between name parts.</li> <li><b>Validation of transliterated names</b> by restoring them to the source language - Named Entity Transliterator (NET).</li> <li><b>Aggregation of all the name variants </b>and aliases referring to an entity - Named Entity Combiner (NEC).</li> <li><b>Parsing of the name </b> and analysis of its components (by the NEA - Named Entity Analyzer) to identify all possible name variants and to extract information implicit in the names (gender, religion, ethnicity, etc.). </li> <li>Identification of <b>relations </b> between identified entities (siblings, father-son, grandfather, etc.) - Named Entity Matcher (NEM).</li> </ul> </div> <br />The <b> IntuScan&trade; Entity Matcher</b> can also be applied to <b>databases of names</b> such as <b>watch lists </b>in order to validate the content of the databases, merge duplicates and point out potentially invalid entries which impair the integrity of the database and render it cumbersome and ineffective. <br /><br /><br /> The <b> IntuScan&trade;
Entity Matcher</b> operates on such lists as follows: <div id="bullets"> <ul> <li><b>Re-transliteration of names</b> written in Latin script into the assumed source language (in this case, Arabic script in a canonical transliterated form).</li> <li><b>Extraction</b> of all names from the unstructured Arabic text.</li> <li><b>Parsing of each name</b> (in both the watch list and the unstructured corpus) to identify its constituent parts.</li> <li><b>Generation of variants</b> of the parsed name.</li> <li><b>Validation of the names</b> according to algorithms based on the source language, by examining the names for possible anomalies, inconsistencies, incompleteness or other errors, and output of all names found to be invalid. </li> <li><b>Creation of a new database</b> entry for the named entity, with all the generated and previous data on that named entity.</li> <li><b>Extraction of information</b> implicit in the names (gender, ethnicity, name variants).</li> <li><b>Reduction of the list of names</b> by matching duplicates which refer to the same person entity.</li> <li>Identification of <b>possible relatives</b>, tribal compatriots, etc. </li> <li><b>Output of the data </b> including: name of individual; source language form; all information associated with the entities; list of invalid names and the reason for invalidation; family links between the names.</li> </ul> </div> <div class="clear"></div> </div> </div> <div class="clear"></div> <div id="footer"><div id="footeri"> <span class="copyright">© 2009 IntuView Ltd.</span> <a href="default.aspx">Home</a> &nbsp; <a href="restricted/services.aspx">Services</a> &nbsp; <a href="products.html">Products</a> &nbsp; <a href="contact.html">Contact</a> &nbsp; </div></div> </div> </div> </div> <script type="text/javascript"> var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl."
: "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); </script> <script type="text/javascript"> try { var pageTracker = _gat._getTracker("UA-9565407-1"); pageTracker._trackPageview(); } catch(err) {} </script> </body> </html>