Difference between revisions of "Similarity search mod"

From TNG_Wiki
(22 intermediate revisions by 3 users not shown)
Line 3: Line 3:
 
| mod_name        = Similarity search Mod
 
| mod_name        = Similarity search Mod
 
| mod_summary    = Adds similarity search on person/place names
 
| mod_summary    = Adds similarity search on person/place names
 +
| mod_validation  =
 +
| mod_last_update = 28 Sep 2016
 
| download_link  =
 
| download_link  =
[[Media:similarity_search_mod_v10.0.0.0.zip|similarity_search_mod_v10.0.0.0.zip]] for {{Tv100}}<br /> [[Media:similarity_search_mod_v9.2.2.2.zip|similarity_search_mod_v9.2.2.2.zip]] for {{Tv92}}<br />  
+
[[Media:similarity-search-v11.0.2.0.zip|similarity-search-v11.0.2.0.zip]] for {{Tv120}}<br />
 +
[[Media:similarity-search-v11.0.2.0.zip|similarity-search-v11.0.2.0.zip]] for {{Tv1102}}<br />
 +
[[Media:similarity_search_mod_v11.0.0.0.zip|similarity_search_mod_v11.0.0.0.zip]] for {{Tv110}}<br />
 +
[[Media:similarity_search_mod_v10.1.2.1.zip|similarity_search_mod_v10.1.2.1.zip]] for {{Tv1012}}<br />  [[Media:similarity_search_mod_v10.1.2.0.zip|similarity_search_mod_v10.1.2.0.zip]] for {{Tv1012}}<br /> [[Media:similarity_search_mod_v10.1.0.0.zip|similarity_search_mod_v10.1.0.0.zip]] for {{Tv1010}}<br /> [[Media:similarity_search_mod_v10.0.0.2.zip|similarity_search_mod_v10.0.0.2.zip]] for {{Tv100}}<br /> [[Media:similarity_search_mod_v9.2.2.2.zip|similarity_search_mod_v9.2.2.2.zip]] for {{Tv92}}<br />  
 +
| download_stats =
 
| mod_author      = Rovian Veronez and Carlos A Heuser  
 
| mod_author      = Rovian Veronez and Carlos A Heuser  
 
| mod_url        = [[Similarity_search_mod|Similarity_search_Mod]] (This page)
 
| mod_url        = [[Similarity_search_mod|Similarity_search_Mod]] (This page)
 
| mod_support    = [[User:carheu|Carlos Heuser]]
 
| mod_support    = [[User:carheu|Carlos Heuser]]
 
| mod_contact    = [[User:carheu|Carlos Heuser]]
 
| mod_contact    = [[User:carheu|Carlos Heuser]]
| mod_version    = 9.2.2.2 for TNG V9.2.2<br />10.0.0.0 for TNG v10.0.0
+
| mod_version    = 9.2.2.2 for TNG V9.2.2<br />10.0.0.2 for TNG v10.0.0<br />10.1.2.1 for TNG v10.1.2<br />11.0.0.0 for TNG v11.0.0<br />11.0.2.0 for TNG v11.0.2<br />11.0.2.0 for TNG v12.0
 
| min_TNG_ver    = 9.2.2
 
| min_TNG_ver    = 9.2.2
| max_TNG_ver    = 10.0.0
+
| max_TNG_ver    = 12.0
 
| TNG_file_list  = genlib.php<br />js/selectutils.js<br />finditems.php<br />admin_setup.php<br />
 
| TNG_file_list  = genlib.php<br />js/selectutils.js<br />finditems.php<br />admin_setup.php<br />
 
languages/English/cust_text.php<br />languages/English-UTF8/cust_text.php<br />languages/PortugueseBR/cust_text.php<br />languages/PortugueseBR_UTF8/cust_text.php<br />languages/German/cust_text.php<br />languages/German_UTF8/cust_text.php<br />
 
languages/English/cust_text.php<br />languages/English-UTF8/cust_text.php<br />languages/PortugueseBR/cust_text.php<br />languages/PortugueseBR_UTF8/cust_text.php<br />languages/German/cust_text.php<br />languages/German_UTF8/cust_text.php<br />
 
languages/English/admintext.php<br />languages/English-UTF8/admintext.php<br />languages/PortugueseBR/admintext.php<br />languages/PortugueseBR_UTF8/admintext.php<br />languages/German/admintext.php<br />languages/German_UTF8/admintext.php<br />
 
languages/English/admintext.php<br />languages/English-UTF8/admintext.php<br />languages/PortugueseBR/admintext.php<br />languages/PortugueseBR_UTF8/admintext.php<br />languages/German/admintext.php<br />languages/German_UTF8/admintext.php<br />
| related_mods    = ...
+
| related_mods    =  
| notes          = ...
+
| notes          =  
 
}}
 
}}
 
{| style="margin-right:0.5 em;" align="right"  
 
{| style="margin-right:0.5 em;" align="right"  
 
| __TOC__
 
| __TOC__
 
|}
 
|}
 +
 +
== Warning - TNG database is modified ==
 +
 +
<span style="color: red">This Mod changes the TNG database</span>. Triggers are added to existing tables and new tables are created. Read the installation and removal instructions carefully before installing the Mod.
 +
As usual in cases like this, making a backup copy of your database previously to the Mod installation is highly recommended.
 +
If you don't feel  comfortable with this, please don't use this Mod.
 +
  
  
Line 46: Line 59:
 
== Using the Mod ==
 
== Using the Mod ==
  
To access the search form provided by this Mod you can:
+
The Mod creates a new menu entry at Find >> Word Search. By selecting this entry the user is taken to
* either select the "Search" link that appears at the top of most TNG pages,
+
the search form that is depicted below.
* or select Find >> Word Search (that is, go to the "Find" menu and select the "Word search" option -- this is the last entry in the menu and was inserted by the Mod).  
 
 
 
A search form containing the following options will be displayed:
 
  
* Search for people or for places
+
[[File:Sim-search-screen.jpg]]
* Search for one or two words that appear in the name (no distinction is made between name and surname)
 
* Search in a specific Gedcom or in the whole database
 
* Specify the similarity precision (Exact, Very precise, Precise or Inaccurate)
 
  
Results are displayed on-the-fly as the user enters some letters.
+
To perform a search enter one or two words of the name you are searching for in the field. Results are displayed on-the-fly as the user enters some letters.
 +
The result is presented in
 +
decreasing order of similarity.
  
If there are person names in events, they  may be also searched for. For example, I use one event
+
There are several options to control the search:
for a woman's married name and another for a person's alternative names.
 
  
The result is  
+
* There is a radio button to select between a person search and a place search.
presented in
+
* The Gedcom to be searched may be selected.
decreasing order of similarity. The page below shows an example of the result.
+
* There is a radio button that allows to specify the similarity precision (Exact, Very precise, Precise or Inaccurate).
  
[[File:search_form.jpg]]
+
By clicking on a person/place name the user is taken to the corresponding page.
  
 
== How it works ==
 
== How it works ==
Line 81: Line 89:
 
Everything happens in SQL meaning that response time is very short.  This approach has been tested in other applications (not TNG)
 
Everything happens in SQL meaning that response time is very short.  This approach has been tested in other applications (not TNG)
 
and gives good results even with very large databases (tens of millions of words).  
 
and gives good results even with very large databases (tens of millions of words).  
 
== Warning - TNG database is modified ==
 
 
<span style="color: red">This Mod changes the TNG database</span>. Triggers are added to existing tables and new tables are created.
 
As usual in cases like this, making a backup copy of your database previously to the Mod installation is highly recommended.
 
If you don't feel  comfortable with this, please don't use this Mod.
 
  
 
== Mod installation and database setup ==
 
== Mod installation and database setup ==
Line 96: Line 98:
 
* Download the zip file containing the Mod files and unzip it in the TNG mods folder.
 
* Download the zip file containing the Mod files and unzip it in the TNG mods folder.
 
* Install the Mod with the Manager.
 
* Install the Mod with the Manager.
 +
* Clear the browser cache (required, <span style="color: red">do not forget!</span>)
  
 
<span style="color: red">Note:</span> After the Mod is installed the database must be prepared for use with the Mod. If you are reinstalling the Mod, you do not need to prepare the database again.
 
<span style="color: red">Note:</span> After the Mod is installed the database must be prepared for use with the Mod. If you are reinstalling the Mod, you do not need to prepare the database again.
Line 116: Line 119:
 
Hit the Prepare button and the database will be modified for use by the Mod.
 
Hit the Prepare button and the database will be modified for use by the Mod.
  
After some time (depending on the size of your database), the page below will be displayed.  
+
After some time the page below will be displayed.  Notice that if the database is large this process may take several minutes.
 +
I have a database with about 12,000 individuals and 1,500 places. For this database the process takes between one and five minutes
 +
depending on the server being used.
  
  
Line 130: Line 135:
 
== Upgrade from previous versions ==
 
== Upgrade from previous versions ==
  
If you are upgrading from a previous version, <span style="color: red">do not prepare your database again</span>. There where no changes in the structure of tables since the first release of the mod. Simply remove the previous version using Mod manager and install the new one.
+
If you are upgrading from version 10.0.0.2 to a more current version, the database remains the same.
 +
 
 +
However, if you are upgrading from a version previous to 10.0.0.2 <span style="color: red">you will have to prepare your database again</span>. Follow this procedure:
 +
* Go to Setup in the TNG Administration page (Info >> Administration >> Setup).
 +
* At the Setup page, you will find the tab called Similarity Mod.
 +
* Click on this Tab. Click on the "Delete" button and confirm that you want to delete all tables.
 +
* After some time, a page will be displayed confirming that the tables have been deleted.
 +
 
 +
You may perform these steps using either an old Mod version or a more recent (10.0.0.2 or higher) version.
 +
 
 +
After deleting the tables, prepare the database again as described in the previous section.
  
 
== Mod removal ==
 
== Mod removal ==
Line 141: Line 156:
 
After that you may uninstall the Mod as usual by using the Mod Manager.
 
After that you may uninstall the Mod as usual by using the Mod Manager.
  
 +
<span style="color: red">Be aware that if you don't follow the above steps</span>, triggers will be left in your database which will cause errors such as "Table 'tng.qgrams_grams' doesn't exist" should you subsequently delete the tables by other means. These can be hard to track down.
  
 
== Translations ==
 
== Translations ==
  
The Mod already contains translations to German and Brazilian Portuguese.
+
The Mod already contains translations to French, German and Brazilian Portuguese.
  
 
== Developer ==
 
== Developer ==
  
Mod developer is Rovian Veronez with changes by [[User:carheu|Carlos Heuser]].   
+
Mod developer is Rovian Veronez with changes by [[User:carheu|Carlos Heuser]].  French translation provided by André Morel
 
 
  
 
== Files changed ==
 
== Files changed ==
Line 188: Line 203:
 
! Date
 
! Date
 
! Contents
 
! Contents
 +
|-
 +
| v11.0.2.0
 +
| 28 Sep 2016
 +
|
 +
* Updated for TNG v11.0.2
 +
|-
 +
| v11.0.0.0
 +
| 17 Mar 2016
 +
|
 +
* Updated for TNG v11.0
 +
|-
 +
| v10.1.2.1
 +
| 13 Mar 2016
 +
|
 +
* French translation included (thanks to André Morel)
 +
|-
 +
| v10.1.2.0
 +
| 22 Jul 2015
 +
|
 +
* Tested for TNG v10.1.2
 +
* Minor error correction in css file: Mod was disabling the menu icon for sources
 +
|-
 +
| v10.1.0.0
 +
| 26 Jan 2015
 +
|
 +
* Updated for TNG V10.1
 +
|-
 +
| v10.0.0.2
 +
| 18 May 2014
 +
|
 +
* Previous versions allowed querying for max two words. In this version as many words as required may be entered as a query.
 +
* The name of the tree is displayed in the result.
 
|-
 
|-
 
| v10.0.0.1
 
| v10.0.0.1
| 11 Feb 2014
+
| 11 Mar 2014
 
|  
 
|  
 
* Database preparation works faster and uses less server memory.
 
* Database preparation works faster and uses less server memory.
Line 228: Line 275:
 
| [[User:carheu|Carlos Heuser]]
 
| [[User:carheu|Carlos Heuser]]
 
| Developers site
 
| Developers site
| v9.2.2.0
+
| 11.0.2.0 for TNG v12.0
 
| English, German, Brazilian Portuguese
 
| English, German, Brazilian Portuguese
 +
|-
 
|}
 
|}
  
[[Category:Mods for TNG v9]][[Category:Mods for TNG v10]]
+
[[Category:Mods for TNG v9]][[Category:Mods for TNG v10]][[Category:Mods for TNG v11]][[Category:Mods for TNG v12]]

Revision as of 16:26, 5 May 2018

Similarity search Mod
Summary Adds similarity search on person/place names
Validation
Mod Updated 28 Sep 2016
Download link similarity-search-v11.0.2.0.zip for TNG 12.0
similarity-search-v11.0.2.0.zip for TNG 11.0.2
similarity_search_mod_v11.0.0.0.zip for TNG 11.0
similarity_search_mod_v10.1.2.1.zip for TNG 10.1.2
similarity_search_mod_v10.1.2.0.zip for TNG 10.1.2
similarity_search_mod_v10.1.0.0.zip for TNG 10.1.0
similarity_search_mod_v10.0.0.2.zip for TNG 10.0
similarity_search_mod_v9.2.2.2.zip for TNG 9.2

Download stats
Author(s) Rovian Veronez and Carlos A Heuser
Homepage Similarity_search_Mod (This page)
Mod Support Carlos Heuser
Contact Developer Carlos Heuser
Latest Mod 9.2.2.2 for TNG V9.2.2
10.0.0.2 for TNG v10.0.0
10.1.2.1 for TNG v10.1.2
11.0.0.0 for TNG v11.0.0
11.0.2.0 for TNG v11.0.2
11.0.2.0 for TNG v12.0
Min TNG V 9.2.2
Max TNG V 12.0
Files modified
genlib.php
js/selectutils.js
finditems.php
admin_setup.php

languages/English/cust_text.php
languages/English-UTF8/cust_text.php
languages/PortugueseBR/cust_text.php
languages/PortugueseBR_UTF8/cust_text.php
languages/German/cust_text.php
languages/German_UTF8/cust_text.php

languages/English/admintext.php
languages/English-UTF8/admintext.php
languages/PortugueseBR/admintext.php
languages/PortugueseBR_UTF8/admintext.php
languages/German/admintext.php
languages/German_UTF8/admintext.php
Related Mods
Notes


Warning - TNG database is modified

This Mod changes the TNG database. Triggers are added to existing tables and new tables are created. Read the installation and removal instructions carefully before installing the Mod. As usual in cases like this, backing up your database before installing the Mod is highly recommended. If you don't feel comfortable with this, please don't use this Mod.


Purpose of this Mod

Names may change spelling over time. In my tree you will find the same name spelled as Heuser, Heiser, Heusser and Heusers. The same may happen to place names. For example, a place that was spelled Villa Izabella in the past, may be spelled Vila Isabela today.

Additionally, one cannot expect every TNG user searching for somebody in a tree to know the exact spelling of the name that was used by the author of that tree.

When the exact spelling of a name is unknown, similarity search may help.

TNG offers two alternatives for similarity search: Soundex and Metaphone. However, they are not always useful. Both are language dependent. Soundex reduces a word to just four characters and tends to output a large number of results. Metaphone tends to be very restrictive.

This Mod introduces a novel way of searching for persons or places. The user simply provides one or two words of the name of the person/place being searched and TNG will display a list of names that contain words similar to the query.


Using the Mod

The Mod creates a new menu entry at Find >> Word Search. By selecting this entry the user is taken to the search form that is depicted below.

Sim-search-screen.jpg

To perform a search, enter one or two words of the name you are searching for in the field. Results are displayed on the fly as the user types. The results are presented in decreasing order of similarity.

There are several options to control the search:

  • There is a radio button to select between a person search and a place search.
  • The Gedcom to be searched may be selected.
  • There is a radio button for specifying the similarity precision (Exact, Very precise, Precise or Inaccurate).

By clicking on a person/place name the user is taken to the corresponding page.

How it works

Each word in a person/place name is divided into a set of "grams" of size three. For example, "Smith" results in "Smi", "mit", "ith". All these strings are stored in a database table, together with a pointer to the person/place they belong to and the position of the gram in the name.
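The decomposition described above can be sketched in a few lines of Python. This is only an illustration, not the Mod's own code: the function name `trigrams` is hypothetical, and the handling of words shorter than three letters (they simply produce no grams here) is an assumption.

```python
def trigrams(word, size=3):
    """Split one word into overlapping grams, keeping each gram's position.

    Hypothetical helper for illustration; the Mod does this work in PHP/SQL.
    """
    # A word of length n yields n - size + 1 grams; shorter words yield none.
    return [(pos, word[pos:pos + size]) for pos in range(len(word) - size + 1)]


print(trigrams("Smith"))  # [(0, 'Smi'), (1, 'mit'), (2, 'ith')]
```

Each (position, gram) pair corresponds to one row that would be stored alongside the pointer to the person or place.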

When a query is submitted to the Mod, each word in the query is divided into grams in the same way. The grams in the query are compared to those in the database. The Mod will search for words that contain a similar set of grams at similar positions. Everything happens in SQL, so response time is very short. This approach has been tested in other applications (not TNG) and gives good results even with very large databases (tens of millions of words).
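To make the matching step concrete, here is a minimal sketch that stores grams in an in-memory SQLite table and ranks names with a single SQL query. It is not the Mod's actual schema or query: the table and column names, the lower-casing of words, the `max_shift` tolerance for "similar positions", and the score (number of query grams found) are all assumptions chosen for illustration.

```python
import sqlite3


def trigrams(word, size=3):
    # Lower-casing is an assumption so that "Heuser" and "heuser" match.
    word = word.lower()
    return [(pos, word[pos:pos + size]) for pos in range(len(word) - size + 1)]


def build_index(names):
    """Store every gram of every word of every name (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE grams (name_id INTEGER, name TEXT, gram TEXT, pos INTEGER)"
    )
    for name_id, name in enumerate(names):
        for word in name.split():
            conn.executemany(
                "INSERT INTO grams VALUES (?, ?, ?, ?)",
                [(name_id, name, gram, pos) for pos, gram in trigrams(word)],
            )
    return conn


def search(conn, query_word, max_shift=1):
    """Rank names by how many query grams they contain at a nearby position."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS q (gram TEXT, pos INTEGER)")
    conn.execute("DELETE FROM q")
    conn.executemany(
        "INSERT INTO q VALUES (?, ?)",
        [(gram, pos) for pos, gram in trigrams(query_word)],
    )
    return conn.execute(
        """
        SELECT g.name, COUNT(DISTINCT q.pos) AS shared
        FROM grams AS g
        JOIN q ON q.gram = g.gram AND ABS(q.pos - g.pos) <= ?
        GROUP BY g.name_id, g.name
        ORDER BY shared DESC, g.name
        """,
        (max_shift,),
    ).fetchall()


conn = build_index(["Heuser", "Heiser", "Heusser", "Heusers", "Smith"])
for name, shared in search(conn, "Heuser"):
    print(name, shared)
```

With the spellings mentioned earlier on this page, the exact match "Heuser" ranks first, the variants follow in decreasing order of shared grams, and "Smith" does not appear at all. A precision setting could plausibly be modelled as a minimum value for `shared`, but how the Mod maps its four precision levels is not documented here.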

Mod installation and database setup

The Mod is installed as usual with the Mod Manager:

  • If you are using an older version of the Mod, uninstall it first.
  • Remove the Mod files from the TNG mods folder.
  • Download the zip file containing the Mod files and unzip it in the TNG mods folder.
  • Install the Mod with the Manager.
  • Clear the browser cache (required, do not forget!)

Note: After the Mod is installed the database must be prepared for use with the Mod. If you are reinstalling the Mod, you do not need to prepare the database again.

Go to Setup in the TNG Administration page (Info >> Administration >> Setup).

At the Setup page, you will find a new tab called Similarity Mod.

Click on this Tab. A page like the one displayed below will be shown.

Sim-search-pre-prepare.jpg

This page contains sentences like "Table prepared for people: None". This means that the database has not been prepared for the Mod.

In this page you may select up to three events that will be handled by the Mod. These should be events that contain person names in the Detail field of the event. For example, I use the Alias event to store alternative names for a person and the Married name custom event to store a woman's married names.

Hit the Prepare button and the database will be modified for use by the Mod.

After some time the page below will be displayed. Notice that if the database is large this process may take several minutes. I have a database with about 12,000 individuals and 1,500 places. For this database the process takes between one and five minutes depending on the server being used.


Sim-search-prepare-finish.jpg

If you go back to the setup, the page below will be displayed. Sentences like "Table prepared for people: tng_people" will be displayed, indicating that the database is prepared for use by the Mod.


Sim-search-after-prepare.jpg

Upgrade from previous versions

If you are upgrading from version 10.0.0.2 to a more current version, the database remains the same.

However, if you are upgrading from a version earlier than 10.0.0.2, you will have to prepare your database again. Follow this procedure:

  • Go to Setup in the TNG Administration page (Info >> Administration >> Setup).
  • At the Setup page, you will find the tab called Similarity Mod.
  • Click on this Tab. Click on the "Delete" button and confirm that you want to delete all tables.
  • After some time, a page will be displayed confirming that the tables have been deleted.

You may perform these steps using either an old Mod version or a more recent (10.0.0.2 or higher) version.

After deleting the tables, prepare the database again as described in the previous section.

Mod removal

In order to remove the Mod, you must first delete the information stored by the mod in the database.

Go to the Setup page of the Mod in the TNG Administration page (Info >> Administration >> Setup). A page like the one above will be displayed. Hit the "Delete" button.

After that you may uninstall the Mod as usual by using the Mod Manager.

Be aware that if you don't follow the above steps, triggers will be left in your database which will cause errors such as "Table 'tng.qgrams_grams' doesn't exist" should you subsequently delete the tables by other means. These can be hard to track down.
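The failure mode can be reproduced in miniature. The sketch below uses SQLite rather than the MySQL database that TNG runs on, so the error text differs, and the `people` table and trigger name are hypothetical; only `qgrams_grams` is taken from the error message quoted above. It shows a trigger that outlives the table it writes to, and that dropping the trigger restores normal inserts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE qgrams_grams (person_id INTEGER, gram TEXT);
    -- Hypothetical trigger: index the first gram of every new name.
    CREATE TRIGGER people_after_insert AFTER INSERT ON people
    BEGIN
        INSERT INTO qgrams_grams VALUES (NEW.id, substr(NEW.name, 1, 3));
    END;
    """
)


def try_insert(conn, name):
    """Insert a person and return 'ok' or the database error message."""
    try:
        conn.execute("INSERT INTO people (name) VALUES (?)", (name,))
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)


before = try_insert(conn, "Heuser")       # trigger and table both present
conn.execute("DROP TABLE qgrams_grams")   # table removed "by other means"
orphaned = try_insert(conn, "Heiser")     # trigger still fires and fails
conn.execute("DROP TRIGGER people_after_insert")
after = try_insert(conn, "Heiser")        # works again once the trigger is gone

print(before, "|", orphaned, "|", after)
```

The middle insert fails even though the statement itself only touches `people`, which is why such errors are hard to trace back to a leftover trigger.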

Translations

The Mod already contains translations into French, German and Brazilian Portuguese.

Developer

The Mod developer is Rovian Veronez, with changes by Carlos Heuser. French translation provided by André Morel.

Files changed

  • genlib.php
  • js/selectutils.js
  • finditems.php
  • admin_setup.php
  • languages/English/cust_text.php
  • languages/English-UTF8/cust_text.php
  • languages/PortugueseBR/cust_text.php
  • languages/PortugueseBR_UTF8/cust_text.php
  • languages/German/cust_text.php
  • languages/German_UTF8/cust_text.php
  • languages/English/admintext.php
  • languages/English-UTF8/admintext.php
  • languages/PortugueseBR/admintext.php
  • languages/PortugueseBR_UTF8/admintext.php
  • languages/German/admintext.php
  • languages/German_UTF8/admintext.php

Files added

  • qgrams_admin_delete_database.php
  • qgrams_admin_prepare_database.php
  • qgrams_delete_tabledefs.php
  • qgrams_findplaces.php
  • qgrams_findssform.php
  • qgrams_tabledefs.php


Revision History

Version Date Contents
v11.0.2.0 28 Sep 2016
  • Updated for TNG v11.0.2
v11.0.0.0 17 Mar 2016
  • Updated for TNG v11.0
v10.1.2.1 13 Mar 2016
  • French translation included (thanks to André Morel)
v10.1.2.0 22 Jul 2015
  • Tested for TNG v10.1.2
  • Minor error correction in css file: Mod was disabling the menu icon for sources
v10.1.0.0 26 Jan 2015
  • Updated for TNG V10.1
v10.0.0.2 18 May 2014
  • Previous versions allowed querying for max two words. In this version as many words as required may be entered as a query.
  • The name of the tree is displayed in the result.
v10.0.0.1 11 Mar 2014
  • Database preparation works faster and uses less server memory.
  • For large databases, preparation was giving an error "table2 full". Fixed.
  • Installations using TNG table names different from the default were not supported. Fixed.
  • Query results could contain spurious lines. Fixed.
v10.0.0.0 13 Jan 2014 Changed for compatibility with TNGv10
v9.2.2.2 13 Jan 2014 Fixed a problem in Javascript that was causing conflicts with other mods. This will probably be the last update for TNGv9
v9.2.2.1 15 Nov 2013 Fixed the way event type names are displayed in the setup page. Changed the Search link to point to the Mod search form. Inserted a link in the Mod search form pointing to the regular TNG search form.
v9.2.2.0 29 Oct 2013 Mod release

Sites using this mod

If you download and install this mod, please add your TNG site to the table below.

URL User Note Mod-Version/TNG-Version User-language
http://heuser.pro.br/ Carlos Heuser Developers site 11.0.2.0 for TNG v12.0 English, German, Brazilian Portuguese