java.lang.Object
  au.org.ala.checklist.lucene.CBIndexSearch
public class CBIndexSearch
The API used to perform a search on the CB Lucene Index. It uses the following algorithm when trying to find a match:
1. Search for a direct match for the supplied name on the name field (with the optional rank provided).
2. Search for a match on the alternative name field (with optional rank).
3. Generate a searchable canonical name for the supplied name. Search for a match on the searchable canonical field using the generated name.
4. Clean up the supplied name using the ECAT name parser. Repeat steps 1 to 3 on the clean name until a match is found.
5. No match is found.

When a match is found, the result is checked for homonyms. Where a homonym exists and the kingdom of the result does not match the supplied kingdom, a HomonymException is thrown.
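The matching steps in the class description can be pictured with a small, self-contained sketch. It is not the CBIndexSearch implementation: the three maps are hypothetical in-memory stand-ins for the Lucene fields, `toSearchableCanonical` and `clean` are invented placeholders for the real canonical-name generation and the ECAT name parser, the optional rank is left out, and the cleaned name is retried only once.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class MatchCascadeSketch {
    // Hypothetical stand-ins for three index fields, each mapping a value to an LSID.
    final Map<String, String> nameField = new HashMap<>();
    final Map<String, String> alternativeNameField = new HashMap<>();
    final Map<String, String> canonicalField = new HashMap<>();

    // Assumed canonical form for this sketch only: lower case, letters a-z only.
    static String toSearchableCanonical(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
    }

    // Placeholder for the ECAT name parser: keeps the first two tokens,
    // which drops a trailing author or year string.
    static String clean(String name) {
        String[] tokens = name.trim().split("\\s+");
        return tokens.length <= 2 ? name.trim() : tokens[0] + " " + tokens[1];
    }

    // Steps 1 to 3: name field, alternative name field, searchable canonical field.
    Optional<String> matchOnce(String name) {
        String lsid = nameField.get(name);
        if (lsid == null) {
            lsid = alternativeNameField.get(name);
        }
        if (lsid == null) {
            lsid = canonicalField.get(toSearchableCanonical(name));
        }
        return Optional.ofNullable(lsid);
    }

    // Steps 1 to 5: try the supplied name, then the cleaned name, else report no match.
    Optional<String> match(String name) {
        Optional<String> hit = matchOnce(name);
        if (hit.isPresent()) {
            return hit;
        }
        String cleaned = clean(name);
        return cleaned.equals(name) ? Optional.empty() : matchOnce(cleaned);
    }

    public static void main(String[] args) {
        MatchCascadeSketch index = new MatchCascadeSketch();
        index.nameField.put("Macropus rufus", "lsid:example:1");
        // No direct match with the author string attached; the cleaned name matches in step 1.
        System.out.println(index.match("Macropus rufus (Desmarest, 1822)"));
    }
}
```

Returning an empty `Optional` for step 5 is a choice made for the sketch; the real methods return `null` or throw, as listed in the method details.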
| Field Summary | |
|---|---|
| static java.util.regex.Pattern | affPattern |
| static java.util.regex.Pattern | cfPattern |
| protected org.apache.commons.logging.Log | log |
| protected TaxonNameSoundEx | tnse |
| static java.util.regex.Pattern | virusStopPattern |
| static java.util.regex.Pattern | voucherRemovePattern |
| Constructor Summary | |
|---|---|
| CBIndexSearch() | |
| CBIndexSearch(java.lang.String indexDirectory) | Creates a new name searcher. |
| Method Summary | | |
|---|---|---|
| void | dumpSpecies() | Dumps a list of the species LSIDs that are contained in the index. |
| org.apache.lucene.search.TopDocs | getIRMNGGenus(LinnaeanRankClassification cl, RankType rank) | Multiple genera indicate that an unresolved homonym exists for the supplied search details. |
| java.lang.String | getPrimaryLsid(java.lang.String lsid) | Returns the primary LSID for the supplied lsid. |
| void | reopenReaders() | |
| RankType | resolveIRMNGHomonym(LinnaeanRankClassification cl, RankType rank) | Attempt to resolve the homonym using the IRMNG index. |
| java.lang.String | searchForAcceptedLsidDefaultHandling(LinnaeanRankClassification cl, boolean fuzzy) | Returns the accepted LSID for the supplied classification. |
| java.lang.String | searchForAcceptedLsidDefaultHandling(LinnaeanRankClassification cl, boolean fuzzy, boolean ignoreHomonyms) | |
| NameSearchResult | searchForAcceptedRecordDefaultHandling(LinnaeanRankClassification cl, boolean fuzzy) | Returns the accepted result for the supplied classification. |
| NameSearchResult | searchForAcceptedRecordDefaultHandling(LinnaeanRankClassification cl, boolean fuzzy, boolean ignoreHomonym) | |
| NameSearchResult | searchForCommonName(java.lang.String name) | Performs a search on the supplied common name returning a NameSearchResult. |
| java.lang.String | searchForLSID(LinnaeanRankClassification cl, boolean recursiveMatching) | Search for an LSID with the supplied classification without a fuzzy match. |
| java.lang.String | searchForLSID(java.lang.String name) | Searches for the name without using fuzzy name matching... |
| java.lang.String | searchForLSID(java.lang.String name, boolean fuzzy) | Searches the index for the supplied name with or without fuzzy name matching. |
| java.lang.String | searchForLSID(java.lang.String name, boolean fuzzy, boolean ignoreHomonyms) | |
| java.lang.String | searchForLSID(java.lang.String name, LinnaeanRankClassification cl, RankType rank) | Search for an LSID based on supplied name, classification and rank without a fuzzy match... |
| java.lang.String | searchForLSID(java.lang.String name, LinnaeanRankClassification cl, RankType rank, boolean fuzzy, boolean ignoreHomonym) | Search for an LSID based on the supplied name, classification and rank with or without fuzzy name matching. |
| java.lang.String | searchForLSID(java.lang.String name, RankType rank) | Searches for an LSID of the supplied name and rank without a fuzzy match... |
| java.lang.String | searchForLSID(java.lang.String name, RankType rank, boolean fuzzy) | Searches the index for the supplied name of the specified rank with or without fuzzy name matching. |
| java.lang.String | searchForLSID(java.lang.String name, RankType rank, boolean fuzzy, boolean ignoreHomonyms) | Searches for the supplied name of the specified rank with or without fuzzy name matching. |
| java.lang.String | searchForLSID(java.lang.String name, java.lang.String kingdom, java.lang.String scientificName, RankType rank) | Deprecated. Use searchForLSID(java.lang.String, au.org.ala.data.model.LinnaeanRankClassification, au.org.ala.data.util.RankType, boolean) instead. It is more extensible to supply a classification object than a list of higher classification values. |
| java.lang.String | searchForLsidById(java.lang.String id) | Gets the LSID for the record that has the supplied checklist bank id. |
| java.lang.String | searchForLSIDCommonName(java.lang.String commonName) | Performs a search on the common name index for the supplied name. |
| NameSearchResult | searchForRecord(LinnaeanRankClassification cl, boolean recursiveMatching) | |
| NameSearchResult | searchForRecord(LinnaeanRankClassification cl, boolean recursiveMatching, boolean fuzzy) | |
| NameSearchResult | searchForRecord(LinnaeanRankClassification cl, boolean recursiveMatching, boolean addGuids, boolean fuzzy) | Search for an LSID with the supplied classification without a fuzzy match. |
| NameSearchResult | searchForRecord(java.lang.String name, LinnaeanRankClassification cl, RankType rank) | Searches for a record based on the supplied name, classification and rank without fuzzy name matching. |
| NameSearchResult | searchForRecord(java.lang.String name, LinnaeanRankClassification cl, RankType rank, boolean fuzzy) | |
| NameSearchResult | searchForRecord(java.lang.String name, LinnaeanRankClassification cl, RankType rank, boolean fuzzy, boolean ignoreHomonyms) | Searches for a record based on the supplied name, rank and classification with or without fuzzy name matching. |
| NameSearchResult | searchForRecord(java.lang.String name, RankType rank) | Searches the index for the supplied name and rank without a fuzzy match. |
| NameSearchResult | searchForRecord(java.lang.String name, RankType rank, boolean fuzzy) | Searches the index for the supplied name of the specified rank. |
| NameSearchResult | searchForRecord(java.lang.String name, java.lang.String kingdom, java.lang.String genus, RankType rank) | Deprecated. Use searchForRecord(java.lang.String, au.org.ala.data.model.LinnaeanRankClassification, au.org.ala.data.util.RankType, boolean) instead. It is more extensible to supply a classification object than a list of higher classification values. |
| NameSearchResult | searchForRecordByID(java.lang.String id) | Returns the record that has the supplied checklist bank id. |
| NameSearchResult | searchForRecordByLsid(java.lang.String lsid) | |
| MetricsResultDTO | searchForRecordMetrics(LinnaeanRankClassification cl, boolean recursiveMatching) | |
| MetricsResultDTO | searchForRecordMetrics(LinnaeanRankClassification cl, boolean recursiveMatching, boolean fuzzy) | |
| MetricsResultDTO | searchForRecordMetrics(LinnaeanRankClassification cl, boolean recursiveMatching, boolean addGuids, boolean fuzzy) | Search for a specific name returning extra metrics that can be reported as name match quality... |
| MetricsResultDTO | searchForRecordMetrics(LinnaeanRankClassification cl, boolean recursiveMatching, boolean addGuids, boolean fuzzy, boolean ignoreHomonym) | |
| java.util.List<NameSearchResult> | searchForRecords(java.lang.String name, RankType rank, boolean fuzzy) | Searches for records with the specified name and rank with or without fuzzy name matching. |
| java.util.List<NameSearchResult> | searchForRecords(java.lang.String name, RankType rank, LinnaeanRankClassification cl, int max) | Searches for a list of results for the supplied name, classification and rank without a fuzzy match. |
| java.util.List<NameSearchResult> | searchForRecords(java.lang.String name, RankType rank, LinnaeanRankClassification cl, int max, boolean fuzzy) | Searches for the records that satisfy the given conditions using the algorithm outlined in the class description. |
| java.util.List<NameSearchResult> | searchForRecords(java.lang.String name, RankType rank, LinnaeanRankClassification cl, int max, boolean fuzzy, boolean ignoreHomonyms) | |
| void | updateClassificationWithGUID(LinnaeanRankClassification cl) | Updates the supplied classification so that the supplied IDs are substituted with GUIDs. |
| NameSearchResult | validateHomonymByAuthor(java.util.List<NameSearchResult> result, java.lang.String name, LinnaeanRankClassification cl) | |
| NameSearchResult | validateHomonyms(java.util.List<NameSearchResult> results, java.lang.String name, LinnaeanRankClassification cl) | Takes a result set that contains a homonym and then either throws a HomonymException or returns the first result that matches the supplied taxa. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected org.apache.commons.logging.Log log
protected TaxonNameSoundEx tnse
public static final java.util.regex.Pattern virusStopPattern
public static final java.util.regex.Pattern voucherRemovePattern
public static final java.util.regex.Pattern affPattern
public static final java.util.regex.Pattern cfPattern
| Constructor Detail |
|---|
public CBIndexSearch()
public CBIndexSearch(java.lang.String indexDirectory)
throws org.apache.lucene.index.CorruptIndexException,
java.io.IOException
indexDirectory - The directory that contains the CB and IRMNG index.
org.apache.lucene.index.CorruptIndexException
java.io.IOException
| Method Detail |
|---|
public void reopenReaders()
public void dumpSpecies()
public java.lang.String searchForLSID(java.lang.String name,
boolean fuzzy)
throws SearchResultException
name -
fuzzy - look for a fuzzy match
HomonymException - when an unresolved homonym is detected
SearchResultException
public java.lang.String searchForLSID(java.lang.String name,
boolean fuzzy,
boolean ignoreHomonyms)
throws SearchResultException
SearchResultException
public java.lang.String searchForLSID(java.lang.String name)
throws SearchResultException
name - scientific name for a taxon
SearchResultException
See Also: searchForLSID(java.lang.String, boolean)
public java.lang.String searchForLSID(java.lang.String name,
RankType rank,
boolean fuzzy)
throws SearchResultException
name -
rank -
fuzzy - look for a fuzzy match
HomonymException - when an unresolved homonym is detected
SearchResultException
public java.lang.String searchForLSID(java.lang.String name,
RankType rank,
boolean fuzzy,
boolean ignoreHomonyms)
throws SearchResultException
name -
rank -
fuzzy -
ignoreHomonyms -
SearchResultException
public java.lang.String searchForLSID(java.lang.String name,
RankType rank)
throws SearchResultException
name -
rank -
SearchResultException
See Also: searchForLSID(java.lang.String, au.org.ala.data.util.RankType, boolean)
@Deprecated
public java.lang.String searchForLSID(java.lang.String name,
java.lang.String kingdom,
java.lang.String scientificName,
RankType rank)
throws SearchResultException
Deprecated. Use searchForLSID(java.lang.String, au.org.ala.data.model.LinnaeanRankClassification, au.org.ala.data.util.RankType, boolean) instead.
It is more extensible to supply a classification object than a list of higher classification values.
name -
kingdom -
genus -
rank -
SearchResultException
public java.lang.String searchForLSID(java.lang.String name,
LinnaeanRankClassification cl,
RankType rank,
boolean fuzzy,
boolean ignoreHomonym)
throws SearchResultException
name -
cl - The high taxa that form the classification for the search item
rank -
fuzzy - look for a fuzzy match
HomonymException - When an unresolved homonym is detected
SearchResultException
public java.lang.String searchForLSID(LinnaeanRankClassification cl,
boolean recursiveMatching)
throws SearchResultException
cl - the classification to work with
SearchResultException
public void updateClassificationWithGUID(LinnaeanRankClassification cl)
cl -
public NameSearchResult searchForRecord(LinnaeanRankClassification cl,
boolean recursiveMatching)
throws SearchResultException
SearchResultException
public MetricsResultDTO searchForRecordMetrics(LinnaeanRankClassification cl,
boolean recursiveMatching)
throws SearchResultException
SearchResultException
public MetricsResultDTO searchForRecordMetrics(LinnaeanRankClassification cl,
boolean recursiveMatching,
boolean fuzzy)
throws SearchResultException
SearchResultException
public NameSearchResult searchForRecord(LinnaeanRankClassification cl,
boolean recursiveMatching,
boolean fuzzy)
throws SearchResultException
SearchResultException
public NameSearchResult searchForRecord(LinnaeanRankClassification cl,
boolean recursiveMatching,
boolean addGuids,
boolean fuzzy)
throws SearchResultException
cl - the classification to work with
recursiveMatching - whether to try matching to a higher taxon when leaf taxa matching fails
SearchResultException
public MetricsResultDTO searchForRecordMetrics(LinnaeanRankClassification cl,
boolean recursiveMatching,
boolean addGuids,
boolean fuzzy)
cl -
recursiveMatching -
addGuids -
fuzzy -
public MetricsResultDTO searchForRecordMetrics(LinnaeanRankClassification cl,
boolean recursiveMatching,
boolean addGuids,
boolean fuzzy,
boolean ignoreHomonym)
public java.lang.String searchForLSID(java.lang.String name,
LinnaeanRankClassification cl,
RankType rank)
throws SearchResultException
name -
cl -
rank -
SearchResultException
public NameSearchResult searchForRecord(java.lang.String name,
RankType rank,
boolean fuzzy)
throws SearchResultException
name -
rank -
fuzzy - look for a fuzzy match
SearchResultException
public NameSearchResult searchForRecord(java.lang.String name,
RankType rank)
throws SearchResultException
name -
rank -
SearchResultException
public java.lang.String searchForAcceptedLsidDefaultHandling(LinnaeanRankClassification cl,
boolean fuzzy)
cl -
fuzzy -
public java.lang.String searchForAcceptedLsidDefaultHandling(LinnaeanRankClassification cl,
boolean fuzzy,
boolean ignoreHomonyms)
public NameSearchResult searchForAcceptedRecordDefaultHandling(LinnaeanRankClassification cl,
boolean fuzzy)
cl -
fuzzy -
public NameSearchResult searchForAcceptedRecordDefaultHandling(LinnaeanRankClassification cl,
boolean fuzzy,
boolean ignoreHomonym)
@Deprecated
public NameSearchResult searchForRecord(java.lang.String name,
java.lang.String kingdom,
java.lang.String genus,
RankType rank)
throws SearchResultException
Deprecated. Use searchForRecord(java.lang.String, au.org.ala.data.model.LinnaeanRankClassification, au.org.ala.data.util.RankType, boolean) instead.
It is more extensible to supply a classification object than a list of higher classification values.
name -
kingdom -
genus -
rank -
SearchResultException
public NameSearchResult searchForRecord(java.lang.String name,
LinnaeanRankClassification cl,
RankType rank,
boolean fuzzy)
throws SearchResultException
SearchResultException
public NameSearchResult searchForRecord(java.lang.String name,
LinnaeanRankClassification cl,
RankType rank,
boolean fuzzy,
boolean ignoreHomonyms)
throws SearchResultException
name -
cl -
rank -
fuzzy -
SearchResultException
public NameSearchResult searchForRecord(java.lang.String name,
LinnaeanRankClassification cl,
RankType rank)
throws SearchResultException
name -
cl -
rank -
SearchResultException
public NameSearchResult searchForRecordByID(java.lang.String id)
id -
public java.lang.String searchForLsidById(java.lang.String id)
id -
public java.util.List<NameSearchResult> searchForRecords(java.lang.String name,
RankType rank,
boolean fuzzy)
throws SearchResultException
name -
rank -
fuzzy - search for a fuzzy match
SearchResultException
public java.util.List<NameSearchResult> searchForRecords(java.lang.String name,
RankType rank,
LinnaeanRankClassification cl,
int max)
throws SearchResultException
name -
rank -
cl -
max -
SearchResultException
public java.util.List<NameSearchResult> searchForRecords(java.lang.String name,
RankType rank,
LinnaeanRankClassification cl,
int max,
boolean fuzzy)
throws SearchResultException
name -
rank -
kingdom -
genus -
max - The maximum number of results to return
fuzzy - search for a fuzzy match
SearchResultException
public java.util.List<NameSearchResult> searchForRecords(java.lang.String name,
RankType rank,
LinnaeanRankClassification cl,
int max,
boolean fuzzy,
boolean ignoreHomonyms)
throws SearchResultException
SearchResultException
public NameSearchResult validateHomonymByAuthor(java.util.List<NameSearchResult> result,
java.lang.String name,
LinnaeanRankClassification cl)
throws HomonymException
HomonymException
public NameSearchResult validateHomonyms(java.util.List<NameSearchResult> results,
java.lang.String name,
LinnaeanRankClassification cl)
throws HomonymException
results -
k -
HomonymException
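As a rough illustration of the homonym handling described for validateHomonyms and in the class description, the sketch below returns the first candidate whose kingdom matches the supplied kingdom and otherwise throws. `Candidate` and `UnresolvedHomonymException` are hypothetical stand-ins for NameSearchResult and HomonymException, and comparing on kingdom alone is an assumption; the real method may consult other ranks of the classification.

```java
import java.util.Arrays;
import java.util.List;

class HomonymCheckSketch {
    // Hypothetical stand-in for a search result; not the library's NameSearchResult.
    static final class Candidate {
        final String lsid;
        final String kingdom;

        Candidate(String lsid, String kingdom) {
            this.lsid = lsid;
            this.kingdom = kingdom;
        }
    }

    // Hypothetical stand-in for HomonymException.
    static final class UnresolvedHomonymException extends Exception {
        UnresolvedHomonymException(String message) {
            super(message);
        }
    }

    // Returns the first candidate in the supplied kingdom (case-insensitive),
    // or throws when no candidate matches or no kingdom was supplied.
    static Candidate firstMatchingKingdom(List<Candidate> candidates, String suppliedKingdom)
            throws UnresolvedHomonymException {
        if (suppliedKingdom != null) {
            for (Candidate candidate : candidates) {
                if (suppliedKingdom.equalsIgnoreCase(candidate.kingdom)) {
                    return candidate;
                }
            }
        }
        throw new UnresolvedHomonymException(
                "Homonym could not be resolved for kingdom: " + suppliedKingdom);
    }

    public static void main(String[] args) throws UnresolvedHomonymException {
        // Illustrative values only: one name shared by an animal and a plant taxon.
        List<Candidate> hits = Arrays.asList(
                new Candidate("lsid:example:animal", "Animalia"),
                new Candidate("lsid:example:plant", "Plantae"));
        System.out.println(firstMatchingKingdom(hits, "Plantae").lsid);
    }
}
```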
public org.apache.lucene.search.TopDocs getIRMNGGenus(LinnaeanRankClassification cl,
RankType rank)
cl - The classification to test
rank - The rank level of the homonym being tested, either RankType.GENUS or RankType.SPECIES
public RankType resolveIRMNGHomonym(LinnaeanRankClassification cl,
RankType rank)
throws HomonymException
cl - The classification used to determine the rank at which the homonym is resolvable
HomonymException
public java.lang.String searchForLSIDCommonName(java.lang.String commonName)
commonName -
public NameSearchResult searchForCommonName(java.lang.String name)
name -
public java.lang.String getPrimaryLsid(java.lang.String lsid)
lsid -
public NameSearchResult searchForRecordByLsid(java.lang.String lsid)