ChemAnalyser – Relevance and Confidence

Evaluating search results

Do you know the feeling of not being sure whether your patent search will return too little of what you want or too much of what you don't? Even if you get a respectable hit list containing suitable chemical names, the synonyms considered, and the CAS numbers of the requested chemical substances, how can you be sure there isn’t a lot more chemical data, or more chemical structures, out there that you urgently need to solve your problem?

Technically speaking, how can you evaluate the confidence and relevance of your current search results?

Cognitive semantic search engines, like the ones behind ChemAnalyser, have the answer. Because they can “understand” concepts rather than just search for terms, they identify the possible meanings of a search term. The confidence and relevance of your search results are therefore closely tied to the underlying search tools.

The confidence value

In contrast to traditional search engines, which use mathematical algorithms to match keywords with the corresponding results, cognitive search engines analyze the search terms and identify their possible meanings. By recognizing the ontology (the sense of a word), the morphology (the different forms of a word), and the synonymy (the relation of words to concepts), they obtain more precise and more reliable search results. To make this confidence measurable, each recognized (annotated) search term, whether a chemical name, a CAS number, or a considered synonym, is associated with a confidence value ranging from zero to one.

How to use the confidence value

The magnitude of this value reflects the likelihood with which a search term was annotated with a particular meaning. For example, the term “grain” is a homonym with two different meanings and may therefore appear either in a nutrition-related or in a materials-science patent. To find the most suitable meaning, our context-sensitive recognition module assigns a particular meaning as a function of the overall context.

The confidence value can also be used to customize a search: use a high confidence value to extract highly certain facts, or a low confidence value to get a broader range of search results at a lower confidence level.
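The trade-off between a strict and a broad search can be illustrated with a minimal Python sketch. It is not ChemAnalyser's actual interface: the `Annotation` class, the `filter_by_confidence` function, and all confidence figures below are hypothetical and only serve to show how a threshold on a value between zero and one narrows or widens a hit list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one annotated search term."""
    term: str
    meaning: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def filter_by_confidence(annotations, min_confidence):
    """Keep only annotations whose confidence reaches the threshold."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    return [a for a in annotations if a.confidence >= min_confidence]


# Invented example data; the confidence figures are purely illustrative.
annotations = [
    Annotation("grain", "cereal crop (nutrition)", 0.92),
    Annotation("grain", "crystallite (materials science)", 0.35),
    Annotation("64-17-5", "ethanol (CAS number)", 0.99),
]

strict = filter_by_confidence(annotations, 0.9)  # highly certain facts only
broad = filter_by_confidence(annotations, 0.3)   # broader, less certain list

for hit in strict:
    print(f"{hit.term}: {hit.meaning} ({hit.confidence:.2f})")
```

With the high threshold only the two most certain annotations remain, while the low threshold also returns the less likely reading of “grain”.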

The relevance value

Each search term, whether a chemical name or a unique CAS number, is assigned a relevance value that reflects how relevant a search result is. The relevance value depends on the whole document searched and mimics human judgment of the respective semantic meaning. For example, appearance in the title or in the experimental section, multiple occurrences, and the appearance of parent or semantic child terms all increase the relevance value of a search term. Because a search term may appear under different synonyms and at different positions within a document, it is annotated with a unique meaning ID number. Both the relevance value and the ID number are displayed in the search mask. The relevance value thus makes it easier to find the most relevant data and information.

One step ahead of the competition

In research, time is the most valuable resource. A sudden idea or the right information at the right time can decide between victory and defeat. Together, the confidence value and the relevance value make sure you get more relevant search results at once. Find the right chemical formula or a suitable chemical synonym within minutes instead of hours and gain the crucial advantage over your competitors. Make smarter decisions in less time and let your business shine: nothing could be simpler with ChemAnalyser.