After nearly a decade working in the Search sector, I would expect this to be an easy question to answer. To a degree, it is not. Why is it such a vexing concern? Put as simply as I can: it’s all in the context of the search, not so much the content.
You can (and should) build the most beautifully indexed body of content imaginable, but without relevance applied to a search, its use is limited at best. Identifying the context of the search is critical to determining the relevancy of the results.
A good example from a book I recently read illustrates the importance of relevancy signals to search. Take the case of two doctors searching the term ‘myocardial infarction’.
Doctor #1 is in his office, looking for recent research into the causes of sudden death from heart attacks. He’s most interested in the latest articles regarding predictive signals or other metrics that might more accurately predict the likelihood of an event in his patients.
Doctor #2 is in the ER, faced with a perplexing acute MI, and is in need of a procedure that will stabilize the patient as rapidly as possible. Perhaps there is a new class of drugs that can help, and he needs to know how to administer them quickly, or even whether his facility carries the desired drug.
Both of these physicians may be searching on the same term, but their needs are strikingly different. Giving each of these doctors the right (relevant) results quickly can only occur with the application of signals. Signals might include the location of the search: is it coming from a network in the ER, or the subnet in the offices? Are there other signals, such as which doctor is making the search? Some doctors are assigned to ERs; others may never rotate through. These are all signals that can impact the relevancy of a search.
So, how does the search engine know the context? It needs to be fed the signals so it can calculate the relevancy of each result to the search being conducted. These signals need to be identified and provided by an intervening layer that understands the context of the search. That context might be the geolocation or known assignment, or even the past search history of a specific user that includes information on which links they more often explore. All of these are signals that can be expressed to a search engine (or appliance) to improve relevancy.
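One way to picture that intervening layer is a small function that turns what is known about a request into named signals. The sketch below is only an illustration: the subnets, the ER roster, and the names SearchContext and derive_signals are all assumptions made up for this example, not part of any particular product.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical network layout: which subnet corresponds to which location.
SUBNET_LOCATIONS = {
    ipaddress.ip_network("10.20.0.0/16"): "er",
    ipaddress.ip_network("10.30.0.0/16"): "office",
}

# Hypothetical roster of users known to be assigned to the ER.
ER_ASSIGNED = {"dr_jones"}


@dataclass(frozen=True)
class SearchContext:
    location: str      # "er", "office", or "unknown"
    er_assigned: bool  # whether the user is on the ER roster


def derive_signals(client_ip: str, user_id: str) -> SearchContext:
    """Turn raw request facts into signals the search layer can act on."""
    address = ipaddress.ip_address(client_ip)
    location = "unknown"
    for network, name in SUBNET_LOCATIONS.items():
        if address in network:
            location = name
            break
    return SearchContext(location=location, er_assigned=user_id in ER_ASSIGNED)
```

A request from an address inside the first subnet by a rostered user would yield location "er" and er_assigned True. The search layer can then translate that context into boosts.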
How might these signals be expressed? Some examples, if I might diverge down a rabbit hole related to my own decade or so of experience with Solr, would be applying boost formulas to specific fields or term sets in the body of content.
Perhaps for the ER doctor, documents matching medical_procedure:myocardial medical_procedure:infarction would have their score boosted by a major factor, pushing how-to or other procedural information to the top of the list.
In contrast, the other doctor’s search might boost on publication date (more recent, more boost) and on other fields such as flags indicating the content is research (e.g. research:true article:true). Identifying and improving relevancy will always require the application of signals of one type or another.
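As a rough sketch of how the two doctors’ searches could differ, the function below builds request parameters for Solr’s eDisMax query parser, using its bq (boost query) and bf (boost function) parameters. The boost weights, the publication_date field name, and the function name build_solr_params are assumptions for illustration; the field names medical_procedure, research, and article come from the examples above. The recency formula follows the commonly cited recip(ms(NOW,field),3.16e-11,1,1) pattern, where 3.16e-11 is roughly one over the number of milliseconds in a year.

```python
from urllib.parse import urlencode


def build_solr_params(query: str, context: str) -> dict:
    """Build eDisMax request parameters for a given search context."""
    params = {"q": query, "defType": "edismax"}
    if context == "er":
        # Boost query: procedure documents for these terms rise to the top.
        # The weight of 10 is an arbitrary illustrative value.
        params["bq"] = "medical_procedure:(myocardial infarction)^10"
    elif context == "office":
        # Favor content flagged as research articles (illustrative weights).
        params["bq"] = "research:true^5 article:true^5"
        # Boost function: newer documents score higher.
        # publication_date is an assumed date field name.
        params["bf"] = "recip(ms(NOW,publication_date),3.16e-11,1,1)"
    return params


def to_query_string(params: dict) -> str:
    """URL-encode the parameters for use in a search request."""
    return urlencode(params)
```

The same query text produces two different requests depending on the context, which is the point: the content and the search term stay the same, and only the signals change.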
Other examples of signals that can be applied might originate from the content curator. Any good search application needs to have SMEs (subject matter experts) and content curators continually updating the data and reviewing the results. Something a curator might do is apply a boost factor for products that are currently on sale, or that have a ‘most popular’ rank that has recently changed. Maybe the product is over-stocked, or perhaps it was just discontinued. The curator is a critical part of the relevancy solution. It can’t be left solely on the shoulders of the search engineering team to ‘guess’ what is relevant today, and what is not.
There needs to be a concentric circle of needs, so to speak, when considering the factoring and refactoring of the search (relevancy) solution. The layers (outer to inner) of this concentric feedback loop might look like this:
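To make the curator’s role concrete, here is a minimal sketch of curator-maintained boost factors applied on top of a base relevance score. Everything in it is hypothetical: the Product fields, the factor values in CURATOR_BOOSTS, and the idea of re-ranking in application code (in practice the same effect might be achieved with boosts inside the search engine itself).

```python
from dataclasses import dataclass


@dataclass
class Product:
    sku: str
    base_score: float          # relevance score from the search engine
    on_sale: bool = False
    overstocked: bool = False
    discontinued: bool = False


# Curator-maintained multipliers (illustrative values only).
CURATOR_BOOSTS = {"on_sale": 1.5, "overstocked": 1.2, "discontinued": 0.1}


def curated_score(product: Product) -> float:
    """Multiply the base score by every curator boost whose flag is set."""
    score = product.base_score
    for flag, factor in CURATOR_BOOSTS.items():
        if getattr(product, flag):
            score *= factor
    return score


def rerank(products: list) -> list:
    """Order products by curated score, highest first."""
    return sorted(products, key=curated_score, reverse=True)
```

Keeping the factors in a single table means a curator can change what is promoted or demoted today without anyone touching the ranking logic.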
Poor feedback / upset users / lost sales
Business and domain awareness
Content curation
Paired relevance tuning
Test-driven relevance
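The innermost layer, test-driven relevance, can be sketched as a set of judged queries that are re-run whenever the tuning changes. The example below uses precision at k as the measure; the judgment format, the threshold, and the names precision_at_k and failing_cases are assumptions for illustration, and the search argument stands in for whatever function actually calls the search engine.

```python
from typing import Callable, Dict, List, Set, Tuple

# Maps (query, context) to the set of document ids judged relevant.
Judgments = Dict[Tuple[str, str], Set[str]]


def precision_at_k(results: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k result slots filled by relevant documents."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in results[:k] if doc_id in relevant)
    return hits / k


def failing_cases(
    search: Callable[[str, str], List[str]],
    judgments: Judgments,
    k: int = 3,
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Return every (query, context) pair whose precision@k is below threshold."""
    failures = []
    for (query, context), relevant in judgments.items():
        score = precision_at_k(search(query, context), relevant, k)
        if score < threshold:
            failures.append((query, context))
    return failures
```

Run as part of a test suite, a non-empty list of failures flags a relevance regression before users ever see it, and the judgments themselves are something SMEs and curators can maintain alongside the engineering team.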
The Takeaway
Most important to a good search solution, and thus to a good customer experience, is a TEAM of people watching the changing relevancy of content, tuning to meet the most urgent business and customer needs, and maintaining a method of signaling the context of a search so that the most relevant results are returned, every time.