Web Clustering Engine
Do you know what a web clustering engine is? A web clustering engine greatly reduces the effort a user spends browsing a large set of search results by reorganizing them into smaller clusters. Current web clustering engines produce extra clusters and miss a few relevant ones, which reduces confidence in the clustering output. They can also give inconsistent results, because the content of a cluster does not always correspond to its label.
Search engines are an indispensable tool for retrieving information from the web. While the ranked results from search engines are good for certain searches, they are less effective for other queries, particularly when the queries are short, ambiguous, or polysemous. The volume of ranked results retrieved by search engines is overwhelming and mixes the different subtopics and meanings of the given query.
Web clustering engines are an emerging trend in the field of information retrieval. They organize search results by topic, providing a complementary view to the flat ranked list returned by standard search engines. The results returned by traditional search engines for different subtopics or meanings of a query are mixed together in the list, so the user may need to sift through a large number of irrelevant items to find those of interest.
Web clustering engines categorize the search results into ranked groups (clusters) and display the cluster labels. As a result, the user can find the required document quickly.
Now let us see some of the advantages of web clustering.
- It provides shortcuts to items that share a similar meaning. Since web clustering engines place search results with a similar meaning in the same cluster, it is easy for the user to find similar documents, which reduces search time.
- It permits better topic understanding, since web clustering engines provide a high-level view of the query. This is especially helpful for informational searches in unknown or dynamic domains.
- It favors systematic exploration of search results. A clustering engine summarizes the content of many search results in a single view on the first result page, so the user can review many potentially relevant results without having to download and scroll through subsequent pages.
A web clustering engine performs post-retrieval clustering: it clusters the search results retrieved for a broad topic. In fact, the bulk of web queries are informational, seeking recent information on a given broad topic. The output of a web clustering engine supports quick subtopic retrieval, fast exploration of unknown topics, and easy identification of relevant search results within the broad topic.
A web clustering engine has four main components. Let us look at each of them:
The system provides an interface that accepts the search keyword from the user. In SRCluster, the input is a single polysemous word (a 1-gram).
Search result acquisition:
This component preprocesses the search results. Usually the search results are converted into a sequence of features through steps such as language recognition, tokenization, stop-word removal, and stemming, although preprocessing is not restricted to these steps.
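The preprocessing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list is a tiny made-up sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer, and language recognition is skipped by assuming English input. The function names are hypothetical and are not taken from any particular engine.

```python
import re

# Tiny illustrative stop-word list (an assumption; real systems use
# much larger, language-specific lists).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for",
              "and", "or", "is", "are", "to", "with"}


def naive_stem(token):
    """Crude suffix stripping, standing in for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        # Only strip if at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(snippet):
    """Turn one search-result snippet into a list of features."""
    tokens = re.findall(r"[a-z0-9]+", snippet.lower())    # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop-word removal
    return [naive_stem(t) for t in tokens]                # stemming


if __name__ == "__main__":
    print(preprocess("Jaguar cars are known for racing engines"))
```

The resulting feature lists are what the clustering step would consume; a production system would likely add language detection and a proper stemmer at the marked steps.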
The clustering engine component converts the preprocessed search results into a format suitable for the clustering algorithm: it extracts the features and supplies them as input to the algorithm. The clustering algorithm then builds the clusters and identifies the label that best describes each one.
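As a rough sketch of cluster construction and labeling, the Python below groups preprocessed results with a greedy single-pass scheme based on Jaccard similarity, then labels each cluster with its most frequent non-query term. This is a deliberately simple stand-in and not the algorithm of any specific engine; the function names, the `threshold` value, and the sample feature lists are all assumptions made for illustration.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets of features."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_results(docs, query_terms, threshold=0.2):
    """Greedy single-pass clustering of feature lists.

    Each document joins the first cluster whose accumulated features are
    similar enough; otherwise it starts a new cluster. Query terms are
    ignored because every result contains them.
    """
    clusters = []  # each cluster: {"features": set, "members": [indices]}
    for i, features in enumerate(docs):
        fs = set(features) - query_terms
        for c in clusters:
            if jaccard(fs, c["features"]) >= threshold:
                c["members"].append(i)
                c["features"] |= fs
                break
        else:
            clusters.append({"features": set(fs), "members": [i]})
    return clusters


def label_cluster(cluster, docs, query_terms):
    """Label a cluster with the term found in the most member documents."""
    counts = Counter()
    for i in cluster["members"]:
        counts.update(set(docs[i]) - query_terms)
    if not counts:
        return "other"
    # Highest document frequency wins; ties are broken alphabetically.
    return min(counts, key=lambda t: (-counts[t], t))


if __name__ == "__main__":
    # Hypothetical preprocessed results for the ambiguous query "jaguar".
    docs = [
        ["jaguar", "car", "engine"],
        ["jaguar", "cat", "jungle"],
        ["jaguar", "car", "dealer"],
        ["jaguar", "cat", "habitat"],
    ]
    query = {"jaguar"}
    for c in cluster_results(docs, query):
        print(label_cluster(c, docs, query), c["members"])
```

On this sample, the two meanings of the query end up in separate clusters. A single-pass scheme is sensitive to the order of the results and to the threshold, which is one way extra or missing clusters can arise.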
Clustered search result:
This component presents the results to the user. They are displayed in a rich visual interface with the clusters listed; when a cluster is expanded, the corresponding search results are shown.
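The expand-to-view behaviour can be mimicked in plain text. The sketch below is only a stand-in for a rich visual interface: the `render` function, its `[+]`/`[-]` markers, and the sample titles are hypothetical choices made for this illustration.

```python
def render(clusters, expanded=frozenset()):
    """Render clusters as a text outline.

    clusters: list of (label, [result titles]) pairs.
    expanded: labels whose results should be shown.
    """
    lines = []
    for label, titles in clusters:
        marker = "-" if label in expanded else "+"
        lines.append(f"[{marker}] {label} ({len(titles)})")
        if label in expanded:
            lines.extend(f"    {title}" for title in titles)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical clustered results for illustration only.
    clusters = [
        ("car", ["Jaguar engines", "Jaguar dealers"]),
        ("cat", ["Jaguars in the jungle"]),
    ]
    print(render(clusters, expanded={"cat"}))
```

Collapsed clusters show only the label and the number of results, which gives the one-screen overview; expanding a label reveals the results inside that cluster.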