New Media Filters Include the Human Touch, Not Just Web Crawlers - Crawlers and Algorithms
(Page 2 of 3)
The Internet has a lot to offer, and sometimes when using search engines, we don’t know exactly what we’re looking for. We may just want to read news on subjects that generally interest us, or browse topics of interest for new information. The problem, however, is that general-purpose search engines can’t filter results in a way that narrows information down to our needs and wants. That’s why topical web crawlers have become so popular; they help address the limitations of popular search engines such as Google by distributing the “crawling process” across queries and users.
Essentially, web crawlers are programs that use the graph-like structure of the Web to move from page to page. When they were first being developed, these programs were often referred to as wanderers, robots, and worms. All of these words--and the newer term “web crawlers”--make it seem as if these programs are slow and meandering, but that’s definitely not the case. Crawlers are remarkably fast; some have been known to traverse tens of thousands of pages in a matter of minutes.
In the beginning, web crawlers were intended as a way to retrieve web pages and add them to a repository. In turn, the repository would serve a particular purpose, such as powering a search engine. In their most basic form, these crawlers begin with a seed page and then follow the links within that page to visit other pages.
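The seed-and-follow process described above can be sketched in a few lines. This is a minimal illustration only: the page names are made up, and an in-memory dictionary stands in for the Web (a real crawler would fetch pages over HTTP and parse their links).

```python
from collections import deque

# A toy stand-in for the Web: each "page" maps to the links it contains.
# (A real crawler would fetch pages over HTTP and extract <a href> links.)
TOY_WEB = {
    "seed.html": ["a.html", "b.html"],
    "a.html":    ["b.html", "c.html"],
    "b.html":    ["seed.html"],
    "c.html":    [],
}

def crawl(seed, web, max_pages=100):
    """Breadth-first crawl: start at the seed page, follow its links,
    and keep a 'seen' set so no page is visited twice."""
    frontier = deque([seed])   # pages waiting to be visited
    visited = []               # crawl order (the repository, in effect)
    seen = {seed}
    while frontier and len(visited) < max_pages:
        page = frontier.popleft()
        visited.append(page)
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

print(crawl("seed.html", TOY_WEB))
# → ['seed.html', 'a.html', 'b.html', 'c.html']
```

The `seen` set is what keeps the crawler from looping forever: note that `b.html` links back to the seed, but the seed is never fetched a second time.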
Before this new trend of relying on other people to help filter the web began, the sheer size and constantly changing nature of the web made the need for continuous support and updating of web-based information very apparent. Crawlers facilitated this process by following the hyperlinks in web pages in order to automatically download a partial snapshot of the Web. This snapshot would give a user an idea of the information available on a page without the trouble of visiting it and looking over all of the information themselves. Not only would that waste time, but it would force a user to look over information they weren’t interested in or that wasn’t useful to them.
Some information filtering systems rely on crawlers that constantly search the web, while other crawlers are more focused in the type of information they seek. The latter are referred to as “topical crawlers,” and they rely on algorithms to help find this very specific information. An algorithm is a step-by-step problem-solving procedure that arrives at a solution in a finite number of steps. In other words, if a user is looking for information on a certain subject, the crawler follows a defined procedure, and each step of that procedure helps narrow the search and eventually results in the retrieval of the information needed.
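A topical crawler can be sketched as a small variation on the basic crawl: instead of visiting pages in the order they were discovered, it scores each discovered link for relevance to the topic and always visits the highest-scoring page next (a strategy commonly called best-first crawling). The page names, texts, and word-overlap scoring below are all made-up stand-ins for illustration.

```python
import heapq

# Toy corpus (hypothetical page names and text) standing in for the Web.
PAGES = {
    "start":        {"text": "a general portal with assorted links",
                     "links": ["sports", "music-blog"]},
    "sports":       {"text": "football scores and match reports",
                     "links": ["stadium"]},
    "music-blog":   {"text": "jazz music reviews and interviews",
                     "links": ["jazz-history"]},
    "jazz-history": {"text": "the history of jazz music in america",
                     "links": []},
    "stadium":      {"text": "stadium ticket offers",
                     "links": []},
}

def relevance(page, topic):
    """Crude relevance score: how many topic words appear in the page text."""
    return sum(word in PAGES[page]["text"].split() for word in topic)

def topical_crawl(seed, topic, max_pages=4):
    """Best-first crawl: always visit the most topic-relevant frontier
    page next (heapq is a min-heap, so scores are negated)."""
    frontier = [(-relevance(seed, topic), seed)]
    visited, seen = [], {seed}
    while frontier and len(visited) < max_pages:
        _, page = heapq.heappop(frontier)
        visited.append(page)
        for link in PAGES[page]["links"]:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-relevance(link, topic), link))
    return visited

print(topical_crawl("start", {"jazz", "music"}))
# → ['start', 'music-blog', 'jazz-history', 'sports']
```

Notice that the crawler visits both jazz-related pages before the football page, even though the football page was discovered first: the relevance score, not discovery order, decides what gets crawled next.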
People are now relying on the personal recommendations of friends in their social networks to find online media, and this is most likely a trend that’s here to stay. At the same time, countless new and innovative web crawling applications are currently being developed or have yet to be invented. This means that in the future, consumers will be able to access relevant, interesting information more quickly than before -- and not only that, they will be able to specify what they’re looking for and narrow their searches in ways that are hyper-efficient.
That being said, it’s safe to assume that Internet users will always want the more personal, genuine touch of a human being when it comes to filtering the media they encounter, and especially when it comes to taking recommendations for new music, books, websites, and so forth.
What’s Happening Now
Simply put, most Internet users, no matter how long they’ve been surfing the web, don’t know how to find the information they’re looking for -- and if they do, they don’t know how to consistently find the content that’s most relevant to them. In other words, they spend a lot of time online looking at, reading, and watching things they never set out to find.
More web-savvy surfers, on the other hand, are learning to seek out highly specialized sites that aggregate the content that most appeals to them, drawing on the real experience and human judgment of the people who run the site. For example, these websites can recommend certain types of music or new bands to consumers based on their previous searches and viewing history.
Many Internet users who’ve grown sick of unsatisfactory, irrelevant search results are beginning to demand that their favorite websites offer this kind of editorial layer. Essentially, Internet users need human input to sift through all the garbage and present them with what’s truly valuable and meaningful online.
More By Joe Eitel