If you don’t monitor all online sources, you won’t find all the news that is available. Based on our unique technology for monitoring the Internet, Scanmine can provide in-depth, up-to-date coverage of any topic, in multiple languages.
All sites that are relevant to your topic are monitored. In addition, general newspapers, magazines, blogs, and corporate and governmental sites are scanned for articles that may be relevant to you and your industry.
We focus on sites that matter in your industry
Some sources are more important to you and your industry than others. New and important stories are often posted on smaller sites before being picked up by major newspapers. If you want to be ahead of the game, you should know before everyone else does.
This is how it works
Scanmine is not limited to the content that each site chooses to publish in its RSS feeds. By interpreting web pages the same way a browser does, we can analyze the implicit content structure of a page and automatically identify and extract articles and other information from it.
Searching for new articles
Thousands of web sites with “teaser” pages (A), pages that list short teasers linking to full articles, are continuously monitored in search of new articles. For each site, Scanmine analyses the structure of the web pages, identifies new teasers (red box) and follows their links to the article page (B). Sometimes sites post links to relevant articles on sites not initially monitored by Scanmine. By following such links, Scanmine can also pick up relevant articles from sites that were not set up for monitoring.
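The teaser step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Scanmine's actual implementation: it treats every link on the page as a potential teaser, and the names `TeaserLinkParser`, `find_new_teasers` and `seen_urls` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TeaserLinkParser(HTMLParser):
    """Collects [absolute_url, link_text] for every <a href> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the teaser page's URL.
                self.links.append([urljoin(self.base_url, href), ""])
                self._in_link = True

    def handle_data(self, data):
        if self._in_link:
            self.links[-1][1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False


def find_new_teasers(html, base_url, seen_urls):
    """Return (url, text) for links not seen before and remember them.

    Assumption: a link counts as "new" if its URL is not in seen_urls.
    Links to other domains are kept, so articles on sites that are not
    monitored can still be followed.
    """
    parser = TeaserLinkParser(base_url)
    parser.feed(html)
    new_teasers = []
    for url, text in parser.links:
        if url not in seen_urls:
            seen_urls.add(url)
            new_teasers.append((url, text.strip()))
    return new_teasers
```

Calling `find_new_teasers` again on the same page with the same `seen_urls` set returns an empty list, which is how repeated monitoring would report only what has changed since the last visit.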
Finding the whole article
On the article page, Scanmine separates the article from the rest of the content on the page and finds the various elements of the article, such as the headline, byline, publishing date, summary or abstract, and pictures with captions (D). Unlike the monitoring of RSS feeds, this allows us not only to find articles earlier but also to use the structure of the whole article in our analysis.
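As a rough illustration of separating an article from the surrounding page, the sketch below picks out the headline, publishing date and body paragraphs. It assumes the page marks up the article with `<article>`, `<h1>`, `<time datetime="...">` and `<p>` elements; many real pages do not, and Scanmine's own analysis is not described at this level of detail. `ArticleParser` and `extract_article` are hypothetical names.

```python
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Keeps only content found inside an <article> element."""

    def __init__(self):
        super().__init__()
        self.fields = {"headline": "", "date": None, "paragraphs": []}
        self._article_depth = 0   # > 0 while inside <article>
        self._current = None      # "h1" or "p" while inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._article_depth += 1
        elif self._article_depth and tag == "time":
            # Keep the first machine-readable date found in the article.
            if self.fields["date"] is None:
                self.fields["date"] = dict(attrs).get("datetime")
        elif self._article_depth and tag in ("h1", "p"):
            self._current = tag
            if tag == "p":
                self.fields["paragraphs"].append("")

    def handle_data(self, data):
        if self._current == "h1":
            self.fields["headline"] += data
        elif self._current == "p":
            self.fields["paragraphs"][-1] += data

    def handle_endtag(self, tag):
        if tag == "article" and self._article_depth:
            self._article_depth -= 1
        elif tag == self._current:
            self._current = None


def extract_article(html):
    """Return headline, date and paragraphs, ignoring navigation and footers."""
    parser = ArticleParser()
    parser.feed(html)
    fields = parser.fields
    fields["headline"] = fields["headline"].strip()
    fields["paragraphs"] = [p.strip() for p in fields["paragraphs"] if p.strip()]
    return fields
```

Text in menus, sidebars and footers falls outside the `<article>` element and is dropped, which is the simplest form of the separation described above.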
Finding other types of information
Sometimes you might be interested in information found in listings and tables (C), such as overviews of events or technologies published on multiple sites. Scanmine can also monitor, identify and update specific information from multiple sites, even when the sites use different formats.
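One way to picture the handling of different formats is a per-site mapping onto a common record. The sketch below is an assumption-laden example, not Scanmine's method: the site names, column names and date formats in `SITE_FORMATS` are invented for illustration, and rows are assumed to have already been extracted from each site's table.

```python
from datetime import datetime

# Hypothetical per-site descriptions of where each field lives.
SITE_FORMATS = {
    "site_a": {"name": "Event", "date": "Date", "date_format": "%Y-%m-%d"},
    "site_b": {"name": "title", "date": "when", "date_format": "%d/%m/%Y"},
}


def normalize_row(site, row):
    """Convert one site-specific row into a common record."""
    fmt = SITE_FORMATS[site]
    date = datetime.strptime(row[fmt["date"]], fmt["date_format"]).date()
    return {
        "name": row[fmt["name"]].strip(),
        "date": date.isoformat(),
        "source": site,
    }


def merge_listings(listings):
    """Merge rows from several sites, keeping the first copy of each event.

    Assumption: two rows describe the same event when the name
    (case-insensitive) and the date both match.
    """
    merged = {}
    for site, rows in listings.items():
        for row in rows:
            record = normalize_row(site, row)
            key = (record["name"].lower(), record["date"])
            merged.setdefault(key, record)
    return sorted(merged.values(), key=lambda r: (r["date"], r["name"]))
```

Adding a new site then means adding one entry to `SITE_FORMATS` rather than writing new extraction logic.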