Content scraping is the practice of extracting content from websites and then automatically republishing it on other websites (or blogs), usually via RSS.
I have seen content scraping happen on this blog as well. I installed a WordPress plugin called “RSS Footer” that adds an extra line of content to articles in my RSS feed.
Now each feed item says “This post is originally posted at The Home Business Archive”, with a link back to my post. At least I get a link back.
Other than that, I can’t think of a way to prevent content scraping.
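To make the feed footer idea concrete, here is a minimal Python sketch of what such a footer does. It is not how the RSS Footer plugin is implemented; the function name `add_feed_footer` and its parameters are made up for illustration, and the example URL is a placeholder.

```python
from html import escape


def add_feed_footer(content_html, post_url, site_name):
    """Return the feed item's HTML with an attribution line appended.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Escape the URL and site name so they cannot break the markup.
    footer = (
        "<p>This post is originally posted at "
        f'<a href="{escape(post_url, quote=True)}">{escape(site_name)}</a>.</p>'
    )
    return content_html + "\n" + footer


# Example: the footer travels with the content when a scraper republishes the feed.
item = add_feed_footer(
    "<p>My article text.</p>",
    "https://example.com/my-post/",  # placeholder URL
    "The Home Business Archive",
)
print(item)
```

Because the line is part of the feed content itself, a scraper that republishes the feed unchanged also republishes the link back to the original post.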
These bloggers discuss content scraping:
Why blog content scraping is a pain. Why you shouldn’t ignore it …
Why blog content scraping and plagiarism is a pain, why you shouldn’t ignore it & what to do. A look at a recent example of our logo blog getting ‘scraped’ and why this growing plagiarism trend is an issue in the design community.
Google Maps Scraping for Content Online Radio / Podcast Network
Writing in his Understanding Google Maps and Local Search blog, Blumenthal takes a deep, technical look at Google’s new practice of scraping content from local news, review and event websites and applying that content to business …
Content Scraping: Is Someone Ripping Your Content AND Good Name?
Content scrapers are simple programs that scrape content on topic from sites or blogs for posting to the content scraper’s site. The sole purpose of content scrapers is to rip content, post it to a series of junk sites that are slathered with PPC and paid advertising, and make money on clickthroughs. …
Do you have any good ideas of how to prevent content scraping? Leave a comment.
{ 9 comments… read them below or add one }
Hey, I’m just getting started with WordPress and obviously have a lot to learn. Thanks for the info about scraping – I’m going to read more on the blogs you’ve recommended.
You just have to insert a bot trap which logs and blocks IP addresses. Hide a link to the bot trap in your menu, then dynamically block the IPs that follow it from the rest of your site. This is also really effective at blocking spam comments and other malicious bot activity.
It is also possible to block repeated requests from certain IPs. Most users only read 4-20 pages per day. A bot will grab a lot more than this, so a simple log check with a whitelist for search engine bots will do well.
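The bot trap and the log check described above could be sketched roughly like this in Python. Everything here is an assumption for illustration: the trap path `/bot-trap/`, the limit of 20 pages per day (taken from the upper end of the range mentioned above), the whitelist contents, and the idea that the day's log has already been parsed into `(ip, path)` pairs.

```python
from collections import Counter

TRAP_PATH = "/bot-trap/"        # hypothetical hidden link that no human should follow
MAX_PAGES_PER_DAY = 20          # assumed limit, based on "4-20 pages per day"
WHITELIST = {"192.0.2.10"}      # placeholder IP standing in for a search engine bot


def find_ips_to_block(requests, max_pages=MAX_PAGES_PER_DAY,
                      whitelist=WHITELIST, trap_path=TRAP_PATH):
    """Return the set of IPs that hit the trap or read too many pages.

    `requests` is an iterable of (ip, path) pairs covering one day of traffic.
    """
    counts = Counter()
    trapped = set()
    for ip, path in requests:
        if ip in whitelist:
            continue  # never block whitelisted search engine bots
        counts[ip] += 1
        if path == trap_path:
            trapped.add(ip)  # followed the hidden link, so almost certainly a bot
    heavy = {ip for ip, n in counts.items() if n > max_pages}
    return trapped | heavy


# Example day of traffic, using documentation-only IP addresses.
day = (
    [("198.51.100.1", "/post-1/")] * 5       # ordinary reader
    + [("198.51.100.2", "/post-1/")] * 25    # grabs far more than a reader would
    + [("203.0.113.5", "/bot-trap/")]        # followed the hidden trap link
    + [("192.0.2.10", "/post-1/")] * 100     # whitelisted crawler
)
print(sorted(find_ips_to_block(day)))
```

A whitelist of plain IP addresses is the simplest option but is easy to get wrong; checking a crawler's identity with a reverse DNS lookup is a more reliable alternative if you need it.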
Is it true that changing the content on your website frequently can help you keep a number of pages indexed? I have recently launched a website and am in the process of getting it indexed, but the indexed pages seem to drop out.
Hmmm – I don’t like the sound of content scraping – sounds like a lot of websites would end up having a lot of duplicate content on them and their value would decrease massively…
The websites which have the best long-term success are always those which are built around solid content, not scraped content. A lot of people with low-quality websites have seen their rankings dive with recent changes to the Google algorithm; quality websites haven’t had any problems, however…
The fact is that if your content is indexed first and is original (supposing that you didn’t scrape it from somewhere else), all other copies should rank lower than your website in the search results.
If you manage to add some internal links in your posts, the dude who scraped your content will just give you a boost in the search engine rankings.
Now, there is an ethical question about the concept of scraping, but let us all be reasonable: ethics died a long time ago, and now profit is all that matters.
Search a bit about content scraping; there are some quality ideas on how to protect yourself and get some free links in the process.
Hi Dominic, thanks for commenting! You are absolutely right! As long as you keep publishing 100% unique content, the scraper will only help you get more links. There is no reason why the search engines should index the exact same content twice.
Since the Google Panda algorithm update, the importance of content has increased greatly. New, unique and fresh content is worthwhile, and duplicate content has no value; in fact, it can be a reason for a drop in site traffic and ranking.
I just wonder how content scraping can be curbed, or if it even should be. At the end of the day, spreading the information is what matters, but only if references are made to the original work.