Saturday, February 17, 2007

New food blog scraper site

I have found some of my content scraped onto this site (not using the real URL so they do not get link-goodness from this post) "cooking hyphen joy dot net".

They have not yet monetized (ads on site), as far as I can tell, but they may be working up to that.

You might want to keep an eye on it for your content (looks like it's a mash-up from Technorati tags). I cannot find any contact info. I have not yet done a whois. I might when I have the time, but I was thinking I would wait until they monetized to complain.


This Post was written by Nika from Nika's Culinaria

6 comments:

FJKramer said...

I checked it out and almost every post is the first result that pops up in a Google blog search of the topic shown, and the few Technorati tags I checked came up empty.

But you are in good company. Here are just a few of the blogs that have been scraped:
Chowhound, Slashfood, Savory Notebook, Something So Clever, Mr. Toads, Thought4Food, Lucullian Delights

Anonymous said...

French food blogs are starting to have the same problem. It's terrible and really hard to fight.

Those reblogging platforms often simply display the content of your RSS feed.

One solution can be to avoid showing the full post in your RSS feed: publish just an abstract or the first n characters of the post.

Second rule: never link to those websites.
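
The truncated-feed idea can be sketched roughly as follows. This is a minimal Python sketch, assuming you build the feed entries yourself; the function name `make_excerpt` and the 200-character default are hypothetical, chosen only for illustration, and most blogging platforms expose a short-feed setting that makes code like this unnecessary.

```python
import html
import re


def make_excerpt(post_html: str, max_chars: int = 200) -> str:
    """Return a plain-text excerpt for a feed entry (hypothetical helper).

    The excerpt holds at most max_chars characters of the post text,
    followed by "..." when the post had to be cut.
    """
    # Crude tag stripping; a real feed generator would use an HTML parser.
    text = re.sub(r"<[^>]+>", " ", post_html)
    # Decode entities such as &amp; and collapse runs of whitespace.
    text = " ".join(html.unescape(text).split())
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Back up to the last space inside the cut so the excerpt ends on a whole word.
    if " " in cut:
        cut = cut.rsplit(" ", 1)[0]
    return cut + "..."
```

The result would go in the feed entry's description in place of the full post body, so a site that only republishes the feed gets the teaser rather than the whole article.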

Ilva said...

Thanks, Laurent. Being one of the above-mentioned food bloggers, I have now switched to a short RSS feed. I hope it will work!

Anonymous said...

Ilva: it seems like long time no hear! Does it ever turn cold and wintery there? Your photos look just as verdant as they do in the summer! You do not want to see what our day looks like. We have many inches of ice (not snow) on the ground and more snow falling.

Laurent: good idea, though I think I remember reading somewhere that shortened RSS is less popular in some way (can't remember why... need to Google it).

Kalyn Denny said...

As someone who reads hundreds of food blogs with RSS, I really hate short feeds. I will tell you quite honestly, I rarely click through to read the rest when it's a short feed.
I'm not sure it's worth it because in my opinion, it doesn't really stop the scrapers anyway.

Anonymous said...

In this instance a "digest" RSS feed is not going to make a bit of difference: they're only using the search results, which are abstracts anyway. It probably qualifies as fair use as much as Google results do.

I'm with Kalyn ... feeds are a courtesy and convenience to keep your readers. A mini feed is only good to alert your readers that you have a new post. You can do that with an email digest. I rarely click through on sites that use the digest version (unless it's a site that does A LOT of posting, like 20 in one day) so I'll dump them from my aggregator when they switch over.

All that said, that's not technically a scraper site. They're not reposting your entire articles, just making posts of search results with links. (It's when they scrape and don't link back that's especially nefarious.) The site may never be "monetized" in a way obvious to us; it may simply exist to create copious amounts of content for some other bot to use to increase someone else's rankings.

I did some poking around with Whois and DomainTools, and if you want to bring these guys down, you'll have a long road ahead.