If you are a blogger, chances are you have either dealt with spammers already or will once your blog becomes more popular. These days, spammers use any means necessary to get their links onto your blog. Their tactics include link-filled comments, bogus pingbacks, and bogus trackbacks. What I’m going to focus on in this article is deciding whether a pingback or trackback is coming from a legitimate blog or not.
The example I use in this post comes from a random site behind a bogus trackback URL that I found on a Mashable.com post. I won’t be linking directly to the example site because that is exactly what those spamming bastards want. Determining whether a blog is fake or real is easy once you figure out the patterns. Granted, these patterns change from time to time, but here is a collection of tactics I use to determine whether a blog is fake or not.
Precautions First:
When you discover that someone has linked to your post, the first thing you should do before visiting the site to check its authenticity is make sure you have popup-blocking software turned on as well as anti-virus software. I use Adblock Plus, an awesome Firefox extension, and I highly recommend it. The reason for these precautions is that it doesn’t take much to get infected with something, especially if you run a Windows-based machine that doesn’t have the latest security updates.
Checking The Theme:
The first thing to check when visiting the source of the trackback URL is the blog’s theme. A lot of spammers will generate a blog with the default theme, and in the case of WordPress, this theme is called Kubrick. Here is an example of what I’m talking about.
Kubrick is actually a fantastic default theme for WordPress, and quite a lot of people end up using it. Spammers also use other themes; in fact, I’ve noticed many of the sploggers are now using themes other than Kubrick. The theme alone won’t settle it, so this is when it’s time to evaluate the content of that particular site. But before we move on, I want to show you something that appears on this blog that should never appear on ANYONE’S blog.
Don’t worry, this is only an image. This is what I found on this particular example of a splog. If you were to click on this banner, you would probably be infected with some sort of adware or trojan, even if you were protected by software. No blog should ever display an advertisement like this. It is a dead giveaway that you should get the hell out of there before it’s too late.
Checking Out The Content:
Let’s take a closer look at the content posted within the image above. That post generated a trackback URL on Mashable.com, a very popular website covering social networking and all that jazz. That is a good score for the spammer, as they are sure to receive some sort of traffic through that backlink. Within this image, the title of the post matches the title of the original post on Mashable. The next dead giveaway is the text “By Charles”. There is no one on that blog by the name of Charles. In my experience, the spammer’s software automatically places a random name into the Author Field of the post. This author name usually links to the original post, but in this case, the author name is not linked.
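The title check is easy to automate if you deal with a lot of trackbacks. Here is a minimal Python sketch of the idea, not the way any particular anti-spam plugin does it. The function names are my own inventions, and the sample titles are made up for illustration. It ignores case, punctuation, and extra spaces so that cosmetic differences don’t hide a copied title.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace (hypothetical helper)."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", title.lower())
    return " ".join(cleaned.split())


def title_is_copied(original_title: str, linking_title: str) -> bool:
    """Return True when the linking post reuses the original post's title."""
    return normalize_title(original_title) == normalize_title(linking_title)


if __name__ == "__main__":
    # Made-up titles, only to show how the check is called.
    print(title_is_copied("Ten New Social Sites!", "ten new  social sites"))
    print(title_is_copied("Ten New Social Sites!", "My thoughts on social sites"))
```

A matching title is only one signal. Plenty of honest bloggers quote a title when they link to a post, so treat a match as a reason to look closer and not as proof.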
Another sign of a splog is the related content. In the screenshot, you can see the title of the blog is Social Sites News. And since they linked to Mashable, you would think this blog is about social networking and web 2.0 stuff. So why, then, is there a link near the top of the page to an article titled “Great Barrier Reef holds drug key to diseases”? The reason is that these spammers use software that resembles search engine spiders. It crawls the internet for content that contains any of a predefined list of keywords. Once an article containing a keyword is discovered, the software scrapes the content and then links to it, generating a trackback or pingback URL. Here is some evidence that further substantiates my claims.
Each keyword this splog is targeting is labeled as a category. This is just a sample of the categories listed on this splog. I recognize that there are bloggers out there who blog about A LOT of different subjects, and each one of those subjects can be a category. Thankfully, there are other attributes that help decide whether the site is legit or a splog.
Checking The URL:
Another key aspect I’ve discovered is the actual URL of the generated trackback/pingback. Any common-sense blogger nowadays uses something called SEO-friendly URLs. These are URLs generated by your blogging software that make it easier for search engine spiders to crawl your content. Most of the automated splogs I have come across are not using these types of URLs. Instead, they are using links like this: http://splogsite.com/?p=519. All of that stuff after the / is what you need to be concerned with. These are ugly, unhelpful URLs that any smart blogger will stay away from using. 99% of the sploggers I have encountered have links that look very similar to the one above.
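If you want to check a batch of trackback URLs at once, the pattern above is simple to test for. This is a minimal Python sketch under my own assumptions: the function name is hypothetical, and it only looks for the bare `?p=<number>` form shown above, which is what WordPress produces when permalinks are left at their default setting.

```python
from urllib.parse import urlparse, parse_qs


def uses_default_permalink(url: str) -> bool:
    """Return True for URLs shaped like http://splogsite.com/?p=519 (hypothetical helper)."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    post_ids = params.get("p", [])
    # The post ID hangs directly off the site root, with no descriptive path.
    return parsed.path in ("", "/") and any(value.isdigit() for value in post_ids)


if __name__ == "__main__":
    for candidate in (
        "http://splogsite.com/?p=519",
        "http://example.com/2008/05/spotting-splogs/",
    ):
        print(candidate, uses_default_permalink(candidate))
```

Keep in mind that some real bloggers never change the default setting, so a hit here should count as one strike against the site and nothing more.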
The Default Meta:
This part is directed more toward users of WordPress than anything else. With each install of WordPress, the META block is displayed by default in a sidebar. There are legitimate bloggers who leave this block in place, but the majority of people do not, and for good reason. The first is a security issue.
The META block contains a LOGIN link which takes you to the blog’s administration login page. Taking this away forces the user to manually browse to the login page to log in. Sure, it’s not the best security measure, but that stroke of inconvenience will help. At least you won’t have every John Doe coming across your site, clicking the login link, and playing around to see if they can guess your username and password. My overall advice is to bookmark the login link, and then disable that block from being displayed or at least delete the LOGIN link.
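To confirm the LOGIN link is really gone from your own pages, or to see whether a suspicious blog still shows it, you can scan the page’s HTML for a link to the login script. This is a rough Python sketch and not an official WordPress tool. The class and function names are my own, and it assumes the login page lives at the standard `wp-login.php` path. It works on HTML you have already saved or downloaded.

```python
from html.parser import HTMLParser


class LoginLinkFinder(HTMLParser):
    """Collects the href of every <a> tag that points at wp-login.php (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.login_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if "wp-login.php" in href:
            self.login_links.append(href)


def has_login_link(html: str) -> bool:
    """Return True when the page exposes a link to the WordPress login page."""
    finder = LoginLinkFinder()
    finder.feed(html)
    return bool(finder.login_links)


if __name__ == "__main__":
    sidebar = '<ul><li><a href="http://example.com/wp-login.php">Login</a></li></ul>'
    print(has_login_link(sidebar))
```

Removing the link does not hide the login page itself, so this only tells you whether the shortcut is still being advertised in the sidebar.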
Conclusion:
This is by no means the end-all, be-all of ways to tell a legitimate blog from a splog. These are all tactics that I use for this blog in determining whether a trackback or a pingback is actually legitimate. I will admit, I did comment on a blog one time, thanking them for linking to me. At first glance, they looked pretty legitimate, but I later found out they had scraped the content of a Mashable post and published the entire article word for word. Since the Mashable article linked to me, this splogger also linked to me. After that experience, I told myself that I would closely examine any other site that linked to me to determine its legitimacy.
If you feel up to taking these bastards head on, you can check out a post that Lorelle published on her blog, How to Stop Content Theft: The Best Tips, which has tips and suggestions on how to report these time wasters.
I want to take this time to remind you that as a blogger, it is your responsibility to ensure that these crappy spammers don’t fill your blog with porn links or links that would otherwise put your readers in danger. I’m sure Mashable tries to do a good job of combating spam and deleting bogus trackback URLs, but as my example above shows, they can’t get every one of them. As a reader, if I were to click a URL on Mashable.com that clearly looked related to the article in question, and that site ended up infecting me, I sure as hell would hold Mashable.com responsible for the infection. Wouldn’t you? If every blogger did their part with their own blogs to combat this problem, I’m pretty sure that spamming blogs would become a business model not worth pursuing.