From behind the WorldSnowboardGuide
Tales of independent publishing

Aug
22

I added Google Analytics to worldsnowboardguide.com a few months back with a view to replacing SmarterStats, which is hosted on our server. I’m looking at replacing many of the things that run off the server with hosted solutions, as it’s always running at capacity and slowing the site down.

There are two main things I’ve always looked at in any stats program, namely referring sites and keywords, and I’ve noticed some unanticipated keywords, to say the least …

The 61st most popular keyword on WSG yesterday was “big tit huge boob porn star needs a plumber hard..and gets it!!”; tap that into Google and there it is, 6th in the search results. The 10th most popular keyword over the last month was “lolly badcock & michelle b’s snog-fest”, and it appears as the 11th search result in Google.

Now the culprit here is YouTube. I thought I was being very clever when I added a video page to all the resorts. I used the YouTube API to do a lookup on a few keywords and then display the results. The first video gets shown and then I list all the other videos that match those keywords; click on a video and it plays within the page, all nice and AJAX’y. The tags and comments on the video are also displayed, and I figured I should expand the functionality, so if you click on one of the keyword tags it reloads the page and shows videos matching that tag, using URL parameters.

I appreciated that if someone changed the URL parameter on the page, it would show any videos matching that keyword, e.g. www.worldsnowboardguide.com/resorts/switzerland/anzere/resort_video.cfm?tags=boobs, but I had no idea that these would eventually start to appear in search engine results unless another site linked to them, and to the best of my knowledge this hasn’t happened.

Not what I was originally intending ...

If anyone can explain how this has happened then please enlighten me, and in the meantime I’ll be adjusting the code …
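One guess, and it is only a guess: every tag shown on the video page is itself a crawlable link, so a search engine spider could wander from tag to tag without any outside site ever linking in. Whatever the cause, one way to adjust the code is to only accept tags from a known list and to mark anything else as not indexable (for example by emitting a `<meta name="robots" content="noindex">` tag). Here is a minimal sketch of that idea in Python rather than ColdFusion; the `ALLOWED_TAGS` list and the `resolve_video_tags` function are made up for illustration and are not the site’s real code.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical allowlist: the only tags the video page will accept from the URL.
ALLOWED_TAGS = {"snowboard", "snowboarding", "freestyle", "powder", "halfpipe"}


def resolve_video_tags(url, default_tags):
    """Return (tags_to_search, indexable) for a video page URL.

    - No tags parameter: use the resort's default keywords; page is indexable.
    - Every requested tag is on the allowlist: use them; page is indexable.
    - Anything else: fall back to the defaults and flag the page as not
      indexable, so the template can emit a robots noindex tag (or a 404).
    """
    query = parse_qs(urlparse(url).query)
    raw = query.get("tags", [""])[0]
    # Tags are assumed to be comma-separated in the URL parameter.
    requested = [t.strip().lower() for t in raw.split(",") if t.strip()]

    if not requested:
        return list(default_tags), True
    if all(t in ALLOWED_TAGS for t in requested):
        return requested, True
    return list(default_tags), False
```

With this in place, a request for `resort_video.cfm?tags=boobs` would quietly show the resort’s normal videos and tell search engines not to index that URL.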

Jul
23

It’s high time I gave an update of where I am regarding my previous post on the illegal spidering and content copying.

First the spidering instance. I was easily able to work out which ISP owned the offending IP address using cqcounter.com, and I sent an email to the address quoted in the ‘abuse-mailbox’ section, complete with the IIS log files and my own log files, pretty much everything I could find relating to this incident. To my amazement, 2 days later I received an email back from BT, and here is an excerpt:

Thank you for your email dated 9th April to the BT Customer Security Team. I am sorry to hear about the inconvenience you have had with a DoS attack, I know how frustrating this can be. May I take this opportunity to thank you for forwarding your logs.

For your information, Port scanning contravenes BT’s Acceptable Usage Policy and Terms & Conditions. BT takes any abuse of its service very seriously.

I have carried out an investigation into this and I have taken action against our user to stop this happening again.

Pretty impressive stuff, BT, and very polite! Naturally I was curious about what BT had done to stop this user from doing it again, so I sent them another email, but I never received any further communication from them. I’ve yet to receive any traffic from that IP address again, but that’s not unusual in these circumstances anyway.

Now onto the content copying. Following the advice on the Copyscape website about responding to plagiarism, I contacted the website owner and received the following reply (I have changed the name to protect the un-innocent):

SCUMBAG is primarily an online community and thus the resorts sections have been outsourced with an agreement with a third party that content is original or taken from authorized sources. Upon completion of the project, resorts were checked randomly to ensure there is no copied content (especially photographs).

We do not desire content from unauthorized sites since all content(description and stats) is easily available and updated live on other authorized sources and we paid for proper content.

Thus we are already checking the descriptions you mention and the descriptions of all resorts on SCUMBAG (and pictures) to ensure that there are no problems with your site.

Fair enough, if they remove all of the offending content then I’ll let this pass, as I have much better things to do with my time. I asked them which 3rd party they bought their (my!) content from and received a reply back from them:

The third party is an independent company which we authorized to do this job. As I have explained we will be checking all descriptions to make sure they do not conflict with WSG. If you need to initiate legal proceedings, then you should do so against us and if we deem it appropriate, we will move legally against the third party

I’ve since repeatedly checked the SCUMBAG website and they have re-worded most of the descriptions; however, the same statistics are still there and they are still using the majority of the photographs that are under WSG copyright. All far from satisfactory.

Next on the Copyscape list was to contact the hosting company of SCUMBAG. I did so repeatedly and got no reply. Also on the list is to contact all the various search engines to tell them of a DMCA infringement, the idea being that the search engines will then drop SCUMBAG from their search results. This has been done, but the process is incredibly slow to show any results.

Own-it.org came back to me with a reply:

You have done the right thing in contacting Google. You should also contact the ISP hosting the Cyprus Company and ask them to take down the infringing material or close the site.
You should contact the Cypriot company detailing all the remaining infringements and ask them to take them down immediately. You should tell them you will take legal action if they fail to do so. You could also threaten legal action against them to make them reveal the name of the person who supplied them with the material. You might also want to contact a Cypriot IP lawyer and see if they would work on a “No win- No Fee” basis to make a claim of copyright infringement against the Cypriot site. Good luck.

Now this is the problem. There are stacks of websites offering all sorts of advice on what to do in order to get them to remove the content, but when they don’t remove the content, there are very few companies which will pursue this for you and nothing in the way of state aid to help pay for this.

Under advice, I need to be careful about revealing information regarding the new legal proceedings, but an update will follow in due course.

Apr
10

Yesterday was interesting. I got back from my part-time day job (developing SharePoint applications, in case you’re interested) and checked the site activity on worldsnowboardguide.com. The stats were way up on what they should have been, and when I looked deeper into it there was a huge number of page views from a specific IP address which wasn’t one of the many recognised search engines that hammer the site. The browser user agent didn’t have anything to suggest it was a bot, but in 45 minutes they had hit over 11,000 pages on the site. It didn’t seem to be a denial of service attack, as there were very few repeated pages in the log.

I did a whois lookup on cqcounter.com for the offending IP address 81.153.117.51 and have sent all the info to the host, BT, and we’ll see what comes back. I’m now spending my Easter weekend donning my ColdFusion hat and changing the code so that in future it can detect and prevent this kind of stuff. If I get the chance I’ll try to package things up nicely and stick it on sourceforge.net, as I couldn’t find anything that did such a thing.
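For the curious, over 11,000 pages in 45 minutes works out to roughly 245 pages a minute from a single address, which is the kind of pattern a simple per-IP counter can catch. Below is a minimal sketch of that idea, written in Python rather than ColdFusion and not the code that ended up on the site. The `ScrapeDetector` class and its thresholds are hypothetical, and a real version would also need to let recognised search engine spiders through.

```python
from collections import defaultdict, deque


class ScrapeDetector:
    """Flags an IP address that requests too many pages within a time window.

    The default thresholds are illustrative guesses, not tuned values.
    """

    def __init__(self, max_hits=120, window_seconds=60.0):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        # Maps each IP address to the timestamps of its recent hits.
        self._hits = defaultdict(deque)

    def record(self, ip, timestamp):
        """Record one page view; return True if this IP is over the limit."""
        hits = self._hits[ip]
        hits.append(timestamp)
        # Drop hits that have fallen out of the sliding window.
        cutoff = timestamp - self.window_seconds
        while hits and hits[0] <= cutoff:
            hits.popleft()
        return len(hits) > self.max_hits
```

Each page request would call `record()` with the visitor’s IP address and the current time (for example from `time.time()`), and the site could then refuse or slow down any address for which it returns `True`.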

This kind of thing isn’t that unusual, but what came next was, and it turned me into a screaming nut job; I am so angry about this. While googling around for a ColdFusion component that could detect these attacks, I stumbled upon a blog on sitepoint.com that talked about a website called Copyscape.com, which basically searches for copies of your content. All you do is stick in a URL and it tells you if any of the content on that page has been used on any other website. So I did this with the URLs of some of the more popular pages on the site and found some small sites and forums that had quoted some excerpts, most of which linked back to WSG, and on the whole I’ve got no problem with this kind of thing as it gives us traffic. However, one site kept on cropping up, skoarder.com, and upon visiting their site, I flipped out.

Every single one of the resorts they cover includes a big chunk of the review from WSG, and it doesn’t stop there. Resort statistics seem to have been copied and finally, the ultimate proof, they’ve stolen a stack of photos. There is no mention of WSG and they haven’t asked permission; it is blatant theft and I’m not having it.

Here’s their review of St.Anton below (copied content shown in orange). Compare that with our St.Anton review and tell me if I’m paranoid.

Copied content of St.Anton

I then picked a really obscure resort I visited a few years back in Poland called Bialka, and guess what … Again, take a look at our review of Bialka and compare it to theirs.

Copied content of Bialka

This time they have copied not only some of the text but also a photo I took of Pete.

Talk about caught red-handed. So what do I do next? I’ve obviously written a pretty strong email to the owner, telling them in no uncertain terms to remove the content immediately, and I’ll be contacting own-it.org about starting criminal proceedings, so stay tuned to see what their response will be. Copyscape.com do have some great advice about what to do next, and it’s certainly a site I’ll recommend to anyone who wants to see what thieving so-and-sos are out there stealing their content.

Apr
10

First blog, first post. I’m sure everyone starts their blog like this, but I’ve been meaning to do this for quite a while now. I guess a combination of an amazing winter season and a genuine, if bizarre, fear about what could happen when I start spouting all the things that are inside my head has kept me away from starting this, but here we go.

I’ll kick off by explaining the point of this, or rather what I’m not trying to do. This isn’t going to be a tale of where I’ve been snowboarding this season; in fact I’ll probably hardly mention anything about the sport I love. This is going to be about the business of running WSG Media: biting off far more than I can chew, running to stand still, scratching around for cash, but not wanting to change a single thing about this stupid choice I’ve made.

I really want to talk about what it is like to develop and run a website like the worldsnowboardguide, to independently produce and publish books, and still have to do normal work for a living. There are so many things I’ve learnt over the 5 years I’ve been doing this, and far too many things I wish I’d known before I took on this venture.

As the winter season draws to a close, the real work has begun. As part of the new masterplan, WSG is set to produce 2 books this year, and many changes to the worldsnowboardguide are planned. It’s going to be a frantic time, and not only will I be writing about the whole internal process, but hopefully I’ll be getting some feedback from you to help me on my way.
