Mar 17, 2008

Link spam and sausage?

A new form of spam has infiltrated the inbox for this site, and it is sort of interesting.

The idea is to pack 20-60 links, aggregated from various sources (payday loans, male enhancements and the like), into a single comment made up of sentence fragments. This is an attempt to take advantage of Google's link ranking algorithm: it would be as if I tried to boost my rank by creating many web sites with links to my pages.
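A comment stuffed with that many links is easy to spot mechanically. The following is only a rough sketch of the idea, not what any real filter does: the function names and the threshold of 5 links are assumptions made up for illustration, and counting `http://` occurrences will double-count an anchor tag whose visible text is also a URL.

```python
import re

# Counts URL scheme prefixes; deliberately crude, for illustration only.
LINK_RE = re.compile(r"https?://", re.IGNORECASE)


def count_links(comment: str) -> int:
    """Return the number of http(s) URL prefixes found in a comment."""
    return len(LINK_RE.findall(comment))


def looks_like_link_spam(comment: str, max_links: int = 5) -> bool:
    """Flag a comment carrying more links than the (assumed) threshold."""
    return count_links(comment) > max_links


# Hypothetical comments: one stuffed with 20 links, one ordinary.
stuffed = " ".join(f"cheap loans http://example{i}.com" for i in range(20))
ordinary = "Nice post, there is more on this at http://example.com"
```

With these made-up inputs, `looks_like_link_spam(stuffed)` is true and `looks_like_link_spam(ordinary)` is false. A spammer who knows the threshold can of course stay just under it.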

A similar approach is to post a template sentence that drops in a few keywords from the post and says how useful it is.

This will fool a Bayesian network because it will look a lot like English, in that it agrees with prior probabilities taken over a sufficiently narrow range of sentences, but it doesn't quite make sense when a human reads it.
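To see why word statistics alone can be fooled, here is a toy word-level naive Bayes scorer. It is a sketch under stated assumptions: the training sentences are invented, the priors are taken as equal, and real filters are considerably more elaborate. The point is only that the score depends on which words appear, not on whether they form a sensible sentence.

```python
import math
from collections import Counter


def train(docs):
    """Count word frequencies over a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def log_likelihood(text, counts, vocab_size):
    """Sum of log P(word | class), with add-one smoothing."""
    total = sum(counts.values())
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in text.lower().split()
    )


def spam_score(text, spam_counts, ham_counts):
    """Positive means spam-like, negative means ham-like (equal priors assumed)."""
    vocab_size = len(set(spam_counts) | set(ham_counts))
    return (log_likelihood(text, spam_counts, vocab_size)
            - log_likelihood(text, ham_counts, vocab_size))


# Invented training data, purely for illustration.
spam_counts = train(["cheap payday loans online", "buy cheap pills online now"])
ham_counts = train(["this post on spam filters is very useful",
                    "a very useful post on language processing"])

sensible = "this post on spam filters is very useful"
scrambled = "useful very is filters spam on post this"
```

Here `spam_score(sensible, spam_counts, ham_counts)` and `spam_score(scrambled, spam_counts, ham_counts)` come out the same (up to floating-point rounding), and both are negative, so the scrambled sentence passes as legitimate even though no human would accept it.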

Actually, it reads exactly like students' work when they "write by Google", i.e. cut and paste from websites.

I'm wondering whether a better natural language processing tool would make a good spam filter. It might even make my life easier.

Written by Rob in: pedagogy, security
