Training Junk Mail filter using Apple Mail and GMail IMAP Connector

Like most people, I get literally thousands of spam messages a month. I never see them, of course, because they’re filtered out by Gmail’s incredible spam filtering system.

This is all well and good if you’ve got a Gmail account, but what if you want Google-quality filtering for your business or other mail accounts?

Well, thanks to Gmail IMAP support, we now have a massive and incredibly accurate data set for training Apple Mail’s junk filter. Gmail IMAP lets you browse folders other than your inbox. So once you’ve added your Gmail account to your Apple Mail account list, simply browse to the Spam folder, select all, then mark those messages As Junk.

This simple act trains Mail’s Bayesian spam filter on everything in your Gmail spam folder, which should improve its performance right away. In my case, the spam folder usually holds 3000-4000 messages, a hefty sample that would take a lot of clicking to collect by hand.

Thanks Google!

Jackrabbit, Wicket, Tomcat, Maven2… hell.

What follows are lessons learned migrating to the potentially magnificent Maven2 for dependency management.

Put <scope>provided</scope> on Tomcat shared resources in your pom.xml

If you deploy jars as a shared resource on Tomcat (i.e. put the jars in common/lib), be sure to add <scope>provided</scope> to those dependencies in your project’s pom.xml. Otherwise, you’ll get absolutely daft class-cast errors on shared resources like:


2007-10-21 12:42:19,425 ERROR 0-SNAPSHOT] - Servlet /myExample2-1.0-SNAPSHOT threw load() exception
java.lang.ClassCastException: org.apache.jackrabbit.core.jndi.BindableRepository cannot be cast to org.apache.jackrabbit.core.jndi.BindableRepository

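Hahahahahahaha I think I want to kill myself. The problem is that Tomcat’s shared libraries are loaded by a different classloader than your web-app’s own libraries (which is nice in a way, because it means you can use different versions of log4j or whatever). If the same jar sits in both places, the JVM treats the two copies of a class as different types, which is why a cast from BindableRepository to BindableRepository can fail.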

So the lesson here is: Anything you want created by Tomcat and loaded by name (e.g. “jcr/repository”), be sure to exclude from your WEB-INF/lib when you deploy.
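A quick way to spot offenders before deploying is to compare the jars in your webapp against the ones Tomcat already shares. Here is a minimal sketch; the find_duplicate_jars helper name is made up for illustration, and it only catches jars with identical file names, so a different version of the same library will slip through.

```shell
# find_duplicate_jars WEBAPP_LIB_DIR SHARED_LIB_DIR
# Prints the file name of every jar that exists in both directories.
# (Hypothetical helper; matches on exact file name only.)
find_duplicate_jars() {
    for jar in "$1"/*.jar; do
        # Skip the unexpanded pattern when the directory has no jars.
        [ -e "$jar" ] || continue
        name=$(basename "$jar")
        if [ -e "$2/$name" ]; then
            echo "$name"
        fi
    done
}
```

Assuming the exploded webapp is under target/ (adjust the path to your build), you would run something like: find_duplicate_jars target/myExample2-1.0-SNAPSHOT/WEB-INF/lib "$TOMCAT_HOME/common/lib". Anything it prints is a candidate for <scope>provided</scope>.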

You can load the same shared resource by name for all apps

Deploying a Maven2-enabled app using Codehaus Mojo is a breeze… unless you want to deploy a context with it. And a context is the only way to load up named shared resources like a Jackrabbit repository. The solution?

$TOMCAT_HOME/conf/Catalina/$HOSTNAME/context.xml.shared

The contents are loaded for all contexts. Brilliant.

Class blahblah violates loader constraints

Oh no. This was awful. For me it was:


2007-10-21 13:16:26,331 ERROR 0-SNAPSHOT] - Exception starting filter DataServlet
java.lang.LinkageError: Class org/slf4j/ILoggerFactory violates loader constraints

I needed to scour the dependencies that Maven was loading into my webapp automatically and explicitly label them as provided.
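The scouring can be partly automated. Here is a rough sketch that filters the output of mvn dependency:tree down to the dependencies Maven will package into WEB-INF/lib (compile and runtime scope). The non_provided_deps name is my own, and the filter assumes the usual groupId:artifactId:type:version:scope line format.

```shell
# non_provided_deps: reads `mvn dependency:tree` output on stdin and prints
# the coordinates of every dependency with compile or runtime scope.
# (Hypothetical helper; assumes each line ends with the scope.)
non_provided_deps() {
    grep -E ':(compile|runtime)$' |
        # Drop the "[INFO]" prefix and the tree-drawing characters.
        sed 's/^\[INFO\][^A-Za-z]*//'
}
```

Usage would be: mvn dependency:tree | non_provided_deps. Anything in that list that Tomcat already provides (slf4j, in my case) needs to be declared explicitly with <scope>provided</scope>.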

Virtualising Magnolia CMS

In a sharp left turn for the danwalmsley.com oeuvre, what follows are instructions for virtualising Magnolia CMS. Magnolia is an open-source content management system available in both community and enterprise versions, and it is elegant and easy to use.

If you want to deploy Magnolia for multiple clients across multiple virtual domains hosted through a single instance of Apache HTTPD and a single instance of Apache Tomcat, read on.

Virtualising Magnolia

This setup virtualises Magnolia for an Apache virtual host, serving the public site under / and the authoring interface under /admin.

What you get:

  • myclient.com: Magnolia public content (i.e. everything normally under /magnoliaPublic)
  • myclient.com/admin: Magnolia authoring interface (i.e. everything normally under /magnoliaAuthor)

Variables:
$CLIENT = client name, e.g. myclient
$CLIENT_HOSTNAME = client host name to be proxied, e.g. myclient.com
$CLIENT_PORT = selected port on which Tomcat should host the site, e.g. 8082.
$TOMCAT_HOME = installation directory of Tomcat

Tomcat Config

mkdir $TOMCAT_HOME/webapps_$CLIENT

unzip magnoliaPublic.war to webapps_$CLIENT/ROOT
unzip magnoliaAuthor.war to webapps_$CLIENT/admin

cd $TOMCAT_HOME/conf/Catalina

mkdir $CLIENT_HOSTNAME
(e.g. mkdir myclient.com)
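The directory steps above can be sketched as a pair of shell functions. The function names and the WAR_DIR argument (wherever you downloaded the two war files) are assumptions for illustration, not part of Magnolia or Tomcat.

```shell
# prepare_client_dirs TOMCAT_HOME CLIENT CLIENT_HOSTNAME
# Creates the per-client webapp directories and the per-host config directory.
prepare_client_dirs() {
    mkdir -p "$1/webapps_$2/ROOT" \
             "$1/webapps_$2/admin" \
             "$1/conf/Catalina/$3"
}

# unpack_magnolia_wars TOMCAT_HOME CLIENT WAR_DIR
# Unzips the public instance to ROOT and the author instance to admin.
# WAR_DIR is a hypothetical directory holding the two downloaded war files.
unpack_magnolia_wars() {
    unzip -q "$3/magnoliaPublic.war" -d "$1/webapps_$2/ROOT"
    unzip -q "$3/magnoliaAuthor.war" -d "$1/webapps_$2/admin"
}
```

For example: prepare_client_dirs "$TOMCAT_HOME" myclient myclient.com, followed by unpack_magnolia_wars "$TOMCAT_HOME" myclient ~/downloads.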

edit $TOMCAT_HOME/conf/server.xml to include the following elements:




      
        
        

      

Configure Apache

Enable mod_proxy

Create a config, $CLIENT.conf:


    ServerName $CLIENT_HOSTNAME

    ProxyRequests Off
    
        Order deny,allow
        Allow from all
    

    ProxyPass / http://localhost:$CLIENT_PORT/
    ProxyPassReverse / http://localhost:$CLIENT_PORT/

    
        Order allow,deny
        Allow from all
    

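Here is a minimal sketch of generating $CLIENT.conf from the variables above. The directives are the ones listed in the config; the enclosing VirtualHost, Proxy and Location containers (and the *:80 address) are my assumption about how they are grouped, and write_client_vhost is a made-up helper name. The Order/Allow lines are Apache 2.2-style access control, as in the config above.

```shell
# write_client_vhost CLIENT CLIENT_HOSTNAME CLIENT_PORT [OUTPUT_DIR]
# Writes CLIENT.conf into OUTPUT_DIR (default: current directory).
# Hypothetical helper; the container tags below are assumed, not confirmed.
write_client_vhost() {
    cat > "${4:-.}/$1.conf" <<EOF
# Assumed container: a name-based virtual host on port 80
<VirtualHost *:80>
    ServerName $2

    # Act as a reverse proxy only, never as an open forward proxy
    ProxyRequests Off
    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    # Forward everything to the client-specific Tomcat port
    ProxyPass / http://localhost:$3/
    ProxyPassReverse / http://localhost:$3/

    # Assumed container for the second access-control block
    <Location />
        Order allow,deny
        Allow from all
    </Location>
</VirtualHost>
EOF
}
```

For example, write_client_vhost myclient myclient.com 8082 writes myclient.conf to the current directory, ready to be included from your Apache configuration.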