Morten, I see you used a regex for matching in your widefinder trial, but then you used split and list access. Why not modify the regex to capture the necessary field(s) and go from there instead?
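Roughly what I mean, as a minimal sketch. I've written it with std::regex rather than QRegExp so it compiles on its own. The pattern, the function name countPaths and the log format are all made up for illustration; none of it is taken from your code. With QRegExp the equivalent would be a call to indexIn() followed by cap(1).

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical pattern: a single capture group around the requested path.
// Assumes Apache-style lines containing "GET <path> HTTP/x.y".
static const std::regex kRequest(R"re("GET (/\S*) HTTP)re");

// Count hits per captured path. Lines that do not match are skipped.
// The field comes straight out of the match, so there is no split()
// and no intermediate list of tokens per line.
std::map<std::string, int> countPaths(const std::vector<std::string>& lines) {
    std::map<std::string, int> counts;
    std::smatch m;
    for (const std::string& line : lines) {
        if (std::regex_search(line, m, kRequest)) {
            ++counts[m[1].str()];  // m[1] plays the role of cap(1)
        }
    }
    return counts;
}
```

One regex pass per line both filters and extracts, which is where I'd expect most of the saving to come from.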
I downloaded your code and converted it to use addressRegExp.cap, and on my system the runtime dropped from 1.8s to 0.6s. I'm still validating the results, however, as I got different counts than you in a few places. In at least one of those my count was higher by 1, and on manual inspection the higher count was correct.
Still digging into it. I've been following WF for quite some time, as I use Qt4 to process a *lot* of log data. How much? Let's just say that, gzipped, the data comes close to a TB. ;)