How the langdetect library can easily be improved to recognise Luxembourgish text accurately.

Continue reading “Detecting Luxembourgish using a spam filter and wikipedia”
Guillaume Rischard's irregular notes
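Debian systems can accumulate a lot of cruft. Fortunately, it is quite easy to clean up your installed packages by marking libraries and other support packages as automatically installed, so that aptitude can remove them once nothing depends on them. The pattern below selects packages that are installed (~i), not already marked automatic (!~M), not build-essential, and either essential (~E), of priority required (~p), or in one of the listed sections (~s). Most people would probably be okay with this: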
aptitude markauto '~i!~M!~nbuild-essential(~E|~prequired|~sdevel|~sinterpreters|~slibdevel|~slibs|~soldlibs|~sperl|~spython)'
You can then list the non-essential packages that are still marked as manually installed with:
aptitude search '~i!~M!?essential'
Hurricane Electric’s IPv6 certification is fun. The most difficult part was getting OVH to add an IPv6 glue record for stereo.lu: their management interface simply doesn’t support it! They have the least expensive .lu domains around, but sometimes you get what you pay for.
“the finesse of their witticisms and the gaiety of their laughter”. Funny, isn't it?
If you have chkrootkit installed on Debian etch, you might be getting these messages:
/etc/cron.daily/chkrootkit:
The following suspicious files and directories were found:
/lib/init/rw/.ramfs
This is a bug in initscripts, which creates a hidden file in /lib/init/rw when it shouldn’t. Until it is fixed, you can patch chkrootkit to get rid of that message.
Update: Apparently, chkrootkit also stumbles on /lib/init/rw/.mdadm. Adapt the patch as needed.