Content filters

Over the last few weeks, I have been experimenting with various content filters. The experiment is mainly geared towards a crazy idea of mine – blocking a few URIs I tend to spend a lot of time on! Since the user we’re blocking has root access :) , the more steps it takes to disable the filter, the better!

I will put down a list of items I tried (in order of increasing complexity).

Firefox content blocker

Procon-latte absolutely rocks. Set the appropriate filters, and then set the password blindfolded (a blind password is not recommended; it can render Firefox useless). Pretty simple :)

  • Close your eyes and type a few random keys into the Firefox search bar, then ctrl+a (select all), ctrl+x (cut).
  • Paste the cut text into the Procon Latte password field twice.
  • You’re done. Firefox is blocked. Beware, this will make your Firefox unusable: you won’t be able to modify any Procon settings later.

Procon Latte actually blocks based on text, so search engine results are blocked too. Also, I use Opera as my primary browser, so we are sorta back to square one!

I will repeat: a blind password is not recommended. Do it only if you know what you are doing.

The hosts file

Fill in the blacklisted hostnames in /etc/hosts and redirect them to 127.0.0.1. E.g.

#<ip-address>   <hostname.domain.org>   <hostname>
127.0.0.1       foobar.com              www.foobar.com

Make things a little tough for root to modify:

sudo chattr +i /etc/hosts

This command blocks modifications to the file at the file system level, so the hacker has to do a chattr -i before editing the hosts file.
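A minimal sketch of how the two steps above could be scripted. The helper name hosts_entries is my own invention for illustration, and it only prints lines; the root-only steps are left as comments.

```shell
# hosts_entries DOMAIN...: print one /etc/hosts line per domain, pointing
# both the bare name and its www. alias at 127.0.0.1.
# (hosts_entries is a hypothetical helper, not part of any tool.)
hosts_entries() {
    for domain in "$@"; do
        printf '127.0.0.1\t%s\twww.%s\n' "$domain" "$domain"
    done
}

# Usage (needs root, so it is commented out here):
#   sudo chattr -i /etc/hosts
#   hosts_entries foobar.com | sudo tee -a /etc/hosts
#   sudo chattr +i /etc/hosts
```

Printing first and appending second means the new lines can be checked before the file is locked again.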

Not good. I broke this again. Time for a new approach.

Block at DNS

I run a DNS cache on my localhost. I set it up to use OpenDNS and then blocked the relevant domains there. With this, a DNS query for a blocked domain returns the OpenDNS “bad URI” IP 208.67.219.130.

There is a problem: my ISP uses DHCP and updates the nameservers on each connect. A chattr +i on /etc/resolv.conf blocks that modification. But /me does this:

dhcpcd -x                   # close any existing dhcp connection
chattr -i /etc/resolv.conf  # allow editing this file
dhcpcd                      # fetch me the old ip addr and updates my nameservers in resolv.conf

OK, there is a dhcpcd.conf option to not modify /etc/resolv.conf, but that’s still easy to undo.
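For reference, the dhcpcd.conf option I have in mind is "nohook resolv.conf" (check man dhcpcd.conf on your version before relying on it). A small sketch for adding it exactly once; ensure_line is a made-up helper name:

```shell
# ensure_line FILE LINE: append LINE to FILE unless an identical line
# is already present. (Hypothetical helper, for illustration only.)
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (as root):
#   ensure_line /etc/dhcpcd.conf 'nohook resolv.conf'
```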

WTF! It’s impossible. Thou art r00t.

The above step pretty much works, just that I need to block the dns changes. What the heck, time to figure out something in terms of those TCP/UDP/IP packets.

DNS queries go out as UDP packets (and sometimes TCP) to port 53. Why not block them on my system itself? Here are the iptables one-liners:

sudo iptables -I OUTPUT 1 -p udp --dport 53 -j REJECT        # reject all outgoing packets on port 53
sudo iptables -I OUTPUT 1 -p tcp --dport 53 -j REJECT

# Allow outgoing connections to opendns nameservers only :)
sudo iptables -I OUTPUT 1 -p udp -d 208.67.222.222 --dport 53 -j ACCEPT
sudo iptables -I OUTPUT 1 -p tcp -d 208.67.222.222 --dport 53 -j ACCEPT
sudo iptables -I OUTPUT 1 -p udp -d 208.67.220.220 --dport 53 -j ACCEPT
sudo iptables -I OUTPUT 1 -p tcp -d 208.67.220.220 --dport 53 -j ACCEPT
sudo iptables -I OUTPUT 1 -p udp -d 127.0.0.1 --dport 53 -j ACCEPT
sudo iptables -I OUTPUT 1 -p tcp -d 127.0.0.1 --dport 53 -j ACCEPT

The options are pretty much self-explanatory (-I insert, 1 at the first position, -p protocol, -d destination, -j jump target). Since every rule is inserted at position 1, the ACCEPT rules added last end up above the REJECT rules. To verify that the rules work, try iptables -nvL; it shows how many packets matched each rule.
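If the list of allowed resolvers changes, typing the rules by hand gets old. Here is a rough sketch that only prints the iptables commands in the same insert-at-position-1 style, so they can be reviewed first. The function name dns_lockdown_rules is mine, not an iptables feature.

```shell
# dns_lockdown_rules RESOLVER...: print (do not run) the iptables commands.
# REJECT rules are printed first; because each later rule is inserted at
# position 1, the ACCEPT rules end up above them in the OUTPUT chain.
dns_lockdown_rules() {
    for proto in udp tcp; do
        echo "iptables -I OUTPUT 1 -p $proto --dport 53 -j REJECT"
    done
    for ip in "$@"; do
        for proto in udp tcp; do
            echo "iptables -I OUTPUT 1 -p $proto -d $ip --dport 53 -j ACCEPT"
        done
    done
}

# Usage: review the output, then feed it to a root shell:
#   dns_lockdown_rules 208.67.222.222 208.67.220.220 127.0.0.1 | sudo sh
```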

I will probably block opendns.org configuration too, or use the blindfold password trick with that!

More possibilities

I could’ve tried dansguardian with squid. But somehow, it looked like an overuse of system resources to stop a single person, and gosh I will have access to sudo /etc/rc.d/dansguardian stop. Whatever!

Actually it looks like Opera has a kiosk mode where you can specify a filter to block all websites. I use this for a simple adblock strategy. I also had thoughts around writing a script to fetch Shalla’s blacklists and append them to urlfilter.ini for Opera. The problem is that with a few thousand websites in the filter, it will definitely be a pain later for normal browsing.

That’s the current setup. Fighting with /me is fun ;)

Input hotplugging and usb mouse

After losing a few matches on FICS on time control, poor me realized that it’s the mouse which matters when you’ve 10-20 seconds to mate. So I got myself a USB mouse and we’ve some new adventure.

OK, plugged in the USB mouse and checked that lsusb recognizes it. A dmesg suggested only a generic “low speed” USB device plug-in. I did an X -configure but it didn’t show my USB mouse :( A few web pages suggested some cool ways to check if your mouse is recognized:

  • Do a cat /proc/bus/input/devices and look for your mouse. Get mouse model from lsusb
  • If there are multiple mice showing up in /dev/input/* then do a cat /dev/input/mice and try moving your mouse, if it’s recognized you’d see random output on screen

None of these worked for me. I gave up on xorg.conf. Let’s try upgrading to Xorg input hotplugging and see if HAL does some magic. So I added hal and dbus to rc.conf and pacman’ed xf86-input-evdev. Hotplugging can replace/complement your xorg.conf; all inputs are added at runtime, real hotness! Well, my mouse was still not working.

After pulling out hair after hair, I finally figured out the whole story (yep! I still have a few hairs left :)). Although the kernel was recognizing my USB device, it couldn’t know that it was a mouse. Hence there was no device file created for the mouse in /dev/input. The missing link was the usbhid (USB Human Interface Device) module. A quick modprobe usbhid and we’re golden. The clever /me had disabled MOD_AUTOLOAD in rc.conf and listed by hand every module my system should load (optimization!), and usbhid was not on that list.
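In hindsight, a quick check like the following would have saved a few hairs. It is only a sketch: module_loaded is a hypothetical helper, and it assumes the usual /proc/modules format where each line starts with the module name followed by a space.

```shell
# module_loaded NAME [FILE]: succeed if NAME is listed in /proc/modules.
# FILE can be overridden only so the check can be tried on sample data.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Usage:
#   module_loaded usbhid || sudo modprobe usbhid
```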

Anyways hotplugging is super awesome. I enabled scroll on my touchpad this time, configuration is real easy and human readable. E.g. to disable the caps lock key, just add the following option to /etc/hal/fdi/policy/10-keymap.fdi: <merge key="input.xkb.options" type="string">ctrl:nocaps</merge>. Thanks to the arch wiki.

Update: .xmodmap causes problems with multimedia keys. Evdev is intelligent enough to guess the right names for all key codes, so commenting out .xmodmap worked for me. Most of the issues are caused because X detects two keyboards (thanks HAL); /var/log/Xorg.0.log is your friend. My current setup doesn’t use an xorg.conf, everything is detected by HAL, and things work OOB.

Good day!

Ignore channel join/part messages in irssi

Add the following block into ~/.irssi/config:

ignores = (
{
level = "JOINS PARTS QUITS";
channels = ( "#archlinux", "#vim", "#powershell" );
network = "FreeNode";
},
{
level = "MODES";
channels = ( "&bitlbee" );
network = "BitlBee";
}
);

This will automatically ignore the listed message levels in the channels specified. If you prefer the manual way of ignoring messages, try /help ignore. On a related note, activity_hide_level is a setting which can be used to disable the notification in the activity bar for specific message levels. Take a look at this nice irssi config file for examples of these.

Here is a list of various message levels for reference (you can find it in /help levels):

Message levels (or in short, levels) are used almost everywhere. They describe what kind of messages we're dealing with. Here's a list of them all:

CRAP - Can be almost anything
MSGS - Private messages
PUBLIC - Public messages in channel
NOTICES - Notices
SNOTES - Server notices
CTCPS - CTCP messages
ACTIONS - Actions (/me) - usually ORed with PUBLIC or MSGS
JOINS - Someone joins a channel
PARTS - Someone parts a channel
QUITS - Someone quits IRC
KICKS - Someone gets kicked from channel
MODES - Channel mode is changed
TOPICS - Channel topic is changed
WALLOPS - Wallop is received
INVITES - Invite is received
NICKS - Someone changes nick
DCC - DCC related messages
DCCMSGS - DCC chat messages
CLIENTNOTICE - Irssi's notices
CLIENTERROR - Irssi's error messages
CLIENTCRAP - Some other messages from Irssi

And a few special ones that could be included with the levels above:

HILIGHT - Text is highlighted
NOHILIGHT - Don't check highlighting for this message
NO_ACT - Don't trigger channel activity when printing this message
NEVER - Never ignore or log this message

Debugging Procmail Recipes

Three steps to nirvana:

  • Modify mail fetch frequency in .fetchmailrc to a small value. E.g. In my .fetchmailrc, I forced fetchmail to query server every 30s:
    set daemon 30
  • Set the LOGFILE and LOGABSTRACT variables in .procmailrc. In one line: LOGFILE specifies where procmail logs the recipes it processed, and LOGABSTRACT controls the verbosity of the information logged. So we modify .procmailrc to:

    LOGFILE=$HOME/.logs/procmaillog
    LOGABSTRACT=all
  • Disable the catch-all recipe!
  • Still stuck, use printfs :-) In recipes, put in some echo "foo" >> ~/mydebug.log to check control flow.
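One way to cut down on the real test mails, sketched under the assumption that procmail accepts variable assignments and an rcfile on its command line (see man procmail): generate a message locally and pipe it straight in. make_test_mail is a made-up helper and the addresses are placeholders.

```shell
# make_test_mail FROM TO SUBJECT: print a minimal message on stdout.
# (Hypothetical helper, for illustration only.)
make_test_mail() {
    printf 'From: %s\nTo: %s\nSubject: %s\n\nThis is a test.\n' "$1" "$2" "$3"
}

# Usage (assumes procmail is installed; DEFAULT=/dev/null throws away mail
# that no recipe catches, but recipes that do match still deliver for real):
#   make_test_mail foo@example.com me@gmail.com hello |
#       procmail VERBOSE=on DEFAULT=/dev/null "$HOME/.procmailrc"
```

Then watch the LOGFILE set above to see which recipe fired.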

BTW, I’m sure there are better ways to do this, /me had to send around 7-8 test mails to fix an issue! Feel free to share your recipe-debugging-tips.

A wrapper for osd_cat

A few months ago, we discussed a mail notification method for the fetchmail/procmail and mutt combo. As you can see in that post, the script we wrote to handle notification was dependent on ratpoison. Well, that makes life a bit tough for my non-ratpoison X sessions. I thought it would be a good idea to make the script window manager independent. Enter osd.sh

The script uses osd_cat, which comes in the XOSD package on most unices. It is similar to the cat command for the console. It can read a file, string or the stdin and output the text onto the X display. It has a bunch of config options which let you choose font, location etc. for the text. I’d recommend an RTM (man osd_cat).

Here is the code:

#!/bin/bash
# Displays a string on the screen
# Last Modified: Mon 22 Jun 2009 02:59:55 AM IST

# let osd know, we have a X running. there will be problems if this script is
# called from different user. disable acl in X in that case(use xhost +).
export DISPLAY=":0.0"

color="red"
font="-*-dina-medium-r-normal-*-16-*-*-*-*-*-*-*"
age="6"
align="center"
delay="4"
indent="0"
lines="5"
offset="0"
shadow="1"
pos="middle"

# read one line from stdin if no argument is given
if [ $# -ne 1 ]
then
    read -r text
else
    text=$1
fi

# quote everything: the font pattern contains * characters
echo "$text" | osd_cat --color="$color" --delay="$delay" --age="$age" --font="$font" \
    --offset="$offset" --shadow="$shadow" --lines="$lines" --align="$align" --pos="$pos"

This script can be used for any notification purpose. To use it with procmail, put the following construct in procmailrc:

:0 h
| grep "From:" | /home/arun/bin/osd.sh

This will display alerts as “From: Foo Bar ” when a new mail is received.

Enjoy!

C# and VIM

Let’s explore some of the tweaks to improve C# experience in Vim. All of the tips in this article can be applied to .Net Framework/Windows and Mono/Linux combination.

  • Code folding

    The default syntax mode code folding doesn’t play well with C#. It allows folding of only the code between the #region tags. Alternatively, we can use the braces in the source code as fold markers to make vim aware of which parts of the code to fold. Put the following snippet in your vimrc:
    " Folding : http://vim.wikia.com/wiki/Syntax-based_folding, see comment by Ostrygen
    au FileType cs set omnifunc=syntaxcomplete#Complete
    au FileType cs set foldmethod=marker
    au FileType cs set foldmarker={,}
    au FileType cs set foldtext=substitute(getline(v:foldstart),'{.*','{...}','')
    au FileType cs set foldlevelstart=2
     

  • Code browsing

    This is yet another important functionality when it comes to dealing with bulky libraries. We will use the awesome tool: Exuberant Ctags. With ctags we will scan the code base and create a tags file. And then we will make vim aware of the tags to help us autocomplete. Let’s start by scanning all c# files in d:\myproduct directory and creating the tag file d:\mytagfile.

    In a powershell window, navigate to the ctags installation directory and type in:
    .\ctags.exe --recurse -f d:\mytagfile --exclude="bin" --extra=+fq --fields=+ianmzS --c#-kinds=cimnp d:\myproduct

    For details on the various options used above, please see the ctags man page.

    Next, let’s make vim aware of our tag file. In your vimrc, include the following line:
    set tags=d:\mytagfile

    Done. Now let’s try out the navigation:
    Place the cursor on any identifier in the source code and press Ctrl-]. This will take you to the definition of that identifier. Sometimes there can be more than one place where an identifier is defined (in different source contexts, of course). In such cases, use the shortcut g] to see a list of matching locations for a particular identifier. Vim will show you all matching identifiers and ask you where you want to navigate.

  • Code completion

    Vim supports a number of word completion mechanisms (see :help completion). I’ll touch upon the relevant ones for C# (or general programming for that matter).  These keys are applicable in Insert or Replace mode.

    • Tags based completion: Ctrl-x Ctrl-]
      Here vim will suggest you various words based on the tag file included.
    • Current and included files: Ctrl-x Ctrl-i
      Vim will search for relevant words on all included files and the source file.
    • Omni completion: Ctrl-x Ctrl-o
      Here vim will guess what will be the word based on the context (Type, member etc..). If you’ve noticed, we defined the omnifunc for C# in Code folding section:
      au FileType cs set omnifunc=syntaxcomplete#Complete

      This will tell vim to guess the words based on C# syntax.

    Tag completion in VIM

  • Quickfix

    Here’s a small tip to make vim recognize the error messages thrown by the commandline build tool msbuild. Put this in your vimrc:
    " Quickfix mode: command line msbuild error format
    au FileType cs set errorformat=\ %#%f(%l\\\,%c):\ error\ CS%n:\ %m

    Can you figure out a way to build C# projects in the commandline? (Hints: Vim :help make, search online for msbuild, devenv /help).

Happy Vimming :)

Global command in VIM

Well, you guessed it right: the “global” command (:g in command mode) selects all lines in the file that match a particular pattern and lets you run a command on them. The :v variant works like grep -v: it selects all lines which do not match the pattern.

Syntax: :g/pattern/command where pattern is any regular expression, command is any vim command

Here’s the scenario which taught me the global command:
I have a file with ~29K lines. All lines must begin with “\\”. However, some lines are broken into two, in which case the lines which do not start with “\\” need to be appended to the previous line.
Sample text:
\\foo\bar
\\food
bar
\\x\y
\z

Should be converted to:
\\foo\bar
\\foodbar
\\x\y\z

And we solved it this way:
:v/^\\\\/exe "normal i\<C-H>\<ESC>"

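All we did was enter insert mode and execute a backspace (C-H) at the start of every line that does not match the pattern, which joins that line to the previous one (this relies on the 'backspace' option including eol).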
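To cross-check the result outside Vim, the same join can be sketched with awk. This is not what I ran on the 29K-line file, just an equivalent for comparison; join_broken_lines is a made-up name.

```shell
# join_broken_lines: read text on stdin; any line that does not start with
# two backslashes is glued onto the end of the previous line.
join_broken_lines() {
    awk '
        /^\\\\/ { if (NR > 1) print buf; buf = $0; next }  # new record starts
                { buf = buf $0 }                            # continuation line
        END     { if (NR > 0) print buf }
    '
}

# Usage:
#   join_broken_lines < broken.txt > fixed.txt
```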

Let’s end the post with a theorem:
Statement: VI is perfect
Proof: VI in roman numerals is 6. The natural numbers less than 6 which
divide 6 are 1, 2, and 3. 1 + 2 + 3 = 6. So 6 is a perfect number. Therefore, vi
is perfect.
– Arthur Tateishi

And a corollary
VIM in roman numerals might be: (1000 – (5 + 1)) = 994, which happens to be
equal to 2*496+2. 496 is divisible by 1, 2, 4, 8, 16, 31, 62, 124, and 248 and
1+2+4+8+16+31+62+124+248 = 496. So, 496 is a perfect number. Therefore, vim is
twice as perfect as vi, plus a couple extra bits of goodies. :-)

That is, vim is better than perfect.
– Nathan T. Oelger
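For the skeptics, the arithmetic in both proofs can be checked with a few lines of shell. A minimal sketch using plain trial division; the function names are mine, and the variables are globals to keep it POSIX.

```shell
# sum_proper_divisors N: print the sum of all divisors of N smaller than N.
sum_proper_divisors() {
    n=$1 sum=0 d=1
    while [ "$d" -le $((n / 2)) ]; do
        if [ $((n % d)) -eq 0 ]; then
            sum=$((sum + d))
        fi
        d=$((d + 1))
    done
    echo "$sum"
}

# is_perfect N: succeed when N equals the sum of its proper divisors.
is_perfect() {
    [ "$(sum_proper_divisors "$1")" -eq "$1" ]
}

# Usage:
#   is_perfect 6 && echo "vi is perfect"
#   is_perfect 496 && echo "and so is half of vim, nearly"
```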

Have an awesome 2009!

New mail notification with Procmail

Problem: I have my local mail setup with mutt, fetchmail and procmail, and I need a way to run an arbitrary script when new mail arrives and _not_ lose the mail from my inbox.

Solution: Procmail supports a nice feature called nesting. With nested blocks, a procmail recipe can be assigned more than one action.
e.g: Here’s a recipe to filter all my email from gmail:
:0:
* ^To:.*gmail\.com
{
:0 c
$MAILDIR

:0 h
| grep "From:" | ~/src/_scripts/ratpoison/newmail.sh
}

And the newmail.sh script
#!/bin/bash
read from
ratpoison -c "echo Mail $from"

Discussion: Here’s the breakup

  • * ^To:.*gmail\.com - Catch all mails which have their To: field containing gmail.com
  • Nesting: specify the actions inside { .. }
  • :0 c - Copy the mail into $MAILDIR
  • :0 h – Pipe the header into grep. And pipe the output (“From:*”) into newmail script
  • In newmail.sh script, read the piped content into variable $from and display that in ratpoison message bar
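One caveat with a plain grep for From: is that it can match more than one header line (an X-Original-From:, say), while read only picks up the first. A small sketch that keeps just the first real From: header; first_from_header is a hypothetical helper, not something from the recipe above.

```shell
# first_from_header: read a mail header on stdin and print only the first
# line that starts with "From:", then stop reading.
first_from_header() {
    sed -n '/^From:/{p;q;}'
}

# Usage inside a notification script:
#   from=$(first_from_header)
```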

If your window manager has a notification tray, you may be interested in this. New to procmail, go here.

/me is having a bit of an overdose of cookbooks these days ;)

[Update] Download youtube videos in linux

I had posted long back about a bash wrapper over youtube-dl to fetch youtube videos using your own download manager. The youtube-dl script has changed a lot since then. And the good news is that now we don’t need to change the script in any way to get the download url. So here we go!

Grab the latest youtube-dl from here
Minor update to the linux youtube-dl script:
#!/bin/bash
if [ $# -ne 2 ]
then
    echo "Usage : $0 <video-url> <output-name>"
    echo "e.g : $0 http://www.youtube.com/watch?v=D1R-jKKp3NA steve_jobs"
else
    outputfile=".avi"
    todnload=$(youtube-dl -g "$1") # youtube-dl is in path, right?
    echo "Got the file..$todnload"
    axel -n 100 "$todnload" -o "$2.flv" # wget -c "$todnload" -O "$2.flv" if you don't use axel
    echo "Download Completed..."
    ffmpeg -i "$2.flv" "$2$outputfile" # get the avi file
fi

Enjoy :)
Thanks Jay for pointing this out!

How to choose a web host

Lemme shout it aloud: I am a big fan of free webhosts :) That is, until recently, when one of my friends made me realize (brainwashed!) that I had probably had my share of the freebies :P And it’s been long since I last did something php-ish, so the thought of getting my hands dirty with web programming made me take this bad decision of getting a host.
And dude, choosing the right host _is_ much tougher than expected. It ate up some tens of my sleep-hours and a few man-weekends :( Finally, yesterday I gave up (damn it! set IDontCare=true) on my FindBestHost greedy approach and purchased a hosting package.
A few suggestions for those looking out for webhosts:

  1. Apply filters
    Define your purpose. Why do you need a web host? What kind of websites do you plan to host? Any special requirements? How much space and bandwidth do you think you’ll use?
    As an example, here are my answers: I need a webhost to host a blog (pleaseeeee not one more!). Mostly the website will be personal and will host my dev work and some projects. Since I am a self-proclaimed web hobbyist, I would need flexibility in terms of scripts (RoR?). At max, I will put 5-10 domains in the web space. Considering a GB per site, that will take up at least 5GB, and assuming a moderate crowd uses my web resources, let’s allow 40GB per domain per month, which makes bandwidth around 200GB per month for 5 domains.
  2. The search begins!
    Use your favorite search engine. Ask around. SetGTalkStatus/Twitter/Spam/Scrap/Write-on-wall your contacts. If they can refer, your hard job is done. Don’t waste time on more search. Jump to the next step.
    In case you’re shy to ask people, you can look in some good webhosting forums (e.g: WebHostingTalk). They’re good. Lots of webhost guys roam around there. I got around 14k results for a simple search query “recommend host” :P I think first 3-4 pages should suffice. This is a good starting place as well.
  3. Test the host!
    At this time, we should have a shortlist of 4-5 hosts. Now for each host try this:

    • Go to Webhosting.info. Search for the webhost. Have a good look at the number of domains coming into the host and going out each week. You can also see a country-wise breakup of top hosts. For me the most important criteria were the domain in/out analysis and total domains hosted.
    • Do a “You suck” test. Query your favorite search engine for negative reviews of the host, e.g. “[webhost] sucks”. Safely ignore the [Web Host Name]reviews.com (e.g: [TheBigCheapAndBestWebHost]Reviews.com) websites. A good choice would be to query in Blog Search.
    • Consider having a chat with the sales/support people at the webhosting live chat. Sometimes the time they make you wait will immediately give you a feel of their customer service :)
  4. Beware of these!
    • The “foobar” offers. Before you shell out cash on *any* offer that gives you substantial discounts, make sure you’re aware of their normal (non-foobar) pricing. Unless you plan to change host every billing cycle, you will eventually be charged the normal price. E.g: A webhost offers 4.95USD as foobar price and normal pricing is 7.95USD.
      And I never understood the rationale behind the *.95USD pricing rule.
    • Overselling. I was really astounded at the number of companies offering unlimited resources. So I went around bugging their sales/support guys on this. They say you can’t store personal files (does it translate to “all your files must be web accessible”?). I doubt they have shared hosting customers who distribute legal 1gig files! I wish I could copy/paste the chats I had with these guys :P Clearly these people are overselling. They’ll put your site on some server with probably a few hundred more poor domains. IMO unlimited bandwidth can make sense for people who publish streaming content. But does the resource usage policy of these shared hosts allow you to stream enough?
      I would suggest taking a limited resource plan. I agree we all love greedy algos, but let’s truly have an estimate of the resource usage before getting lured away by such offers.
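The back-of-the-envelope estimate from step 1 is easy to redo for your own numbers. A tiny sketch; estimate_hosting is a made-up name and the inputs are whatever you assume for your sites.

```shell
# estimate_hosting DOMAINS GB_PER_SITE BANDWIDTH_GB_PER_DOMAIN_PER_MONTH
# Prints total disk space and monthly bandwidth for the given assumptions.
estimate_hosting() {
    echo "storage: $(( $1 * $2 )) GB, bandwidth: $(( $1 * $3 )) GB/month"
}

# Usage, with my low-end figures (5 domains, 1GB each, 40GB/month each):
#   estimate_hosting 5 1 40
```

With 10 domains instead of 5 the same assumptions double both figures, which is worth knowing before picking a plan.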

Hope this might save a few minutes while deciding for a webhost. Good luck!