file modification time

To get the access, modification, or change time of a file, one can use stat (format sequences %x, %y and %z respectively). For example:

$ stat -c %y ~/.bash_aliases
2010-02-20 17:12:31.161203715 +0100

Of course, this can easily be cut down to just the date:

$ stat -c %y ~/.bash_aliases | cut -d' ' -f1
2010-02-20
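
The same lookup can be done without parsing stat's text output. This is only a minimal Python sketch; the function name modification_date is made up for illustration, and the date is reported in local time, as stat does.

```python
import os
from datetime import datetime


def modification_date(path):
    """Return the file's modification date as YYYY-MM-DD (local time)."""
    mtime = os.stat(path).st_mtime  # seconds since the epoch
    return datetime.fromtimestamp(mtime).strftime("%Y-%m-%d")


if __name__ == "__main__":
    # Any existing path works; the current directory is just a stand-in.
    print(modification_date(os.curdir))
```

Swapping st_mtime for st_atime or st_ctime gives the access or change time instead.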

make arch ftw!

I really like using Makefiles to package my projects into .tar.gz files. And I really like those .tar.gz files where all the content is contained in a single top-level directory, named exactly like the archive file, so it's easy to delete all the files again after extracting into a crowded directory.

This can be easily achieved using something like this:

TITLE = music-quality-$(shell date +%Y%m%d)

arch:
	tar cvzf $(TITLE).tar.gz \
		--transform 's#^#$(TITLE)/#' \
		--exclude '*~' \
		--exclude 'plot*' \
		--exclude feature_pres \
		src/*.m

.PHONY: arch
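
For comparison, the "everything under one top-level directory" idea can be sketched with Python's tarfile module, where the arcname argument plays the role of --transform. The names make_archive and member_names and the demo file are invented for this sketch; it is not a replacement for the Makefile rule.

```python
import os
import tarfile
import tempfile


def make_archive(title, base_dir, rel_paths, dest_dir):
    """Pack rel_paths (relative to base_dir) into dest_dir/<title>.tar.gz,
    storing every member below a top-level <title>/ directory."""
    archive = os.path.join(dest_dir, title + ".tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        for rel in rel_paths:
            # arcname does what --transform 's#^#TITLE/#' does for tar
            tar.add(os.path.join(base_dir, rel), arcname=title + "/" + rel)
    return archive


def member_names(archive):
    """List the member names stored in a .tar.gz archive."""
    with tarfile.open(archive, "r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "src"))
        with open(os.path.join(tmp, "src", "a.m"), "w") as fh:
            fh.write("x = 1;\n")
        print(member_names(make_archive("demo-20100220", tmp, ["src/a.m"], tmp)))
```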

multiline sed

Suppose you have a PostScript file that repeatedly contains blocks like the following, with varying commands:

gsave
/Courier findfont 10 scalefont ISOEncode setfont
0.000 0.000 0.000 setrgbcolor AdjustColor
202 2.8421709430404e-14 [
[(0.4)]
] 12 -0.5 1 0 false DrawText

or

gsave
/Helvetica findfont 10 scalefont ISOEncode setfont
0.000 0.000 0.000 setrgbcolor AdjustColor
10 240 [
[(File: file1.wav   Page: 1 of 3   Printed: Fri Feb 19 18:32:26)]
] 14 -0 0 0 false DrawText
grestore

and you want to remove those blocks that match '(File' in the fifth line of input, you could try this:

sed -e '/gsave/ { N; N; N; N; /(File/ { N; N; d; } }'
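
To make it clearer what the sed one-liner does, here is the same idea as a rough Python sketch: on a gsave line, look at the fifth line of the block and, if it contains '(File', skip all seven lines. The function name drop_file_blocks and the shortened sample are made up, and the sample assumes the first block ends in grestore too. Unlike the sed version, which matches '(File' anywhere in the five collected lines, this checks the fifth line only.

```python
def drop_file_blocks(lines, marker="(File", block_len=7):
    """Drop every block_len-line block that starts at a 'gsave' line and
    has `marker` in its fifth line; all other lines pass through."""
    out = []
    i = 0
    while i < len(lines):
        block = lines[i:i + block_len]
        if "gsave" in lines[i] and len(block) >= 5 and marker in block[4]:
            i += block_len  # skip the whole block, like sed's 'd'
        else:
            out.append(lines[i])
            i += 1
    return out


# Shortened stand-in for the PostScript above; the first block is assumed
# to end in grestore as well.
SAMPLE = """\
gsave
/Courier findfont 10 scalefont ISOEncode setfont
0.000 0.000 0.000 setrgbcolor AdjustColor
202 2.8421709430404e-14 [
[(0.4)]
] 12 -0.5 1 0 false DrawText
grestore
gsave
/Helvetica findfont 10 scalefont ISOEncode setfont
0.000 0.000 0.000 setrgbcolor AdjustColor
10 240 [
[(File: file1.wav   Page: 1 of 3)]
] 14 -0 0 0 false DrawText
grestore
showpage
""".splitlines()

if __name__ == "__main__":
    print("\n".join(drop_file_blocks(SAMPLE)))
```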

server backup script

My server creates daily backups that are incremental relative to a monthly full backup. It keeps the last three daily backups as well as the last three monthly backups.

#!/bin/bash

# mru, feb 2010

# make backups.
#


DATE_D=$(date "+%Y%m%d")
DATE_M=$(date "+%Y%m")

BACKUPS=/var/backups
TARGET_D=${BACKUPS}/daily_backup_${DATE_D}.tar.gz
TARGET_M=${BACKUPS}/monthly_backup_${DATE_M}.tar.gz
RECEIVER=mru
TAROPTS=" --preserve-permission --label=$DATE_D"

FILES='
root
home
etc
boot
var/trac
var/www
var/mail
var/repositories
'

EXCLUDE_CLAUSE="
--exclude *~
--exclude ${BACKUPS}
--exclude root/.cpan
--exclude root/caldav
--exclude root/tmp
"

# on any error: report, remove the partial archive, then stop
trap '{
 echo "Failed. Stopping.";
 if [ -f "$TARGET" ]; then
   rm "$TARGET";
 fi
 exit 1;
}' ERR

# remove old files

OLD_D=$(find "$BACKUPS" -name 'daily_backup_????????.tar.gz' | sort -n | head -n -3)
OLD_M=$(find "$BACKUPS" -name 'monthly_backup_??????.tar.gz' | sort -n | head -n -3)
# join both lists, one path per line (plain concatenation would fuse two names)
OLD=$(printf '%s\n' $OLD_M $OLD_D)
echo "$OLD" | xargs --no-run-if-empty rm

# make monthly spinoff

if [ -f $TARGET_M ]; then
    TAROPTS+=" --newer $TARGET_M"
    TARGET=$TARGET_D
else
    TARGET=$TARGET_M
fi


# create archive
# chdir to / to get the right globbing
( cd / && tar czf "$TARGET" $TAROPTS $EXCLUDE_CLAUSE $FILES )


chmod og-r "$TARGET"

# send report

mail -s "Backup Message for ${HOSTNAME} $(date)" ${RECEIVER} <<EOF
Backup created.

Filename: $TARGET
size:     $(ls -lah $TARGET)

$(if [ -n "$OLD" ]; then echo "==== removed these files: $OLD"; fi)

==== archive content $TARGET:
$(tar tzf $TARGET)

==== folder content $BACKUPS:
$(find ${BACKUPS})
EOF
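
The two decisions the script makes (which old archives to drop, and whether today's run is a full or an incremental backup) can be sketched in isolation. This is just an illustration in Python working on plain file names; files_to_remove and choose_target are hypothetical helpers, not part of the script.

```python
import fnmatch


def files_to_remove(names, pattern, keep=3):
    """Return all names matching pattern except the `keep` newest ones."""
    matching = sorted(n for n in names if fnmatch.fnmatch(n, pattern))
    # The date is embedded as YYYYMMDD or YYYYMM, so sorted order is chronological.
    return matching[:-keep] if keep > 0 else matching


def choose_target(existing, date_d, date_m):
    """Pick a daily incremental archive if this month's full backup exists."""
    monthly = "monthly_backup_%s.tar.gz" % date_m
    if monthly in existing:
        return "daily_backup_%s.tar.gz" % date_d, "incremental"
    return monthly, "full"


if __name__ == "__main__":
    names = ["daily_backup_201002%02d.tar.gz" % d for d in range(16, 21)]
    names += ["monthly_backup_201001.tar.gz", "monthly_backup_201002.tar.gz"]
    print(files_to_remove(names, "daily_backup_????????.tar.gz"))
    print(choose_target(names, "20100221", "201002"))
```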

server status email

My server sends me an email about its status every day:

Key Fingerprint to Wikimedia Commons Image

It's been some time since my last serious Python code, so excuse the mess ;) This will take any key fingerprint, e.g. from ssh, in hexadecimal format and show an image from commons.wikimedia.org which represents the key. The script hashes the key and does a kind of table lookup. I still need to implement some double hashing for the case where the matching image has too large a time offset, to eliminate some collisions. The current implementation also can't really deal with deleted images, though I have already come up with some workarounds for that.

#!/usr/bin/python

# convert ssh key fingerprint to wikimedia commons image

import os
import sys
import commands

from datetime import datetime
from datetime import timedelta

import re

from xml.sax import saxutils
from xml.sax import make_parser
from xml.sax.handler import feature_namespaces


def convertMonth(ds):
    """ Replaces all written out months in a string with a xx number."""
    # don't depend on locales
    ds = ds.replace("January", "01") 
    ds = ds.replace("February", "02") 
    ds = ds.replace("March", "03") 
    ds = ds.replace("April", "04") 
    ds = ds.replace("May", "05") 
    ds = ds.replace("June", "06") 
    ds = ds.replace("July", "07") 
    ds = ds.replace("August", "08") 
    ds = ds.replace("September", "09") 
    ds = ds.replace("October", "10") 
    ds = ds.replace("November", "11") 
    ds = ds.replace("December", "12") 
    return ds



class ImageParser(saxutils.DefaultHandler):

    def __init__(self):
        self.inThumb = False
        self.dateElement = False
        self.thumbs = []

    def startElement(self, name, attrs):

        if self.dateElement:
            self.dateElement = False

        if name == 'div':
            if attrs.get('class', None) == 'thumb':
                self.inThumb = True

        if self.inThumb:
            if name == 'img':
                self.thumbs.append([attrs.get('src', None)])
            if name == 'i':
                self.dateElement = True 
                self.inThumb = False

    def characters(self, content):
        if self.dateElement:
            date = datetime.strptime(convertMonth(content), "%H:%M, %d %m %Y")
            self.thumbs[len(self.thumbs)-1].append(date)

    def endDocument(self):
        self.thumbList = []
        for l in self.thumbs:
            if l[0].find('http') != -1:
                if len(l) == 2:
                    self.thumbList.append(l)


class HashFunction:

    def __init__(self, startDate, duration, ticSize):
        """ Provides a Hash function, which takes any integer
        and returns a date within a given interval.

        Keyword Arguments:
        startDate -- the maximum date in the time interval, a datetime object
        duration  -- the positiv length of the timer interval, a deltatime object
        ticSize   -- the tic size in seconds is the unit in which the interval is divided
        """
        self.startDate = startDate
        self.duration = duration
        self.ticSize = ticSize

        seconds = duration.days*24*60*60 + duration.seconds
        self.numHashes = seconds / ticSize

    def h(self, x):
        """ Converts an Integer to a date in the configured period,
        based on a hashfunction.
        """
        h1 = x%self.numHashes
        second = h1 * self.ticSize
        return self.startDate - timedelta(0, second)
        
    def hFormat(self, x):
        """ Converts an Integer to a date in the configured period,
        based on a hashfunction. The date is formated string.
        """
        dt = self.h(x)
        return dt.strftime("%Y%m%d%H%M%S")

        
class KeyImage:

    def __init__(self, startDate=datetime(2010, 2, 1, 12, 00, 00), duration=timedelta(days=365*4), resolution=60):
        # defaults: a four-year window ending 2010-02-01 12:00, one hash slot per minute
        self.hf = HashFunction(startDate, duration, resolution)

    def getImage(self, x, size=-1):
        """ Converts an Integer value to an image url from wikimedia commons."""
        # hasing 
        imageDate = self.hf.hFormat(x)

        # download html file
        url = "'http://commons.wikimedia.org/w/index.php?title=Special:NewFiles&from=" + imageDate + "'"
        htmlFilename = "/tmp/commons_thumbs.html"
        commands.getstatusoutput("wget -E " + url + " -O " + htmlFilename)

        # parse xml file
        parser = make_parser()                  
        ip = ImageParser()
        parser.setFeature(feature_namespaces, 0)
        parser.setContentHandler(ip)
        file = open(htmlFilename)
        parser.parse(file)

        # remove temporary html file
        commands.getstatusoutput("rm " + htmlFilename)

        imageUrl = ip.thumbList[0][0]
        if (size > -1):
            imageUrl = re.sub("[0-9]{2,3}px", str(size) + "px", imageUrl)

        return imageUrl
        

def main():
    skey = sys.argv[1].replace(".","").replace(":","")
    try:
        key = int(skey, 16)
    except ValueError:
        sys.stderr.write("[ERROR] Invalid Key\n")
        sys.exit(1)

    thumbSize = -1
    if len(sys.argv) >= 3:
        try:
            thumbSize = int(sys.argv[2])
        except ValueError:
            sys.stderr.write("[ERROR] Invalid Size\n")
            sys.exit(1)

    ki = KeyImage()
    print ki.getImage(key, thumbSize)

main()

Here is a short wrapper script that downloads the image and displays it with feh.

#!/bin/bash

IMAGEURL=$(key_to_thumb.py "$@" | tail -1)
echo "$IMAGEURL"
TMPFILE="/tmp/kthumb.img"
wget "$IMAGEURL" -O "$TMPFILE" &> /dev/null
feh "$TMPFILE"
rm "$TMPFILE"
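
The hashing step of the Python script is easier to follow in isolation. The following is a small Python 3 sketch of the same arithmetic (key modulo the number of one-minute slots, counted backwards from the start date); the function name fingerprint_to_date is made up, and the defaults simply mirror the ones in KeyImage.

```python
from datetime import datetime, timedelta


def fingerprint_to_date(fingerprint,
                        start=datetime(2010, 2, 1, 12, 0, 0),
                        duration=timedelta(days=365 * 4),
                        tic_seconds=60):
    """Map a hex fingerprint to a timestamp within `duration` before `start`."""
    key = int(fingerprint.replace(":", "").replace(".", ""), 16)
    # number of distinct slots the interval is divided into
    num_slots = int(duration.total_seconds()) // tic_seconds
    slot = key % num_slots
    return start - timedelta(seconds=slot * tic_seconds)


if __name__ == "__main__":
    # A tiny made-up "fingerprint": key 0xff = 255, i.e. 255 minutes before start.
    print(fingerprint_to_date("ff").strftime("%Y%m%d%H%M%S"))
```

Two keys that differ by a multiple of the slot count land on the same timestamp, which is where the collisions mentioned above come from.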

Tor Network Status

This will pull a CSV file with data from all Tor nodes running in the network and print basic bandwidth statistics. I wrote this script to estimate whether adding more non-exit nodes to the Tor network could speed it up, or whether the exit nodes' bandwidth is the bottleneck anyway.

#!/bin/bash

wget http://torstatus.kgprog.com/query_export.php/Tor_query_EXPORT.csv -P/tmp &> /dev/null

BANDWITH=`cat /tmp/Tor_query_EXPORT.csv |sed 1d|cut -d, -f3`
EXIT=`cat /tmp/Tor_query_EXPORT.csv |sed 1d|cut -d, -f10`
rm /tmp/Tor_query_EXPORT.csv

NUMNODES=0
NUMEXITS=0

BEXIT=0
BNOEXIT=0
BTOTAL=0

while read -u3 bandw; read -u4 exit 
do 
    BTOTAL=$(($BTOTAL+$bandw))
    NUMNODES=$(($NUMNODES+1))

    if [ "$exit" = '1' ]
    then
        BEXIT=$(($BEXIT+$bandw)) 
        NUMEXITS=$(($NUMEXITS+1))
    else
        BNOEXIT=$(($BNOEXIT+$bandw)) 
    fi
done 3<<<"$BANDWITH" 4<<<"$EXIT"

echo "Active Tor Routers: $NUMNODES"
echo "Exit Nodes: $NUMEXITS ($((NUMEXITS*100/NUMNODES))%)"
echo "Bandwidth Exits: $BEXIT kB/s ($((BEXIT/1024)) MB/s)"
echo "Bandwidth rest:  $BNOEXIT kB/s ($((BNOEXIT/1024)) MB/s)"
echo "Bandwidth total: $BTOTAL kB/s ($((BTOTAL/1024)) MB/s)"
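
The aggregation itself, stripped of the download, might look like this in Python. It is only a sketch: bandwidth_summary is an invented name, the default column indices correspond to cut's -f3 and -f10, and the three-column sample data is made up rather than taken from the real export.

```python
import csv
import io


def bandwidth_summary(csv_text, bandwidth_col=2, exit_col=9):
    """Sum bandwidth for exit and non-exit rows of a CSV export with a header line."""
    rows = list(csv.reader(io.StringIO(csv_text)))[1:]  # skip header, like 'sed 1d'
    summary = {"nodes": 0, "exits": 0, "bw_exit": 0, "bw_rest": 0}
    for row in rows:
        if not row:  # ignore blank lines
            continue
        bw = int(row[bandwidth_col])
        summary["nodes"] += 1
        if row[exit_col] == "1":
            summary["exits"] += 1
            summary["bw_exit"] += bw
        else:
            summary["bw_rest"] += bw
    summary["bw_total"] = summary["bw_exit"] + summary["bw_rest"]
    return summary


if __name__ == "__main__":
    # Made-up data with a simplified column layout (name, bandwidth, exit flag).
    sample = "name,bw,exit\na,100,1\nb,50,0\nc,25,1\n"
    print(bandwidth_summary(sample, bandwidth_col=1, exit_col=2))
```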

Multiple VPN Tunnels, Gentoo Style

Let tunname be the specific VPN tunnel to be started, using /etc/openvpn/tunname.conf as the config file. Then the required initscripts are created and started as follows:
# creating 
ln -s /etc/init.d/openvpn /etc/init.d/tunname

# starting 
/etc/init.d/tunname start

However, since starting the default initscript (/etc/init.d/openvpn) will fail if no file named openvpn.conf is present, I'd strongly recommend removing it from the default runlevel ;)

This is caused by the init script itself, as it determines the config file solely from its own name. Therefore the default initscript (/etc/init.d/openvpn) will look for the aforementioned config file (/etc/openvpn/openvpn.conf) and report an error if it can't be found.

To start the tunnel on boot, the usual rc-update call (rc-update add tunname default) will do the trick.

Cheers!

favorite man pages

I wondered what my favorite man pages would be. A little awk'ing shows it:

26 man bash
9 man find
8 man sed
8 man mplayer
8 man itoa
8 man dot
8 man cpio
7 man sleep
6 man read
6 man maxima
5 man xterm
5 man wget
5 man bc
5 man awk
4 man transcode
4 man sprintf
4 man script
4 man scp
4 man pushd
4 man popd
4 man ncftp
4 man latexmk
4 man gimp
4 man basename
3 man sudoers
3 man grep
3 man gcal
3 man cut
2 man wc
2 man units
2 man tee
2 man shutdown
2 man sh
2 man set
2 man pdfcrop
2 man paste
2 man netstat
2 man join
2 man cat
2 man cal
1 man wodim
1 man watch
1 man uname
1 man trap
1 man tar
1 man tail
1 man select
1 man quote
1 man profile
1 man patch
1 man nc
1 man mplayer 
1 man lsof
1 man ln
1 man hexdump
1 man head
1 man git-tag
1 man git-gui 
1 man git-commit
1 man git
1 man getopt
1 man fdevopen
1 man emacs
1 man display
1 man dijkstra
1 man diff
1 man convert
1 man col
1 man cmp
1 man ImageMagick

This is the output of:

awk '/^man / { cmds[$0] ++ } END { for (i in cmds) print cmds[i] " " i }' .bash_history | sort -nr
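
Roughly the same count as a Python sketch, in case awk is not at hand. The function name count_man_calls is made up; like the awk version it counts whole history lines, so "man mplayer" with and without a trailing space are separate entries.

```python
from collections import Counter


def count_man_calls(history_lines):
    """Count history lines starting with 'man ', most frequent first."""
    counts = Counter(line.rstrip("\n") for line in history_lines
                     if line.startswith("man "))
    return counts.most_common()


if __name__ == "__main__":
    # Made-up history lines instead of a real .bash_history
    demo = ["man bash\n", "ls\n", "man find\n", "man bash\n"]
    for command, n in count_man_calls(demo):
        print(n, command)
```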

My bash history is configured as:

export HISTCONTROL=erasedups
export HISTSIZE=10000
shopt -s histappend

It would probably look a lot different if erasedups were not set.