FREE Cloud Computing testbed for Python Apps

This is so cool... Google are beta-testing a totally free hosting and cloud-computing resource called Google App Engine. The caveat is that your hosted app must be written in Python. Python is amazing anyway, and if you don’t know it, now is the perfect time to learn. Check this out for more information about Google App Engine: http://code.google.com/appengine/docs/whatisgoogleappengine.html They’re giving away a very generous 500MB of disk space and enough processing power to serve 5 million pages a month. Awesome!

November 22, 2008 · 1 min · David Craddock

Bacula Scheduling

Bacula is a great open-source distributed backup program for Linux/UNIX systems. It is separated into three main components:

One ‘Director’, which sends messages to the other components and co-ordinates the backup.
One or more ‘File Daemons’, which ‘pull’ the data from the host they are installed on.
One or more ‘Storage Daemons’, which ‘push’ the data taken from the File Daemons into some type of archival storage, e.g. backup tapes, a backup hard disc, etc.

I found it extremely versatile yet very complicated to configure. Before you configure it you have to decide on a backup strategy: what you want to back up, why you want to back it up, how often you want to back it up, and how you are going to take the backups off-site and preserve them. ...
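
For flavour, scheduling lives in a Schedule resource in the Director’s configuration (bacula-dir.conf). The fragment below is a sketch in the style of Bacula’s own stock example, not taken from the original post; the schedule name and times are illustrative:

```
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05              # full backup on the first Sunday of the month
  Run = Differential 2nd-5th sun at 23:05  # differentials on the remaining Sundays
  Run = Incremental mon-sat at 23:05       # incrementals every other night
}
```

A Job resource then references the schedule by name, so one cycle definition can drive backups for many hosts.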

October 29, 2008 · 4 min · David Craddock

Linux under Hyper-V

This is an overview of current Linux support under Hyper-V, the free virtualisation product for Windows Server 2008. As you probably know, virtual servers allow the emulation of hardware in software. You have a single physical machine running the ‘virtual server’, which emulates physical hardware for several ‘virtual machines’ sitting on top of it. As far as the operating system on a virtual machine is concerned, it doesn’t notice anything different at all - it thinks it is running on a full set of dedicated hardware. In reality, however, the virtual server is sharing its real physical resources amongst the collection of virtual machines, assigning, for example, 3GB of its memory to virtual machine A and 1GB to virtual machine B. ...

October 9, 2008 · 2 min · David Craddock

Stanford Engineering for Everyone

The Stanford engineering department, often regarded as one of the best in the world for computer science education, has made its core CS curriculum free for anyone with an internet connection. There are some catches - e.g. you don’t get your assignments marked and you have no contact with the lecturers - but all the same, it is a really great resource. The material is very high quality: professionally filmed lectures and a full complement of handouts and course notes. It does not even assume knowledge of programming - it teaches you right from the basics. ...

September 22, 2008 · 1 min · David Craddock

Automated Emails on Committing to a Subversion Repository Using Python

At work I’ve written a couple of scripts that send out emails to the appropriate project team when someone commits to the project’s Subversion repository. Here are the details. Firstly, you will need a Subversion hook set up on post-commit. The post-commit hook needs to be located in SVNROOT/YOURPROJECT/hooks, where YOURPROJECT is your svn project name and SVNROOT is the root directory where you are storing the data files for your Subversion repository. ...
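
As a rough illustration of the approach, the sketch below shows what such a post-commit hook can look like. It is not the post’s actual script: the recipient list, addresses, and SMTP host are assumptions, and Subversion invokes the hook as `post-commit REPOS REV`.

```python
# Hypothetical sketch of a post-commit notification hook.
# Subversion runs this as: post-commit REPOS REV
import smtplib
import subprocess
import sys
from email.mime.text import MIMEText

SMTP_HOST = "localhost"            # assumed local mail relay
RECIPIENTS = ["team@example.com"]  # hypothetical project team list

def svnlook(subcommand, repos, rev):
    """Ask `svnlook` about the repository at the given revision."""
    result = subprocess.run(
        ["svnlook", subcommand, repos, "-r", rev],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def format_commit_email(rev, author, log, changed):
    """Build the notification email from commit details (pure function)."""
    body = ("Author: %s\n\nLog message:\n%s\n\nChanged paths:\n%s\n"
            % (author, log, changed))
    msg = MIMEText(body)
    msg["Subject"] = "[svn] r%s committed by %s" % (rev, author)
    msg["From"] = "svn@example.com"
    msg["To"] = ", ".join(RECIPIENTS)
    return msg

def main(repos, rev):
    msg = format_commit_email(
        rev,
        svnlook("author", repos, rev),
        svnlook("log", repos, rev),
        svnlook("changed", repos, rev),
    )
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.sendmail(msg["From"], RECIPIENTS, msg.as_string())

if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Keeping the email formatting in its own function makes the hook easy to test without a repository or mail server.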

September 22, 2008 · 4 min · David Craddock

Scraping Wikipedia Information for music artists, Part 2

I’ve abandoned the previous Wikipedia scraping approach for Brightonsound.com, as it was unreliable and didn’t pinpoint the right Wikipedia entry - e.g. a band called ‘Horses’ would pull up the Wikipedia article on the animal - which doesn’t look very professional. So instead, I have used the MusicBrainz API to retrieve some information on each artist: the homepage URL, the correct Wikipedia entry, and any genres/terms the artist has been tagged with. ...
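
The gist of the lookup is pulling URL relations and tags out of a MusicBrainz artist document. The sketch below illustrates that step only; the XML fragment is hand-made to show the shape of the data (the real web service uses a namespaced format, and the band and URLs are hypothetical):

```python
# Hypothetical sketch: extracting the Wikipedia URL, homepage, and tags
# from a MusicBrainz-style artist document. The sample XML is hand-made
# for illustration, not a real API response.
import xml.etree.ElementTree as ET

SAMPLE = """\
<artist name="Horses">
  <relation-list target-type="url">
    <relation type="wikipedia" target="http://en.wikipedia.org/wiki/Horses_(band)"/>
    <relation type="official homepage" target="http://www.horses-the-band.example"/>
  </relation-list>
  <tag-list>
    <tag>indie</tag>
    <tag>rock</tag>
  </tag-list>
</artist>
"""

def extract_artist_info(xml_text):
    """Return (wikipedia_url, homepage_url, tags) from an artist document."""
    root = ET.fromstring(xml_text)
    urls = {}
    # Collect every URL relation, keyed by its relation type.
    for rel in root.findall(".//relation-list[@target-type='url']/relation"):
        urls[rel.get("type")] = rel.get("target")
    tags = [t.text for t in root.findall(".//tag-list/tag")]
    return urls.get("wikipedia"), urls.get("official homepage"), tags
```

Because the relation is typed, ‘Horses’ the band can never resolve to the article about the animal - which was the whole problem with the scraping approach.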

September 22, 2008 · 3 min · David Craddock

Character encoding fix with PHP, MySQL 5 and ubuntu-server

For some reason, under ubuntu-server, my default MySQL 5 character encoding was latin1. This caused no end of problems with grabbing data from the web, which was not necessarily in the latin1 character set. If you are ever in this situation, I suggest you handle everything as UTF-8. That means setting the following lines in my.cnf:

[mysqld]
..
default-character-set=utf8
skip-character-set-client-handshake

If you already have tables in your database that you have created, and they have defaulted to the latin1 charset, you’ll be able to tell by looking at the mysqldump SQL: ...
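
For tables that already defaulted to latin1, one common fix (an assumption on my part, not quoted from the post - the table and database names are hypothetical, and you should take a dump first) is to convert them in place:

```sql
-- Convert an existing table and its text columns to utf8.
ALTER TABLE events CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Make the database default to utf8 for future tables as well.
ALTER DATABASE mydb DEFAULT CHARACTER SET utf8;
```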

July 6, 2008 · 1 min · David Craddock

Scraping artist bios off of Wikipedia

I’ve been hacking away at BrightonSound.com, looking for a way of automatically sourcing biographical information about artists, so that visitors are presented with more information on each event. The Songbird media player plugin ‘mashTape’ draws upon a number of web services to grab the artist bio, event listings, YouTube videos and Flickr pictures of the currently playing artist. I was reading through the mashTape code, and then found this posting by its developer, which helpfully provided the exact method I needed. ...

June 18, 2008 · 2 min · David Craddock

adExcellence Exam passed

I passed the adExcellence exam first time... woo! It wasn’t that difficult really. “David Craddock of iCrossing is accredited as an official Microsoft adExcellence Member. A Microsoft adExcellence Member has completed comprehensive online training on managing Microsoft adCenter search engine marketing campaigns and has demonstrated expert knowledge by passing the Microsoft adExcellence accreditation exam.” As of 21/3/08, I’m somehow also now #1 on Google.co.uk for the keyword “adExcellence exam”... if that’s what you googled for, you probably want the adExcellence main site instead. Or use Live Search.

March 21, 2008 · 1 min · David Craddock

Yahoo! Pipes

I have just seen Yahoo! Pipes, and am convinced this is going to change the web. For real. Data-source sites will become ‘content providers’; data will be aggregated and filtered from multiple content providers, either by the user or by ‘intermediary’ sites. The user will be able to choose his ‘data view’ of the content on the internet, just as Google is currently doing. This is fascinating stuff if you’re involved in the web industry.

March 17, 2008 · 1 min · David Craddock