Tag Archives: XML

New iGoogle Themes

I’ve been a regular iGoogle user for the last year and a half and the themes they added were a nice touch, but it gets even better: they have now released 5 more themes for you to try.

It’s a bit tricky to add them but not impossible:

All you have to do is go to iGoogle and type one of the following in the address bar:
javascript:_dlsetp('preview_skin=skins/planets.xml');
javascript:_dlsetp('preview_skin=skins/autumns.xml');
javascript:_dlsetp('preview_skin=skins/hongkong.xml');
javascript:_dlsetp('preview_skin=skins/jr.xml');
javascript:_dlsetp('preview_skin=skins/tiger.xml');
...and then choose “Select a Theme For This Tab” and click on “Save”, but don’t select any theme.

[via Mashable]

Music 2.0: Pitchfork

I love my music, I really do. The problem, however, is that I have far too much of it (2980 songs, 16 GB currently, not counting audiobooks etc.) to carry around with me all the time. It doesn’t fit on my iPod Mini, nor on my notebook (I know, shame on me, but I need the disk space for work too). So I always have to carry around a selection of songs, and far too often the one song I want to listen to is on my fileserver, and transferring it to my notebook takes so long that I’m no longer in the mood to listen to it…
Pitchfork is the solution to all of my problems, well, almost all… Pitchfork allows me to stream my music over my local network, and over the internet, with a nice and responsive interface that makes it easy to get at my songs.

Installing MPD

Since Pitchfork is based upon MPD, which will take care of streaming the data to the various clients and maintain the song database, we will first have to set it up. Sadly my distro (OpenSuSe 10.2) doesn’t have a version of MPD in its package management, so I had to download the sources and compile it myself:
wget http://www.musicpd.org/uploads/files/mpd-0.12.2.tar.bz2
tar -xvjf mpd-0.12.2.tar.bz2
cd mpd-0.12.2/
Now we have the sources, and we have to check that we have all the dependencies. Depending on how you are going to use MPD you have to have the following packages (along with their devel packages):
  • libshout: because we will be streaming the songs to an Icecast server
  • oss: as far as I know this is the most widely used and best supported audio output. If it doesn’t work take a look here
  • libmad: for mp3 support
  • libogg: for OGG Vorbis support
  • libvorbis: for Metadata support for Vorbis formats
  • libid3tag: Metadata support for mp3.
Quite a long list, but this is about the most essential stuff; if you have special needs take a look at the official dependency listing.
Now it’s time to get to the actual compilation:
./configure
make
sudo make install
After the configure step, you’ll see a list of options that are turned on or off. If something is missing, it probably means that you are missing some development packages.
So now we have MPD installed; the next step is to configure it. First you have to copy the example configuration to the correct location:
sudo cp doc/mpdconf.example /etc/mpd.conf
Then we edit it according to our needs; in particular the locations of the songs, log files and database should be changed:
music_directory "/storage/Music/"
playlist_directory "/storage/Music/playlists"
db_file "/var/lib/mpd.db"
log_file "/var/log/mpd.log"
error_file "/var/log/mpd.error"
pid_file "/var/run/mpd.pid"
The pid_file has to be uncommented since we will run MPD in daemon mode, and we want to shut it down later. Now let’s create these files and change the owners:
sudo touch /var/lib/mpd.db /var/log/mpd.log /var/log/mpd.error /var/run/mpd.pid
sudo chown mpd /var/lib/mpd.db /var/log/mpd.log /var/log/mpd.error /var/run/mpd.pid
“mpd” being the user that will later run the daemon (if you don’t have an mpd user refer to your distro documentation on how to create one. It is really important that the mpd user is in the sour). Next we have to set some more settings:
user                            "mpd"
audio_output {
        type                    "oss"
        name                    "Direct OSS output"
}
audio_output {
        type                    "shout"
        name                    "Icecast Stream"
        host                    "localhost"
        port                    "8000"
        mount                   "/mpd.ogg"
        password                "youricecastpassword"
        bitrate                 "96"
        format                  "44100:16:1"
}
mixer_type                      "oss"
mixer_device                    "/dev/mixer"
mixer_control                   "PCM"
mixer_type                      "software"

Initializing the database:

mpd --create-db
This will insert all your songs into the database for faster access. Now MPD is set up and should work. Try with
sudo mpd
MPD will complain that it can’t connect to the Icecast server (which we’ll set up a few lines below), but everything else should work fine.

Installing Icecast

As I said, my music resides on a fileserver, and I want the songs to be streamed directly to my other machines, so what’s better than Icecast, an open source streaming solution used by many internet radio stations out in the wild? Luckily most distros have it in their package management (I know openSuSe has :D ), so it shouldn’t be too hard to get it up and running. Just make sure that the source password in /etc/icecast.xml matches the password you specified above in the /etc/mpd.conf file.

Installing Pitchfork

Installing Pitchfork is pretty straightforward. All you need is this:
  • PHP 5.1.3 or newer
  • PHP-Pear
  • DOM2 capable browser (firefox, konqueror, opera, safari, etc)
Please refer to your distro documentation to see how to set these up. My setup uses an Apache 2 server with mod_php, and Firefox 2 on my notebook. Now to the installation itself: depending on your distribution the document root of your webserver is /var/www/htdocs or /srv/www/htdocs. Change to that directory and do the following:
wget http://pitchfork.remiss.org/files/pitchfork-0.5.2.tar.bz2
tar -xvjf pitchfork-0.5.2.tar.bz2
cd pitchfork-0.5.2/
chmod a+rwx config/
The rest of the configuration is done through the browser: just point your browser to http://localhost/pitchfork-0.5.2/ if you installed Pitchfork on the computer you’re currently on, or replace localhost with the IP of your server. Change whatever you’d like to change (it should not be necessary, as the important configuration is done in MPD itself), then press Save and you’ll be taken to the Pitchfork interface. Voila, it’s all done :D

Installing mpdscribble (optional)

I’m a huge fan of Last.fm, but sadly I can’t update my profile using the Player, because the songs are streamed to it. So why not use the MPD to update them for me? For this purpose I will be using mpdscribble:
wget http://www.frob.nl/projects/scribble/mpdscribble-0.2.12.tar.gz
tar -xvzf mpdscribble-0.2.12.tar.gz
cd mpdscribble-0.2.12
./configure
make 
sudo make install
Now create the directory for the basic configuration:
mkdir ~/.mpdscribble
It is suggested that mpdscribble is run as your user since it has read access to your username and password hash. Edit the configuration file ~/.mpdscribble/mpdscribble.conf as follows:
username = "lastfm_username"
password = "md5sum of lastfm password"
The md5sum can be found using either this tool or the command line md5sum tool:
echo -n "your_lastfm_password" | md5sum
And that’s it, mpdscribble is set up and ready to be run:
mpdscribble &
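
One detail worth knowing: when md5sum reads from standard input it prints the digest followed by a dash, and only the digest belongs in the configuration file. The following is a minimal sketch of how both steps could be scripted. The function names lastfm_md5 and write_mpdscribble_conf are made up for this example and are not part of mpdscribble; the two-line file format is the one shown above.

```shell
# Hypothetical helper: print only the hex digest of a password,
# dropping the trailing "  -" that md5sum adds when reading stdin.
lastfm_md5() {
    printf '%s' "$1" | md5sum | cut -d ' ' -f 1
}

# Hypothetical helper: write a config in the two-line format shown above.
# $1 = Last.fm username, $2 = plain-text password, $3 = output file
write_mpdscribble_conf() {
    printf 'username = "%s"\npassword = "%s"\n' "$1" "$(lastfm_md5 "$2")" > "$3"
    # The file holds a password hash, so keep it readable by the owner only.
    chmod 600 "$3"
}
```

It could then be called as write_mpdscribble_conf lastfm_username 'your_lastfm_password' ~/.mpdscribble/mpdscribble.conf. Keep in mind that a password typed on the command line may end up in your shell history.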


Google Sitemaps

Last year I was excited when Google announced the release of their Sitemaps protocol, which helps search engines index content more efficiently.
Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site. Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.
And today Microsoft and Yahoo are jumping on the train too: they announced that they will be supporting the protocol. More information at sitemaps.org.
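
To give an idea of what such a file looks like, here is a minimal sketch that prints a Sitemap for a list of URLs. The element names and the namespace are those of the protocol published at sitemaps.org; the function name make_sitemap, the example.com URLs and the changefreq and priority values are placeholders chosen for illustration.

```shell
# Print a minimal Sitemap for the URLs given as arguments.
# Only <loc> is required per URL; changefreq and priority are optional
# hints, and the values below are placeholders.
# Note: URLs containing characters such as & must be entity-escaped
# before being passed in; this sketch does not do that.
make_sitemap() {
    printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
    printf '%s\n' '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    for url in "$@"; do
        printf '%s\n' '  <url>'
        printf '    <loc>%s</loc>\n' "$url"
        printf '%s\n' '    <changefreq>weekly</changefreq>'
        printf '%s\n' '    <priority>0.5</priority>'
        printf '%s\n' '  </url>'
    done
    printf '%s\n' '</urlset>'
}

make_sitemap http://www.example.com/ http://www.example.com/about.html
```

Redirecting the output to a file named sitemap.xml in the document root would be the usual next step.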

Fjax? What’s that about?

Ajaxian has a post covering the much-hyped Fjax alternative to Ajax. Of course the major flaw I see in Fjax is the F (which is the only “new” thing in Fjax anyway): Flash. I personally hate Flash, not because it isn’t powerful or because you can’t do a lot of things with it, but, let’s be honest, Flash scales horribly. Ajax became so popular because it requires no extra plugins, add-ons or whatever you may call them; it just runs out of the box. If we now switch over to a Flash-based solution we lose one of the major drivers in Ajax’s short history. As Ajaxian puts it:
Jay and Steve McDonald didn’t like traditional Ajax (the libraries, the XML parsing (even though you can use JSON of course)) and decided that they could create “smoother, more desktop-like web experiences that AJAX promises”, as they said in an interview at juxtaviews.com. What is Fjax? Website: “Fjax is an open, lightweight, cross-browser methodology for Ajax-style web 2.0 development Fjax is a technique focused on drastically streamlining the XML handling layer of web 2.0 applications. Picture Ajax’s XML parsing and handling with less than 65 lines of code! It’s not a replacement for toolsets that provide presentation-layer visual gizmos. Think of it as a new engine to put under the hood of all the great widgets that are already out there.”
Now, the picture I paint may be a bit too dark, but we are moving away from the REST architecture, service modelling, and other great things Ajax has brought to us, exactly like the Comet stuff some months ago. I’ll give Fjax a chance to prove itself, but right now it has no killer feature that might justify the hype about it.

Optimizing Page Load Time

Aaron Hopkins of Google has released an article on Optimizing Page Load Time, which came out of his experience optimizing page load times for a high-profile Ajax application. Clearly Google developers are among the best known in the Web 2.0 movement, and the tips Aaron gives might just squeeze the last few milliseconds out of your application and improve the user experience. He discusses both the sources of problems and their solutions at length:
  • Turn on HTTP keepalives for external objects. Otherwise you add an extra round-trip for every HTTP request. If you are worried about hitting global server connection limits, set the keepalive timeout to something short, like 5-10 seconds. Also look into serving your static content from a different webserver than your dynamic content. Having thousands of connections open to a stripped down static file webserver can happen in like 10 megs of RAM total, whereas your main webserver might easily eat 10 megs of RAM per connection.

  • Load fewer external objects. Figure out how to globally reference the same one or two javascript files and one or two external stylesheets instead of many; try preprocessing them when you publish them. If your UI uses dozens of tiny GIFs all over the place, consider switching to CSS, which tends to not need so many of these.

  • If your users regularly load a dozen or more uncached or uncacheable objects per page, consider evenly spreading those objects over four hostnames. This usually means your users can have 4x as many outstanding connections to you. Without HTTP pipelining, this results in their average latency dropping to about 1/4 of what it was before.

    When you generate a page, evenly spreading your images over four hostnames is most easily done with a hash function, like MD5. Rather than having all <img> tags load objects from http://static.example.com/, create four hostnames (e.g. static0.example.com, static1.example.com, static2.example.com, static3.example.com) and use two bits from an MD5 of the image path to choose which of the four hosts you reference in the <img> tag. Make sure all pages consistently reference the same hostname for the same image URL, or you’ll end up defeating caching.

    Beware that each additional hostname adds the overhead of an extra DNS lookup and an extra TCP three-way handshake. If your users have pipelining enabled or a given page loads fewer than around a dozen objects, they will see no benefit from the increased concurrency and the site may actually load more slowly. The benefits only become apparent on pages with larger numbers of objects. Be sure to measure the difference seen by your users if you implement this.

  • Possibly the best thing you can do to speed up pages for repeat visitors is to allow static images, stylesheets, and javascript to be unconditionally cached by the browser. This won’t help the first page load for a new user, but can substantially speed up subsequent ones.

    Set an Expires header on everything you can, with a date days or even months into the future. This tells the browser it is okay to not revalidate on every request, which can add latency of at least one round-trip per object per page load for no reason.

    Instead of relying on the browser to revalidate its cache, if you change an object, change its URL. One simple way to do this for static objects if you have staged pushes is to have the push process create a new directory named by the build number, and teach your site to always reference objects out of the current build’s base URL. (Instead of <img src="http://example.com/logo.gif"> you’d use <img src="http://example.com/build/1234/logo.gif">. When you do another build next week, all references change to <img src="http://example.com/build/1235/logo.gif">.) This also nicely solves problems with browsers sometimes caching things longer than they should – since the URL changed, they think it is a completely different object.

    If you conditionally gzip HTML, javascript, or CSS, you probably want to add a “Cache-Control: private” if you set an Expires header. This will prevent problems with caching by proxies that won’t understand that your gzipped content can’t be served to everyone. (The Vary header was designed to do this more elegantly, but you can’t use it because of IE brokenness.)

    For anything where you always serve the exact same content when given the same URL (e.g. static images), add “Cache-Control: public” to give proxies explicit permission to cache the result and serve it to different users. If a local cache has the content, it is likely to have much less latency than you; why not let it serve your static objects if it can?

    Avoid the use of query params in image URLs, etc. At least the Squid cache refuses to cache any URL containing a question mark by default. I’ve heard rumors that other things won’t cache those URLs at all, but I don’t have more information.

  • On pages where your users are often sent the exact same content over and over, such as your home page or RSS feeds, implementing conditional GETs can substantially improve response time and save server load and bandwidth in cases where the page hasn’t changed.

    When serving static files (including HTML) off of disk, most webservers will generate Last-Modified and/or ETag reply headers for you and make use of the corresponding If-Modified-Since and/or If-None-Match mechanisms on requests. But as soon as you add server-side includes, dynamic templating, or have code generating your content as it is served, you are usually on your own to implement these.

    The idea is pretty simple: When you generate a page, you give the browser a little extra information about exactly what was on the page you sent. When the browser asks for the same page again, it gives you this information back. If it matches what you were going to send, you know that the browser already has a copy and send a much smaller 304 (Not Modified) reply instead of the contents of the page again. And if you are clever about what information you include in an ETag, you can usually skip the most expensive database queries that would’ve gone into generating the page.

  • Minimize HTTP request size. Often cookies are set domain-wide, which means they are also unnecessarily sent by the browser with every image request from within that domain. What might’ve been a 400 byte request for an image could easily turn into 1000 bytes or more once you add the cookie headers. If you have a lot of uncached or uncacheable objects per page and big, domain-wide cookies, consider using a separate domain to host static content, and be sure to never set any cookies in it.

  • Minimize HTTP response size by enabling gzip compression for HTML and XML for browsers that support it. For example, the 17k document you are reading takes 90ms of the full downstream bandwidth of a user on 1.5Mbit DSL. Or it will take 37ms when compressed to 6.8k. That’s 53ms off of the full page load time for a simple change. If your HTML is bigger and more redundant, you’ll see an even greater improvement.

    If you are brave, you could also try to figure out which set of browsers will handle compressed Javascript properly. (Hint: IE4 through IE6 asks for its javascript compressed, then breaks badly if you send it that way.) Or look into Javascript obfuscators that strip out whitespace, comments, etc and usually get it down to 1/3 to 1/2 its original size.

  • Consider locating your small objects (or a mirror or cache of them) closer to your users in terms of network latency. For larger sites with a global reach, either use a commercial Content Delivery Network, or add a colo within 50ms of 80% of your users and use one of the many available methods for routing user requests to your colo nearest them.

  • Regularly use your site from a realistic net connection. Convincing the web developers on my project to use a “slow proxy” that simulates bad DSL in New Zealand (768Kbit down, 128Kbit up, 250ms RTT, 1% packet loss) rather than the gig ethernet a few milliseconds from the servers in the U.S. was a huge win. We found and fixed a number of usability and functional problems very quickly.

    To implement the slow proxy, I used the netem and HTB kernel modules available in the Linux 2.6 kernel, both of which are set up with the tc command line tool. These offer the most accurate simulation I could find, but are definitely not for the faint of heart. I’ve not used them, but supposedly Tamper Data for Firefox, Fiddler for Windows, and Charles for OSX can all rate-limit and are probably easier to set up, but they may not simulate latency properly.

  • Use Google’s Load Time Analyzer extension for Firefox from a realistic net connection to see a graphical timeline of what it is doing during a page load. This shows where Firefox has to wait for one HTTP request to complete before starting the next one and how page load time increases with each object loaded. The Tamper Data extension can offer similar data in less easy to interpret form. And the Safari team offers a tip on a hidden feature in their browser that offers some timing data too.

    Or if you are familiar with the HTTP protocol and TCP/IP at the packet level, you can watch what is going on using tcpdump, ngrep, or ethereal. These tools are indispensable for all sorts of network debugging.

  • (Optional) Petition browser vendors to turn on HTTP pipelining by default on new browsers. Doing so will remove some of the need for these tricks and make much of the web feel much faster for the average user. (Firefox has this disabled supposedly because some proxies, some load balancers, and some versions of IIS choke on pipelined requests. But Opera has found sufficient workarounds to enable pipelining by default. Why can’t other browsers do similarly?)
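
The hostname-spreading tip from the list above is easy to try out. The following is a minimal sketch, not Aaron’s implementation: it takes the first hex digit of the MD5 of the image path, keeps its low two bits, and uses the result to pick one of the four static hostnames. The function names pick_static_host and img_tag are made up for this example; the hostnames follow the staticN.example.com pattern from the tip.

```shell
# Pick one of four static hostnames for an image path, using two bits
# of the path's MD5 so that every page maps the same path to the same
# host (otherwise caching is defeated).
pick_static_host() {
    # First hex digit of the MD5 digest of the path (4 bits).
    first_digit=$(printf '%s' "$1" | md5sum | cut -c 1)
    # Keep only the low two bits: a value from 0 to 3.
    index=$(( 0x$first_digit & 3 ))
    printf 'static%d.example.com\n' "$index"
}

# Hypothetical helper: build the <img> tag for a given path.
img_tag() {
    printf '<img src="http://%s%s">\n' "$(pick_static_host "$1")" "$1"
}

img_tag /images/logo.gif
```

Because the choice depends only on the path, the same image always lands on the same hostname no matter which page references it.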

The last tip is, in my opinion, not a really good one: browser developers will most likely just ignore you, and you won’t be able to change the world. For the full analysis, with graphics and the full discussion, please see the entry by Aaron; it’s definitely one of the best reads of the last months, with all the technical details :)
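
The conditional GET tip is the one that generated pages most often miss, so here is a rough sketch of the decision in CGI style. It assumes the ETag can be computed cheaply (for instance from a build number or timestamp) before the page is rendered. The names conditional_reply and render_page are made up for this example, and the comparison is simplified: a real If-None-Match header may carry several ETags or weak validators.

```shell
# Decide between a full reply and a 304, CGI style.
# $1 = ETag of the current version of the page (cheap to compute)
# $2 = value of the request's If-None-Match header (may be empty)
# $3 = name of the command that renders the full page body
conditional_reply() {
    if [ -n "$2" ] && [ "$2" = "$1" ]; then
        # The browser already has this version: headers only, no body.
        printf 'Status: 304 Not Modified\nETag: %s\n\n' "$1"
    else
        printf 'Status: 200 OK\nETag: %s\n\n' "$1"
        # Only now do the expensive work of building the page.
        "$3"
    fi
}

# Hypothetical stand-in for the expensive page generation.
render_page() {
    printf '<html>expensive page</html>\n'
}

# First visit: no If-None-Match header, so the full page is sent.
conditional_reply '"build-1234"' '' render_page
# Repeat visit: the browser echoes the ETag back and gets a 304.
conditional_reply '"build-1234"' '"build-1234"' render_page
```

The point of checking before rendering is that the second call never runs render_page at all, which is where the savings in database queries come from.
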
[via Ajaxian]