Selective mirroring …

Welcome back,

Today I’m posting one of the solutions that really made me proud of myself and of the way I approach problems. I call it selective mirroring; my best friend calls it L2 caching 🙂

Let’s describe the problem: we have two servers and a 1.8TB library of audio and video content, and we want to offer both downloads and online streaming…

To cut the budget, we decided to serve the media over plain HTTP connections from ordinary web servers, so it’s not really streaming…

Put all the files on one server, and export them over NFS to the other server…

Run lighttpd on both servers, and you’re done. Or are you?

As the number of hits on the streaming server increases, network utilization goes sky-high, IO wait on the fileserver easily exceeds 90%, the load on the fileserver goes up to 30, and the audio streams start to break up and sometimes stop completely…

You’d say, come on, get real! We’re talking about two servers pushing over 300 Mbit/s of traffic, which is really huge…

So I came up with an idea: why not mirror the most frequently requested files locally on the streaming server, and serve the less frequently requested ones from NFS…

The implementation went through 3 stages before it evolved into the current best-fit version…

Stage 1:

Evaluate the files on the NAS server by file creation date and copy the most recent ones to the streaming server. Then create a new, empty folder structure and fill it with symbolic links pointing either to the local copies or to the remote NFS files. This had several limitations:

  1. Ranking by file creation date was totally inaccurate, because one of the most requested files may well be an old file; this turned out to be true for 90% of the top 10 requested files…
  2. This implementation only made it as far as a flow chart and a bash script

image
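The stage 1 idea can be sketched roughly as follows. This is a minimal Python illustration, not the original bash script (which isn’t shown here). The function names and directory arguments are made up for the example, and it ranks by modification time as a stand-in for the creation date.

```python
from pathlib import Path
from typing import List


def newest_files(nfs_root: Path, count: int) -> List[Path]:
    """Return the `count` most recently modified files, relative to nfs_root.

    Assumption: modification time stands in for "creation date".
    """
    files = [p for p in nfs_root.rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.relative_to(nfs_root) for p in files[:count]]


def build_link_tree(nfs_root: Path, local_root: Path, link_root: Path) -> None:
    """Recreate the NFS folder structure under link_root using symbolic links.

    Each link points at the local copy when one exists, otherwise at the
    file on the NFS mount.
    """
    for src in nfs_root.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(nfs_root)
        local_copy = local_root / rel
        target = local_copy if local_copy.is_file() else src
        link = link_root / rel
        link.parent.mkdir(parents=True, exist_ok=True)
        # Replace any link left over from a previous run.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(target.resolve())
```

The web server’s document root would then point at the link tree (`link_root` in this sketch), so visitors never see whether a file is local or remote.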

Stage 2:

Obtain a list of the most requested files by analyzing the access logs (the best candidate for this job was AWStats), then create a new, empty folder structure and fill it with symbolic links pointing either to local files or to remote files. This had 2 severe limitations, which came up when I tried running the script on the real server against the real files…

  1. Some files and folders have special characters, spaces, or Arabic letters in their names. This made the script go crazy: it was unable to parse the file names correctly, which produced bad file and folder names and made them inaccessible to the public
  2. Newly added files aren’t available until the next time the script runs and creates their symbolic links

The diagram for this stage looks pretty much the same as stage 1 except for the file evaluation part…

image
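To illustrate the ranking step and the file name problem, here is a small Python sketch that counts requests straight from an access log instead of going through AWStats. It assumes the common/combined log format (request line in double quotes, followed by the status code). The key detail is decoding the percent-encoded path, so names with spaces or Arabic letters come out as real file names. `count_requests` is a hypothetical helper, not part of any tool mentioned here.

```python
import re
from collections import Counter
from urllib.parse import unquote, urlsplit

# Assumed log format: ... "GET /path HTTP/1.1" 200 ...
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3})')


def count_requests(log_lines):
    """Count successful GET requests per decoded file path."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a GET request line, skip it
        raw_target, status = match.groups()
        # 206 is "Partial Content", which players use when seeking.
        if status not in ("200", "206"):
            continue
        # Drop the query string, then turn %20, %D8%B3... back into characters.
        path = unquote(urlsplit(raw_target).path)
        hits[path] += 1
    return hits
```

Calling `count_requests(log_lines).most_common(100)` would then give the 100 most requested paths, with spaces and non-ASCII names intact.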

Stage 3:

Gather statistics about the most frequently accessed files and copy them locally, then implement a dynamic module, program, or anything else that checks whether a requested file exists locally: if it does, serve it from local disk; if not, serve it from NFS. It seemed this would require writing a lighttpd module in C.

However, I suddenly remembered reading that lighttpd can handle some special response headers that let PHP code instruct lighttpd to serve a file, which frees the PHP process for another request…

This suggested that the implementation would have two parts

Part 1, implemented as a shell script with the aid of AWStats log files:

image 
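A rough Python sketch of what Part 1 does (the real thing is a shell script, which isn’t reproduced here). It takes a list of paths ranked from most to least requested and copies them from NFS to local disk. The byte budget and the cleanup of local copies that dropped out of the ranking are my assumptions for the example; all names are hypothetical.

```python
import shutil
from pathlib import Path


def refresh_mirror(ranked_paths, nfs_root, local_root, budget_bytes):
    """Mirror the hottest files locally and drop the ones no longer ranked.

    ranked_paths: request paths such as "/audio/x.mp3", hottest first.
    Returns the set of relative paths kept in the local mirror.
    """
    nfs_root, local_root = Path(nfs_root), Path(local_root)
    keep, used = set(), 0
    for request_path in ranked_paths:
        rel = Path(request_path.lstrip("/"))
        if ".." in rel.parts:
            continue  # never follow paths that climb out of the library
        src = nfs_root / rel
        if not src.is_file():
            continue
        size = src.stat().st_size
        if used + size > budget_bytes:
            continue  # does not fit, try the next (possibly smaller) file
        dst = local_root / rel
        if not dst.is_file() or dst.stat().st_size != size:
            dst.parent.mkdir(parents=True, exist_ok=True)
            # Copy to a temporary name first so a half-copied file is
            # never served, then rename it into place.
            tmp = dst.with_name(dst.name + ".part")
            shutil.copy2(src, tmp)
            tmp.replace(dst)
        keep.add(rel)
        used += size
    # Remove local copies that are no longer among the hottest files.
    for path in list(local_root.rglob("*")):
        if path.is_file() and path.relative_to(local_root) not in keep:
            path.unlink()
    return keep
```

Files are addressed by name only, with no link tree, so odd characters in file names stop being a problem.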

Part 2 implemented using PHP + Lighttpd + mod_rewrite:

image
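The per-request decision in Part 2 boils down to “local first, NFS second”. Here is a minimal sketch of that logic in Python; the real implementation is PHP behind lighttpd, so treat this as an illustration only. The two directory paths are placeholders, and the header name `X-Sendfile` is an assumption: check which header your lighttpd version expects and how to enable it.

```python
from pathlib import Path, PurePosixPath

LOCAL_ROOT = Path("/srv/mirror")        # placeholder: local copies of hot files
NFS_ROOT = Path("/mnt/nfs/library")     # placeholder: NFS mount of the library
SENDFILE_HEADER = "X-Sendfile"          # assumption: depends on the server setup


def resolve(request_path, local_root=LOCAL_ROOT, nfs_root=NFS_ROOT):
    """Return the file to serve: the local copy if present, else the NFS one."""
    rel = PurePosixPath(request_path.lstrip("/"))
    if not rel.parts or ".." in rel.parts:
        return None  # empty path or an attempt to escape the library
    for root in (local_root, nfs_root):
        candidate = Path(root).joinpath(*rel.parts)
        if candidate.is_file():
            return candidate
    return None


def make_app(local_root=LOCAL_ROOT, nfs_root=NFS_ROOT):
    """Build a WSGI app that answers with a send-file header and no body."""
    def application(environ, start_response):
        path = resolve(environ.get("PATH_INFO", ""), local_root, nfs_root)
        if path is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found\n"]
        # The web server sends the file itself; this process is free again.
        start_response("200 OK", [(SENDFILE_HEADER, str(path))])
        return [b""]
    return application
```

Because the check happens on every request, a newly added file is reachable through NFS right away and starts being served locally as soon as a local copy appears.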

The shell script only needs to run from time to time, let’s say every 30 days…

Part 2 is invoked every time a request for a file is received…

This had an instant effect: reduced load on the fileserver, reduced internal NFS traffic, and an improved end-user experience = happy servers, happy client, and certainly a happy system engineer 😀


