This Site Powered by FileMaker and FX.php 

FMWebschool

Clean HTML Code - Save Money for You and Your Users

April 18th, 2007 by Michael Petrov

[Image: generic web statistics. Using this post, you can lower the red bars.]
We have all seen pages whose HTML code is messy, unoptimized, and horrible to look at. This post will not discuss or complain about coding style, or whether a page compares to MySpace or Geocities. I will address only the benefits of reducing the size of that code during your project design, with very little work and no rewriting of the current code structure. I hope that after reading this article you will have an urge to run and clean up your code to make your website faster, keep your users happier, and get your hosting company out of your hair!

Let's take a step-by-step look at what causes oversized HTML code:

  • Extra whitespace at the beginning and end of your lines, which is often useful for keeping indented code readable during design
  • Extra spaces with no content between tags; browsers treat any number of consecutive spaces as a single space unless they are within a <pre></pre> block

Consequences of oversized HTML code:

  • Increased page loading time; processing will often stop to flush the content printed so far to the user
  • Increased waiting time for users, largely due to the time it takes to download the HTML code. While most of us no longer have dial-up, 10KB can still make a noticeable difference for a web page
  • Increased server load; connections that are kept open longer for the reasons above slow down the whole server, allowing fewer users to be connected at the same time
  • Increased loading time for users, since the browser has to parse more code (a very minor factor, but our attention spans are so short that even a few milliseconds can occasionally make the difference between a sale for you and a sale for your competitor)
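
To put the dial-up remark in numbers, here is a rough back-of-the-envelope sketch. The transferSeconds() helper and the 56 kbit/s link speed are illustrative assumptions, and the calculation ignores latency, protocol overhead, and compression, so treat the result as an order of magnitude only.

```php
<?php
// Hypothetical helper (not part of the original article):
// seconds needed to move $bytes over a link of $bitsPerSecond.
function transferSeconds($bytes, $bitsPerSecond) {
    return ($bytes * 8) / $bitsPerSecond;
}

$extraBytes = 10 * 1024; // the 10KB mentioned above
$dialUp     = 56000;     // assumed nominal 56 kbit/s modem

printf("%.2f seconds\n", transferSeconds($extraBytes, $dialUp)); // prints "1.46 seconds"
```

Under these assumptions, 10KB of avoidable whitespace costs a dial-up user roughly a second and a half per page.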

Cleaning Up Excess HTML Code - Automatic and Painless

As promised at the beginning of the article, we will not suggest that you rewrite your HTML structure and spend hours adjusting everything, although doing so can be beneficial in the long run (converting tables into floating divs, for example, will be covered in a future article). Instead, I would like to present a simple method that cleans up your HTML code in a post-processing step, allowing you to keep all the whitespace that you use at design time for organization!

  • Step 1: turn on output buffering
    • Add <?php ob_start(); ?> as the first line of your code; it must come before any output is sent to the browser
    • From this point on, output is not sent to the user immediately; it is collected in an internal buffer instead
  • Step 2: define the cleanup function
    • This should be added after all your content processing and output functions
    • Add the following code to define the cleanHTML($content) function:
    • function cleanHTML($content) {
      $start = stripos($content, '<html'); // Find where the <html tag starts; everything before it is the <!DOCTYPE>
      if ($start === false) { $start = 0; } // No <html tag found: process the whole content
      $top = substr($content, 0, $start); // Set the doctype aside so it is not processed
      $content = substr($content, $start); // Keep the content without the doctype for processing
      $content = preg_replace('#^[ \t\n\r]+#m', '', $content); // Remove any leading whitespace at the beginning of lines
      $content = preg_replace('#>[\n\r]+<#', '><', $content); // Remove linebreaks that separate tags
      $content = preg_replace('#>[\t ]+<#', '> <', $content); // Collapse runs of spaces or tabs between tags into a single space
      return $top.$content;
      }
  • Step 3: fetch the output buffer and output the clean version of it
    • $content = ob_get_clean(); // Returns the buffered output and discards the buffer
    • echo cleanHTML($content); // Outputs the clean HTML content to the user
  • Warnings, Limitations, and Improvement Ideas
    • This code will generally not work well with content inside <pre></pre> tags, since they render whitespace literally. The code above can be improved, at the cost of extra processing, to ignore data within <pre> tags. Please feel free to make this modification and post about it in the comments.
    • You can define the cleanHTML function in a common include file, such as the server_data.php file for FX.php users, or in any other file that is processed on every page that you want to clean up.
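
As one possible take on the <pre> improvement mentioned above, here is a minimal sketch that walks through the three steps while leaving <pre> blocks untouched. The names stripTagWhitespace() and cleanHTMLKeepPre() are hypothetical, the sample page is made up, and for brevity the sketch skips the doctype split that cleanHTML() performs. It has not been tested against nested or malformed <pre> tags.

```php
<?php
// Hypothetical helper: the same three whitespace rules as cleanHTML(),
// applied to a single fragment of the page.
function stripTagWhitespace($fragment) {
    $fragment = preg_replace('#^[ \t\n\r]+#m', '', $fragment); // leading whitespace on each line
    $fragment = preg_replace('#>[\n\r]+<#', '><', $fragment);  // linebreaks between tags
    $fragment = preg_replace('#>[\t ]+<#', '> <', $fragment);  // runs of spaces/tabs between tags
    return $fragment;
}

// Hypothetical helper: clean everything except <pre>...</pre> blocks.
function cleanHTMLKeepPre($content) {
    // Split on <pre> blocks and keep them in the result array.
    $parts = preg_split('#(<pre\b[^>]*>.*?</pre>)#is', $content, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        // Even indexes are text outside <pre>; odd indexes are the captured <pre> blocks.
        if ($i % 2 === 0) {
            $parts[$i] = stripTagWhitespace($part);
        }
    }
    return implode('', $parts);
}

// Made-up sample page standing in for your normal output.
$page = <<<'HTML'
<html>
  <body>
    <p>Hi</p>
    <pre>
  keep   this
</pre>
  </body>
</html>
HTML;

ob_start();                          // Step 1: start buffering
echo $page;                          // your usual page output goes here
$content = ob_get_clean();           // Step 3: fetch the buffer and discard it
$clean = cleanHTMLKeepPre($content); // Step 2's cleanup, skipping <pre> blocks
echo $clean;
```

The extra preg_split() pass is the "extra processing" cost: every page is scanned once more to find <pre> blocks, whether or not it contains any.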

Concluding Thoughts and Ideas

We hope this quick article will be useful to all of you when designing sites in the future. It takes minimal effort to save valuable bandwidth for your users and to allow more users to access your server. Attached to this post is a sample file that takes the HTML code of our Blackbelt CDML Engine Demo Site and applies the technique above to clean it up. A few extra lines were added to measure the impact of the HTML cleanup; feel free to use the measuring code to see how well it works on your site (6.83% savings in our case!).
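
The measuring code from the sample file is not reproduced here, but a measurement of this kind can be sketched in a few lines. The savingsPercent() helper and the sample strings are hypothetical; on a real page you would pass in the buffered output and the result of cleanHTML($content).

```php
<?php
// Hypothetical helper (not taken from the sample file):
// percentage of bytes saved by the cleanup.
function savingsPercent($original, $cleaned) {
    $before = strlen($original);
    if ($before === 0) {
        return 0.0; // nothing to save on an empty page
    }
    return (($before - strlen($cleaned)) / $before) * 100;
}

// Made-up example: an indented list (44 bytes) and its cleaned form (33 bytes).
$original = "<ul>\n    <li>One</li>\n    <li>Two</li>\n</ul>";
$cleaned  = "<ul><li>One</li><li>Two</li></ul>";

printf("Saved %.2f%%\n", savingsPercent($original, $cleaned)); // prints "Saved 25.00%"
```

A tiny, heavily indented snippet like this exaggerates the effect; on a full page the savings will usually be much smaller, as with the 6.83% reported above.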

Thanks for checking out this post - if you like the content, don’t forget to subscribe to this blog using your favorite RSS reader!

- Michael Petrov

Download sample file

