May 31, 2011

C# HTML Minification

Questions:
- Is there any code that takes the HTML source of a page as input and minifies it?
- A solution for text compression without the use of GZip for...
- Minifies given HTML or XML source by removing extra whitespace, comments, and other unneeded characters without breaking the content structure
- Minify HTML (or XHTML), and any CSS or JS included in your markup
- Compressing and minifying HTML markup

Solution:
I hope this helps.
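/// Solution A (requires: using System.Text.RegularExpressions;)
html = Regex.Replace(html, @"\n|\t", " ");           // newlines and tabs become spaces
html = Regex.Replace(html, @">\s+<", "><").Trim();   // drop whitespace between adjacent tags
html = Regex.Replace(html, @"\s{2,}", " ");          // collapse remaining whitespace runs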
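
To see what Solution A does to a small page fragment, the three replacements can be wrapped in a helper. This is only a sketch: the class name `HtmlMinifierSketch` and the method name `MinifyA` are hypothetical and are not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the three Solution A replacements.
public static class HtmlMinifierSketch
{
    public static string MinifyA(string html)
    {
        // Newlines and tabs become single spaces.
        html = Regex.Replace(html, @"\n|\t", " ");
        // Whitespace between a closing '>' and the next '<' is dropped.
        html = Regex.Replace(html, @">\s+<", "><").Trim();
        // Any remaining run of two or more whitespace characters becomes one space.
        html = Regex.Replace(html, @"\s{2,}", " ");
        return html;
    }
}
```

Calling `HtmlMinifierSketch.MinifyA("<div>\n\t<p>Hello,   world</p>\n</div>\n")` returns `<div><p>Hello, world</p></div>`. Note that whitespace inside elements such as `<pre>` is collapsed as well, so this approach suits markup that does not rely on preformatted text.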

/// Solution B
// [\s\S] matches any character, including newlines.
html = Regex.Replace(html, @"(?<=[\s\S])\t{2,}|(?<=[>])\s{2,}(?=[<])|(?<=[>])\s{2,11}(?=[<])|(?=[\n])\s{2,}", "");
html = Regex.Replace(html, @"[ \f\r\t\v]?([\n\xFE\xFF/{}[\];,<>*%&|^!~?:=])[\f\r\t\v]?", "$1");
html = html.Replace(";\n", ";");

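/// Solution C (CSS rules, such as the contents of a <style> block)
html = Regex.Replace(html, @"[a-zA-Z]+#", "#");              // div#id -> #id
html = Regex.Replace(html, @"[\n\r]+\s*", string.Empty);     // remove line breaks and indentation
html = Regex.Replace(html, @"\s+", " ");                     // collapse whitespace runs
html = Regex.Replace(html, @"\s?([:,;{}])\s?", "$1");        // no spaces around : , ; { }
html = html.Replace(";}", "}");                              // drop the last semicolon in a block
html = Regex.Replace(html, @"([\s:]0)(px|pt|%|em)", "$1");   // 0px -> 0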

/// Remove /* ... */ comments (CSS and JavaScript style; HTML <!-- --> comments are not affected)
html = Regex.Replace(html, @"/\*[\d\D]*?\*/", string.Empty);
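
The Solution C replacements and the comment-removal step read most naturally as a stylesheet minifier, so the sketch below applies them to a CSS string. The class name `CssMinifierSketch` and the method name `MinifyCss` are hypothetical, and running the comment removal first is an assumption about ordering rather than something the snippets above prescribe.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper combining comment removal with the Solution C patterns.
public static class CssMinifierSketch
{
    public static string MinifyCss(string css)
    {
        // Assumption: strip /* ... */ comments before touching whitespace.
        css = Regex.Replace(css, @"/\*[\d\D]*?\*/", string.Empty);
        // Drop element names in front of id selectors (div#main -> #main).
        css = Regex.Replace(css, @"[a-zA-Z]+#", "#");
        // Remove line breaks together with any indentation that follows them.
        css = Regex.Replace(css, @"[\n\r]+\s*", string.Empty);
        // Collapse remaining whitespace runs to a single space.
        css = Regex.Replace(css, @"\s+", " ");
        // Remove one optional space on either side of : , ; { }.
        css = Regex.Replace(css, @"\s?([:,;{}])\s?", "$1");
        // The last semicolon before a closing brace is optional.
        css = css.Replace(";}", "}");
        // Zero values do not need a unit (0px -> 0).
        css = Regex.Replace(css, @"([\s:]0)(px|pt|%|em)", "$1");
        return css;
    }
}
```

For the input `"/* layout */\ndiv#main {\n    margin: 0px;\n    color: red;\n}\n"`, `CssMinifierSketch.MinifyCss` returns `#main{margin:0;color:red}`. The `[a-zA-Z]+#` pattern also rewrites any other letters followed by `#`, such as a URL fragment like `page#top`, so it is safest on plain stylesheet rules.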

References:
- Is there a better approach to minify html generated from aspx page
- Efficient stylesheet minification in C# 
- Follow up to Additional CSS minifying regex patterns
- Improve ASP.NET Performance - CSSmin
