mod_rewrite – Redirect, change URLs, or redirect HTTP to HTTPS in Apache – Everything you ever wanted to know about mod_rewrite rules but were afraid to ask

mod_rewrite syntax order

mod_rewrite has specific ordering rules that affect processing. Before anything else happens, the RewriteEngine On directive must be given, because it turns on mod_rewrite processing. It should come before any other rewrite directive.

A RewriteCond preceding a RewriteRule makes that ONE rule subject to the condition. Any RewriteRules that follow are processed as if they were not subject to conditions.

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html

In this simple case, if the HTTP referrer is serverfault.com, blog requests are sent to special serverfault pages (we're just that special). However, if the above block had an additional RewriteRule line:

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg

All .jpg files would go to the special serverfault pages, not just those with a referrer indicating the request came from here. This is clearly not the intent of how these rules are written. It could be done with multiple RewriteCond lines:

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg

But it should probably be done with a slightly trickier replacement syntax.

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2

This more complex RewriteRule contains the conditions for processing. The last parenthesized group, (html|jpg), tells RewriteRule to match either html or jpg and to make the matched string available as $2 in the rewritten string. This is logically identical to the previous block with its two RewriteCond/RewriteRule pairs; it just does the job in two lines instead of four.
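To make the capture groups concrete, here is a hedged trace of the two-line version for a few request paths. The file names are hypothetical, and each trace assumes the referrer condition has already matched.

```apache
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)

# /blog/intro.html -> $1 = "intro", $2 = "html" -> /blog/intro.sf.html
# /blog/logo.jpg   -> $1 = "logo",  $2 = "jpg"  -> /blog/logo.sf.jpg
# /blog/notes.txt  -> no match, the URL is left unchanged
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
```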

Multiple RewriteCond lines are implicitly ANDed, and can be explicitly ORed. To handle referrers from both ServerFault and Super User (explicit OR):

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [OR]
RewriteCond %{HTTP_REFERER} ^https?://superuser\.com(/|$)
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2

To serve ServerFault-referred pages to Chrome browsers (implicit AND):

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Chrome.*$
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
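The two forms can be combined. As a sketch (my own combination of the two examples above, not one of the original rule sets), an [OR] ties a condition only to the line immediately after it, so the block below reads as (ServerFault OR Super User) AND Chrome:

```apache
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [OR]
RewriteCond %{HTTP_REFERER} ^https?://superuser\.com(/|$)
# No [OR] on the line above, so the next condition is ANDed with the pair
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Chrome.*$
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
```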

RewriteBase is also order specific, because it specifies how the RewriteRule directives that follow it handle their processing. It is very useful in .htaccess files. If used, it should be the first directive under "RewriteEngine On" in a .htaccess file. Take this example:

RewriteEngine On
RewriteBase /blog
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) $1.sf.$2

This tells mod_rewrite that the URL it is currently handling arrived by way of http://example.com/blog/ rather than the physical directory path (/home/$username/public_html/blog), and to treat it accordingly. Because of this, the RewriteRule considers its string start to be after the "/blog" in the URL. Here is the same thing written two different ways, one with RewriteBase and one without:

RewriteEngine On

## Example 1: No RewriteBase ##
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule /home/assdr/public_html/blog/(.*)\.(html|jpg) $1.sf.$2

## Example 2: With RewriteBase ##
RewriteBase /blog
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) $1.sf.$2

As you can see, RewriteBase allows rewrite rules to use the website path to the content rather than the web server's filesystem path, which can make them more intelligible to those who edit such files. It can also make the directives shorter, which has an aesthetic appeal.
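As a hedged sketch of what happens to a single request under the RewriteBase form, assume the .htaccess file lives in the blog directory and the request is for a hypothetical /blog/post.html with a ServerFault referrer. Per the Apache documentation, in a .htaccess file the pattern is matched against the path relative to that directory, and RewriteBase supplies the URL prefix that is put back in front of a relative substitution. The extra RewriteCond on REQUEST_URI and the trailing $ anchor are my own additions, there to keep the already rewritten name from matching again.

```apache
# .htaccess inside the blog directory (assumed location)
RewriteEngine On
RewriteBase /blog

# Guard (an addition for this sketch): skip names that already contain ".sf."
RewriteCond %{REQUEST_URI} !\.sf\.
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)

# Request:      /blog/post.html   (hypothetical)
# Pattern sees: post.html         -> $1 = "post", $2 = "html"
# Substitution: post.sf.html      (relative, so RewriteBase is prepended)
# Result:       /blog/post.sf.html
RewriteRule ^(.*)\.(html|jpg)$ $1.sf.$2
```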


RewriteRule Matching Syntax

RewriteRule itself has a complex syntax for matching strings. I will cover the flags (things like [PT]) in another section. Because sysadmins learn by example more often than by reading a man page, I will give examples and explain what they do.

RewriteRule ^/blog/(.*)$ /newblog/$1

The .* construct matches any single character (.) zero or more times (*). Enclosing it in parentheses tells mod_rewrite to provide the matched string as the $1 variable.

RewriteRule ^/blog/.*/(.*)$ /newblog/$1

In this case, the first .* was NOT enclosed in parentheses, so it is not provided to the rewritten string. This rule removes a directory level on the new blog site (/blog/2009/sample.html becomes /newblog/sample.html).

RewriteRule ^/blog/(2008|2009)/(.*)$ /newblog/$2

In this case, the first parenthesized expression sets up a matching group. It becomes $1, which is not needed and therefore is not used in the rewritten string.

RewriteRule ^/blog/(2008|2009)/(.*)$ /newblog/$1/$2

In this case, we use $1 in the rewritten string.

RewriteRule ^/blog/(20[0-9][0-9])/(.*)$ /newblog/$1/$2

This rule uses a special bracket syntax that specifies a character range. [0-9] matches the digits 0 through 9. This specific rule handles the years 2000 to 2099.

RewriteRule ^/blog/(20[0-9]{2})/(.*)$ /newblog/$1/$2

This does the same thing as the previous rule, but the {2} part tells it to match the preceding element (a bracket expression in this case) exactly twice.

RewriteRule ^/blog/([0-9]{4})/([a-z]*)\.html /newblog/$1/$2.shtml

This case matches any lowercase letter in the second group, and does so for as many characters as it can. The \. construct tells it to treat the period as a literal period, not as the special character it is in the previous examples. It will break if the file name contains dashes, though.

RewriteRule ^/blog/([0-9]{4})/([-a-z]*)\.html /newblog/$1/$2.shtml

This traps file names containing dashes. However, because - is a special character in bracket expressions, it has to be the first character in the expression.

RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml

This version traps any file name made up of letters, numbers, or the - character. This is how you specify multiple character sets in a bracket expression.
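As a quick, hedged check of the last pattern, here is how a few request paths would fare against it. The file names are invented for illustration.

```apache
# /blog/2009/my-first-post.html -> $1 = "2009", $2 = "my-first-post"
#                               -> /newblog/2009/my-first-post.shtml
# /blog/2009/Post2.html         -> $1 = "2009", $2 = "Post2"
#                               -> /newblog/2009/Post2.shtml
# /blog/09/post.html            -> no match: [0-9]{4} needs exactly four digits
# /blog/2009/my_post.html       -> no match: "_" is not in the bracket expression
RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml
```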


RewriteRule Flags

Flags on rewrite rules have a host of special meanings and use cases.

RewriteRule ^/blog/([0-9]{4})/([-a-z]*)\.html /newblog/$1/$2.shtml [L]

The flag is the [L] at the end of the expression above. Multiple flags can be used, separated by commas. The mod_rewrite documentation describes each one, but here they are anyway:

L = Last. Stop processing RewriteRules once this one matches. Order counts!
C = Chain. Continue processing with the next RewriteRule. If this rule does not match, the next rule will not be executed. More on this later.
E = Set environment variable. Apache has various environment variables that can affect web server behavior.
F = Forbidden. Returns a 403 Forbidden error if this rule matches.
G = Gone. Returns a 410 Gone error if this rule matches.
H = Handler. Forces the request to be handled by the specified content handler.
N = Next. Forces the rule set to start over and match again. BE CAREFUL! Loops can result.
NC = No case. Allows jpg to match both jpg and JPG.
NE = No escape. Prevents special characters (. ? # & etc.) from being rewritten into their hex-code equivalents.
NS = No subrequests. If you use server-side includes, this prevents matches against the included files.
P = Proxy. Forces the rule to be handled by mod_proxy. Transparently provides content from other servers, because your web server fetches it and re-serves it. This is a dangerous flag, because a poorly written rule will turn your web server into an open proxy, and that is bad.
PT = Pass through. Takes Alias statements into account in RewriteRule matching.
QSA = Query string append. When the original request contains a query string (http://example.com/thing?asp=foo), append it to the rewritten string. Normally it would be discarded. Important for dynamic content.
R = Redirect. Provide an HTTP redirect to the specified URL. Can also provide an exact redirect code [R=303]. Very similar to RedirectMatch, which is faster and should be used where possible.
S = Skip. Skips the next N rules ([S=N]) if this rule matches.
T = Type. Forces the MIME type of the returned content. Very similar to the AddType directive.
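A few of these flags in combination, as a hedged sketch. The HTTPS redirect is a commonly used pattern rather than one of the rules above, /thing.php with its id parameter is a hypothetical script used only to show QSA, and the .bak rule is likewise made up for illustration.

```apache
RewriteEngine On

# Redirect plain-HTTP requests to HTTPS with a permanent redirect, then stop.
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# /thing/42?asp=foo -> /thing.php?id=42&asp=foo (QSA keeps the original query)
RewriteRule ^/thing/([0-9]+)$ /thing.php?id=$1 [QSA,L]

# Refuse requests for backup files, whatever the case of the extension.
RewriteRule \.bak$ - [F,NC]
```

The "-" substitution means "leave the URL unchanged", which is the usual companion to flags such as F that only need the match itself.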

You know how I said that RewriteCond applies to one and only one rule? Well, you can get around that by chaining.

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html [C]
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg

Because the first RewriteRule carries the Chain flag, the second RewriteRule is evaluated only when the first one matches, which in turn happens only when the preceding RewriteCond is satisfied. Keep in mind that a chained rule is applied to the URL as the preceding rule left it. Handy if Apache regular expressions make your brain hurt. However, the all-in-one-line method shown in the first section is faster from an optimization point of view.
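One hedged way to see chaining at work is a second rule that matches anything and only sets an environment variable. FROM_SF is a made-up variable name; the point is that the second rule is evaluated only when the first one matched, and that it sees the URL as the first rule left it.

```apache
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html [C]
# Evaluated only if the rule above matched. "^" matches any URL and "-"
# leaves it unchanged, so this rule just records that the rewrite happened.
RewriteRule ^ - [E=FROM_SF:1]
```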

RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml

This can be made simpler through flags:

RewriteRule ^/blog/([0-9]{4})/([-0-9a-z]*)\.html /newblog/$1/$2.shtml [NC]

Some flags also apply to RewriteCond, notably NoCase:

RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [NC]

This will also match "ServerFault.com".