Tying Things Together — An Introduction To Routing
So far we have been concerned with how to display some information to the user, but not with how the user finds the content we want to show. But as there are many ways to access data, there are several ways to describe where to find what — Websites and REST use URLs, SOAP uses the SOAPAction and so on, and then, the application itself organizes itself into modules and actions. Routing ties all of this together, it maps URLs to modules and actions for websites, extracting parameters as needed, handles XML-RPC and SOAP and even provides a way to use agavi on the commandline.
For each of these tasks exists a specialized subclass of AgaviRouting, this chapter will mainly explain the basics and the web-specific AgaviWebRouting that maps URLs to actions.
Routing Basics
Routing is executed very early in the request processing and can access all data known to the process at that time. The web routing can access headers, cookies and the URL to determine which action should be executed.
The routing uses rules based on regular expressions to match the input and extract parameters as it finds them. For more elaborate examination of input data routing callbacks can be used.
If no appropriated action is found, the configured Error404 action is set as default. The result of the routing execution is a Execution Container which is then executed by Agavi.
Routing Configuration
<?xml version="1.0" encoding="UTF-8"?>
<ae:configurations xmlns:ae="http://agavi.org/agavi/config/global/envelope/1.0" xmlns="http://agavi.org/agavi/config/parts/routing/1.0">
<ae:configuration>
<routes>
<!-- default action for "/" -->
<route name="index" pattern="^/$" module="Posts" action="%actions.default_action%" />
<route name="post" pattern="^/post$" module="Posts" action="Post.Show" />
<!-- an example for a CRUD-style set of routes -->
<route name="products" pattern="^/products" module="Products">
<!-- do not put the action into the parent route, because that one is not anchored at the end of the pattern! -->
<route name=".index" pattern="^$" action="Index" />
<route name=".latest" pattern="^/latest$" action="Latest" />
<route name=".create" pattern="^/add$" action="Add" />
<!-- "Product" is not an action, but just a folder with sub-actions. if only this route, without children, matches, then the action cannot be resolved and a 404 is shown - exactly what we want! -->
<route name=".product" pattern="^/(id:\d+)" action="Product">
<route name=".view" pattern="^$" action=".View" />
<route name=".edit" pattern="^/edit$" action=".Edit" />
<route name=".delete" pattern="^/delete$" action=".Delete" />
<!-- the gallery page is optional here, but the request parameter should not contain the leading slash, so our special syntax is in order -->
<route name=".gallery" pattern="^/gallery(/{page:\d+})?$" action=".Gallery">
<!-- assume the "1" by default and tell the routing what the rest of the string will look like when generating a URL -->
<default for="page">/{1}</default>
</route>
</route>
</route>
</routes>
</ae:configuration>
</ae:configurations>
The first two routes are the index route and the route we created
earlier in the guide. There's a couple of things you can notice on them:
- Both routes have a name attribute. The name is used to reference the route in your code when you need to generate a URL.
- Both routes have a pattern attribute. The pattern is applied against the input to check whether the route matches. An empty pattern always matches - to match the empty string use "^$".
- Both routes have a module attribute. It controls which module is selected if this route matches.
- Both routes have an action attribute. It controls which action is run if this route matches.
- An attribute that is not used here but still important enough to mention is "stop". It takes a boolean value and defaults to true. If set to false, matching does not stop after the route matches, instead the matched part is stripped from the input and matching continues.
These are a few, but probably the most important attributes for a route. We'll discuss the other attributes for a route later on.
Nesting Routes
Routes can be nested as you can see in the generated sample part of the routing.xml. Child-routes inherit the information set by the parent route and can overwrite it or append to it, depending on the type of information. It is not possible to append to any attribute other than name and action. Any value that starts with a dot (.) is appended to the parent value, so if the parent is named "products" and the child ".index" as in our routing file, the full name for the child route would be "products.index". Any parent route cannot have the "stop" attribute set to false.
The input for a child-route is the parent's input stripped by the part matched by the parent.
Routing Patterns
Routing uses special type of pattern matching language which is similar to PCRE regular expressions but simpler. Routing patterns are optimized for URL matching so that you do not have to escape common URL parts every time.
Just like with PCRE, a pattern can begin with a circumflex (^) and/or end with a dollar sign ($). By default, the pattern matches any part of the examined value. Use of these anchors forces the pattern to match only in the beginning and the end of the examined value respectively. So, "foo" matches any string that contains the word "foo", "^foo" matches only strings that begin with "foo" and "foo$" will match strings that end with "foo". Using both anchors means that the entire examined value must match the pattern.
The body of the pattern is a plain string which matches the examined value literally. To match parts with a regular expression, you have to enclose the relevant parts of the pattern in parentheses (called capture groups) which behave almost the same way as in true PCRE. Capture groups are also used to identify the parts of URL which become action parameters, as you can see in the products.product route. It defines an action parameter named "id" that consist of one or more digits. Capture groups may contain an optional prefix and/or postfix, in that case the contained regular expression must be enclosed in curly braces as seen in the products.product.gallery route.
We'll get to a more detailed description of routing patterns later.
How Routes Are Matched
The definitions in the routing.xml are processed top to bottom, the first matching route definition ends the processing unless it has its "stop" attribute set to false. If a parent route is matched, matching continues until all child routes have been tried and stops after that. When no matching definition is found, the configured Error404 page is displayed.
Organizing Routes For Best Performance
There are a few things to keep in mind when organizing routes.
- Non-stopping routes go to the place where they are required, mostly to the top.
- Avoid routes that match partial strings of other routes at the same level. This forces an order on routes that may be confusing - once you start shuffling things around, other things break. It also enforces an order that may be suboptimal for performance.
- Anchor patterns. Keep in mind that patterns that are anchored are way faster to match and that errors are less likely to occur.
- Reduce the number of first level routes. The less patterns need to be matched in a typical scenario, the better, so a deeper nesting is better than lots of first level branches.
- Move often calles routes to the top. The earlier a match is found, the better. Keep in mind that the index route ("/") can account for large percentages of your hits and is extremly fast to match. In most cases, it should be the first route in your routing.xml.
- Move routes that are quick to match to the top. This includes all short and simple patterns, pure string matches with no included parameters etc.
- Take care that your routes are anchored to the right as well. This prevents arbitrary URLs from being matched.
Let's go over the rest of the routing.xml and explain what each part does. It's a sample for a CRUD-style set of routes for a fictional product catalogue. The parent route is named "products" and just contains the pattern ^/products and a module defintion. If the rules are applied against any url that starts with "/products", the module is set to "Products" and the leading part is stripped from the input. If none of the child routes matches, the 404 page is displayed, because the action is not set. The products.index route has a pattern that matches the empty string and sets the action to "Index", so that the url "/products" is routed to the Products_IndexAction. We could achieve the same effect if we just set the action to Index in the products route, but then any url starting with "/products" that is not matched by a child-route would still go to the Products_IndexAction and this is not what we intended - we want the url "/productsfoobar" to go to the 404 page.
The next important thing to note is the products.product route. It sets the action to "Product" but as the comment explains, this action itself should not exist. It's just a folder for all the subactions for a product. The pattern matches and url that contains a slash followed by an id of only digits. So the url "/products/123" would first get matched by the route "products" where the string "/products" is cut from the url. This leaves us with "/123" which will be matched by the products.product route. This would still send us to the 404 page as the Products_ProductAction does not exist, but the remaining empty string will get matched by the products.product.view route. So the final action is Products_ProductViewAction.
The last interesting route is the products.product.gallery route. It contains an optional paging parameter and by using the defaults tag sets a default value in case it's not provided by the user. Note how the optional page parameter contains a "/" as prefix and how the prefix must be placed in the default as well.
As you can see, all routes that actually map to an existing Module/Action pair are anchored on the right. That prevents that urls are matched that start out with a valid part but contain trailing garbage.

