Understanding how Drupal gets from a page URL to page content is a complex topic involving many moving parts. In previous parts we've covered how Drupal initializes itself, we haven't discussed how a requested URL is routed within the CMS to the code that produces the web page. Handling this is the beating heart of Drupal -- the Menu System. Before we get to that, however, we need discussed modules. We glossed over them in the last part, but here we dive into detail.
It occurred to me after writing the last part of this series, I didn't go into more than high-level detail of drupal_bootstrap(). One of the biggest questions I had when I began this exploration was "How are modules loaded?" Coming from a self-taught C and C++ background, loading external code seemed a black art. Thankfully, PHP makes this very simple.
Apart from very simple applications or sample code, most programming projects are broken up into multiple files. C and C++ have a pre-processing directive called "#include". This isn't actual code, but a command sent to the compiler. What I failed to realize as I moved on to other languages -- especially those like PHP that are interpreted rather than complied -- is that PHP's method is actual code, and that's the key to how Drupal loads modules.
PHP has several ways to include a dependent file, include(), include_once(), require(), and require_once(). The difference between includes and requires is if an error is tossed if the file cannot be included. The *_once's prevent a file from being imported twice, thereby causing definition conflicts. You quickly learn that require_once() is the version you'll use the most. Another astonishing fact to C/C++ devs new to PHP is that all of these statements can accept variables.
How does Drupal load module code? Like this:
That's the core of it. You'll notice that the include_once() statement is used instead of the require_once(). This is because Drupal does not want to give you a White Screen of Death if a module's files have been misplaced. Drupal does use require_once() when loading other core files like bootstrap.inc or common.inc. After all, if those are missing, you're really in trouble!
It Can't be that Simple
You're right. It's not. While the basic idea of loading module code is an include_once() inside a foreach loop, the actual process is a bit more complicated. As we learned in the last part of this series, drupal_bootstrap() does not load all modules at once but in two distinct steps. During the DRUPAL_BOOTSTRAP_VARIABLES phase, several modules are loaded that are required for the bootstrap process to complete. If you're stepping through a default installation of Drupal, there are three bootstrap modules: devel, dblog, and overlay. The process looks like this:
module_load_all($bootstrap = TRUE)
module_list($bootstrap = TRUE)
Loop through list of bootstrap modules (devel, dblog, overlay)
We already know the first two, but the third, module_load_all() is new. This function gets a list of modules by calling module_list(). The $bootstrap parameter isn't used by module_load_all(), but is passed to module_list(), instructing it to return only bootstrap modules.
With an array of modules in hand, drupal_load() is called for each element. It checks to see if the file was already loaded, and then gets the module file name. A lot of care is taken in getting this file name and making sure that it's correct, and points to actual PHP code.
The module loading process is the same for non-bootstrap modules (including contrib modules) as it is for bootstrap modules. During the DRUPAL_BOOTSTRAP_FULL phase, a default installation of Drupal will load block, color, comment, contextual, dashboard, dblog[skipped], devel_generate, field, field_sql_storage, field_ui, file, filter, help, image, list, menu, node, number, options, overlay[skipped], path, rdf, search, shortcut, system, taxonomy, text, toolbar, update, user, devel[skipped], and standard.
Notice that in the above list, the bootstrap modules are drawn into the global module loading process. Thankfully, Drupal already knows that these modules have been loaded, and skips the process. The psudocode looks like it did earlier:
module_list($bootstrap = FALSE)
Loop through list of contrib modules
What about Hooks?
Loading modules isn't the only thing that's seemingly magic about modules. Module developers rely on a huge API of hooks into the Drupal process in order to add features, modify display, perform access restrictions, and so on.
It really comes down to a function called module_invoke(). It takes two parameters, the module name, and the hook to call. How it works its magic to call a function is via the PHP call_user_func_array() statement. This statement doesn't require a pointer or anything fancy to the function to invoke, just the function's name! Drupal modules implement individual hooks by creating functions with the moduleName_hookName() signature. This is very clearly represented in module_invoke():
Get any additional unnamed $arguments as an array
If the module implements the hook
return call_user_func_array($moduleName . '_' . $hook, $arguments)
The module_invoke function actually can take more than the two named parameters thanks to PHP. These additional parameters are packaged up as array and passed to call_user_func_array(), which in turns passes the contents of the array as parameters to the hook implementation. This, and the fact that module_invoke() returns the result of the invoked hook, make it a smart function invoker cabable of working across modules.
Often in Drupal core, you do not simply invoke a single hook implementation, but call all the implementations of a hook at a single point. Handling this is a sister function to module_invoke(), module_invoke_all(). It's used the same exact way as module_invoke(), but lacks the $moduleName parameter.
While modules provide Drupal's extensibility, Menu Routing does the real work of the CMS. At first blush, you may think that the "menu" refers to something like the primary and secondary links, or even the navigation block. After all, all of those are called "menus" in Drupal's own UI!
Repeat after me: Menu Routing has nothing to do with Menus, everything to do with Routing.
Page routing, specifically. The Menu [Routing] System is the core component of Drupal that takes a URL of the requested page and returns a viewable HTML web page. That is, it takes the part of the url after "?q=" in the follow example:
In the above example, the page part of the URL is "the/requested/page", and this is exactly what the menu system handles. It matches up the page URL with the PHP function needed to generate the web page, and executes the function. When coding a module, the Menu System is often the first thing you'll code after an *.info. file. It's perhaps the most approachable piece of API in the entire project, which is why I bring it up here:
$items[my/motley/page] = array(
'page callback' => 'motleymod_my_page',
return "<p>This is my motley page.</p>";
The above code would work relatively unchanged as far back as Drupal 5.0. We have two functions, the first implements hook_menu(), and the other generates and returns the page content. You'll notice that the first function defines and returns an array with one element with the key 'my/motley/page'. As you might have guessed, this is the page URL! The content of the element is itself an array containing one element, 'Page Callback'.
The Page Callback parameter instructs Drupal what PHP function to invoke to generate the page. In this case, motleymod_my_page(), which returns raw HTML to display in the body section of the webpage. Drupal takes care of everything else -- headers, blocks, footer, and all.
But That's Not the End of the Story
If you think the above example was a little spare, you're right. Let's look at a more complex example of hook_menu():
$items[my/motley/page] = array(
'page callback' => 'motleymod_my_page',
$items[admin/config/motley] = array(
'title' => "Configure Motley Module!",
'page callback' => 'drupal_get_form',
'page arguments' => array('motleymod_admin_settings'),
'access arguments' => array('administer motleymod'),
'type' => MENU_NORMAL_ITEM,
'file' => 'motleymod.admin.inc'
Whoa! What the heck is all that stuff? We see our original entry for 'my/motley/page', but now there's an additional entry for 'admin/config/motley'. This time, with a lot more parameters than just Page Callback:
- Title is the title of the page to display. It's used both in the <title> tag of the generated page, as well as in the <h1> tag in the body section of the page.
- Page Callback is the the magic parameter. It tells Drupal what function to call!
- Page Arguments specify the arguments to pass to the function specified in the Page Callback parameter.
- Access Arguments specify the access permissions (under admin/people/permissions) the current user must have in order to access the page.
- Type is the kind of menu item this item represents. More on that later.
- File tells Drupal in what file to find the function specified in the Page Callback parameter.
Again, we have a Page Callback, but something seems screwy: It's set to drupal_get_form() -- a function provided by Drupal itself. How the heck does that work? Over the history of Drupal, it was discovered that many module's Page Callbacks returned pages with similar structures. Furthermore, pages started falling into classifications of pages. Content pages usually have a title and a bit of text to display. Settings pages usually have a title and a form with a submit button. Drupal-provided page generators, like drupal_get_form(), handle a lot of rendering work for you, making module developer's lives a little easier. In this case, the function provides a module setting page.
Now that we have a framework, let's tie it back to where we were in our debugger. After drupal_bootstrap(), the last function called in index.php is menu_execute_active_handler(). Without any arguments, the function looks up the page URL, and then digs into the Menu System.
The first thing that it does is check if the site is "offline", that is, in Maintenance Mode. Administrators set this mode under admin/config/development/maintenance, and it temporarily blocks all other visitors save the Admin from accessing the site. This allows the Admin to perform site updates safely. If the site is offline, it doesn't matter what page was requested, all results are sent to _menu_site_is_offline(). This function checks to see if the user is the Admin, if so, normal page routing resumes.
The next thing menu_execute_active_handler() does is check if the Menu System needs to be rebuilt. This is where hook_menu() is invoked. Normally, Drupal 7 stores menu routing information in a database table -- menu_router. It does this because performing a database query is actually faster than searching through all implementations of hook_menu() to find a match. During a rebuild, the contents of menu_router are thrown out. Drupal then locks the database temporarily and recurses through each enabled module and invokes hook_menu(). This results in a huge array of menu items like we saw in the code sample above. This is then committed to the menu_router table.
After that, things get much more straightforward. The appropriate menu item is found, and the function specified in the Page Callback is called passing any Page Arguments as necessary. The result of the Page Callback is then sent to the user's browser for display.
After all we've covered, it's easy to simply accept that how Drupal is written is the best way it can be written. I know that when I started using Drupal, it was after I tried and failed to create my own content manager. "Let me learn from those more experienced than I!" This, however, blinded me to one of the biggest problems Drupal is facing today.
Drupal is heavy.
It used to be that each time a page was requested, you could assume that the entire page was requested. Graphics, HTML, and all. In the new world of refresh-less web apps, however, use AJAX requests to grab content or save information. Instead of sending the entire page, lighter packages of JSON formatted data are exchanged between the user's browser and the web server. While Drupal can handle these requests, it must still go through the entire initialization and routing process it would for a full web page. Worse yet, it must load all modules into memory with each request -- even if that module is not needed in that request.
I discovered this myself when I attended Crell's presentation during the Twin Cities Drupal Camp introducing the Web Services and Context Core Initiative. The goal of WSCCI (pronounced like the drink) is to ultimately replace the heavy drupal_bootstrap() and menu_router() system with a more agile core. Intead of Drupal being just a first-class CMS, it will be a REST server with a first-class CMS on top.
It's an ambitious but, I believe, necessary evolution of the platform. While targeted for Drupal 8, I have my doubts it will be fully realized until Drupal 9.
Unless we help, that is.
We've reached the end of index.php, but not of our series. There's still a lot more to explore, but the going gets more treacherous from here.