When looking at Drupal code, it can be hard to know where to start. Dive into any of the directories in a standard installation, and you'll find file after file of PHP code. In the root directory alone, you'll find five files! Where do we start? Working this out is much simpler than it seems; all you have to do is look at the URL of any Drupal site.
The above is a typical URL for a Drupal site. Most Drupal sites use a feature called "Clean URLs". This hides a lot of the nuts and bolts of the address that would help us understand it. Without Clean URLs enabled, the above address looks like this:
Okay, that's a little helpful, but it doesn't tell us what we want to know. Which file amongst all the files in Drupal do we look at? It turns out that most web servers will look for a default file in the site's root directory if one isn't specified in the URL. In the Web 1.0 days, this was "index.html"; for modern PHP websites it is "index.php".
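To make the default-file idea concrete, here is a minimal sketch of how an address breaks down into the file the server runs and the Drupal path carried in ?q=. The example.com addresses and the sketch_split_url() helper are made up for illustration; this is not how Drupal or your web server actually implements the lookup.

```php
<?php
// Hypothetical addresses, used only for illustration.
$urls = array(
  'http://example.com/?q=node/1',
  'http://example.com/index.php?q=node/1',
);

/**
 * Returns the file a server would run and the Drupal path found in ?q=.
 *
 * Illustrative helper only; it is not part of Drupal.
 */
function sketch_split_url($url, $default_file = 'index.php') {
  $parts = parse_url($url);
  $file = isset($parts['path']) ? $parts['path'] : '/';
  // A path ending in "/" names a directory, so fall back to the default file.
  if (substr($file, -1) === '/') {
    $file .= $default_file;
  }
  $query = array();
  if (isset($parts['query'])) {
    parse_str($parts['query'], $query);
  }
  return array(
    'file' => $file,
    'q' => isset($query['q']) ? $query['q'] : '',
  );
}

foreach ($urls as $url) {
  $result = sketch_split_url($url);
  print $url . ' => ' . $result['file'] . ' with q=' . $result['q'] . "\n";
}
```

Both addresses end up at the same file with the same q value, which is why index.php is the place to start reading.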
Four Lines to Rule Them
When we open index.php, we might expect a lot of code. Instead (aside from comments), we find just four lines:
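define('DRUPAL_ROOT', getcwd());
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
menu_execute_active_handler();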
The first two lines are very straightforward. The first line defines a constant, DRUPAL_ROOT, containing the file path to the Drupal installation. This constant is then used in the second line to load bootstrap.inc from the includes/ subdirectory. Despite the file extension, bootstrap.inc contains PHP code. Drupal uses the *.inc extension for files that are meant to be included rather than requested directly, and the .htaccess file that ships with Drupal tells Apache to refuse requests for them. This way, someone cannot plug in http://example.com/includes/bootstrap.inc and get back a valid page.
That leaves two more lines, and that's where all the fun starts. At this point we've only defined the DRUPAL_ROOT constant and imported an include file. We have yet to execute any real code. Any Drupal page request can be thought of as a two-step process (if you're a core dev, insert your groans here): initialize the Drupal environment, and then handle the page request. The third line of code, drupal_bootstrap(), does the initialization -- it forges a connection to the database, loads modules, loads needed data into memory, and prepares everything for the final line of code in index.php.
The menu_execute_active_handler() function handles the page request. This is the part that takes the ?q= half of the URL and produces a web page for the visitor to see. The function itself is not new in Drupal 7, but in Drupal 6 index.php still had to inspect its return value and render either the page or an error itself. Drupal 7 moves that delivery step inside the function, which makes index.php much, much cleaner.
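The two-step idea (initialize, then handle the request) can be sketched in a few lines. Everything here is a toy under loud assumptions: the sketch_* functions and the route table are invented for this example, and Drupal's real menu system is far more involved than an array lookup.

```php
<?php
// Hypothetical stand-ins; none of these names exist in Drupal.

/**
 * Step 1: prepare whatever the handlers need (here, just a route table).
 */
function sketch_prepare_routes() {
  return array(
    'node/1' => function () { return 'Content of node 1'; },
    'user' => function () { return 'User page'; },
  );
}

/**
 * Step 2: find the callback registered for the requested path and run it.
 */
function sketch_execute_handler(array $routes, $path) {
  if (isset($routes[$path])) {
    return call_user_func($routes[$path]);
  }
  return 'Page not found';
}

$routes = sketch_prepare_routes();
// Fall back to a sample path when there is no ?q= (for example, on the command line).
$path = isset($_GET['q']) ? $_GET['q'] : 'node/1';
print sketch_execute_handler($routes, $path) . "\n";
```

Keeping the two steps apart means the setup work can be reused by scripts that never serve a page at all.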
Making Invocation Easier
In the last article in this series, I mentioned that invoking the debugger was somewhat tricky. Eclipse assumes that any code you're debugging is in a project and not in external files. It turns out that the PHP Development Tools (PDT) plugin actually anticipates this problem and solves it rather easily.
- Create a new PHP project by using the File menu:
File -> New -> PHP Project
- In the provided field, enter a Name of your choice that is unique within your workspace.
- Under the Contents box, select the Create Project at existing location (from existing source) radio button.
- Click the Browse button adjacent to the radio button. Select the root of your Drupal install.
- Click Finish.
This creates a new project in Eclipse that imports all of Drupal's files. This mode of project creation does not create a copy of the files within the Eclipse workspace, but rather creates references to the actual files on the web server. Thus, any changes you make to the files in the Eclipse UI affect Drupal's actual files. This is great for our purpose of exploration, but it can be problematic. Remember, don't hack core!
Once the project is created, invoking the debugger is easy, easy, easy. Double-click index.php in the project explorer, then start the debugger with Run -> Debug As -> PHP Web Page.
Booting Up Drupal
Once you've started a new debugging session, you can dig into drupal_bootstrap(). The function looks a little circuitous at first: an array of phases is initialized, and then a while loop begins, iterating through an eight-step process in which key subsystems and their dependents are initialized in the correct order. Here's how it boils down:
- DRUPAL_BOOTSTRAP_CONFIGURATION -- Do some basic setup and load settings.php.
- DRUPAL_BOOTSTRAP_PAGE_CACHE -- If page caching is enabled, and the request is asking for a cached page, return the page.
- DRUPAL_BOOTSTRAP_DATABASE -- Connect to the database.
- DRUPAL_BOOTSTRAP_VARIABLES -- Load variables and enabled modules required for bootstrap.
- DRUPAL_BOOTSTRAP_SESSION -- Load the user's session from the DB. If the request isn't from a logged-in user, return an anonymous user.
- DRUPAL_BOOTSTRAP_PAGE_HEADER -- Invoke hook_boot(), send default HTTP headers.
- DRUPAL_BOOTSTRAP_LANGUAGE -- Initialize the language system for translations.
- DRUPAL_BOOTSTRAP_FULL -- The phase name is better thought of as "Last" rather than "Full". Load all enabled modules. Invoke hook_init().
Why the added complexity? Why not just use a static series of functions? The initialization process can be very complicated. For example, you may be in the middle of one phase, like DRUPAL_BOOTSTRAP_PAGE_CACHE, and need to load something from the database. The code that's initializing the page cache will then invoke drupal_bootstrap(DRUPAL_BOOTSTRAP_DATABASE). Furthermore, each phase depends on every preceding phase -- you cannot do the DRUPAL_BOOTSTRAP_VARIABLES phase without first doing DRUPAL_BOOTSTRAP_CONFIGURATION, DRUPAL_BOOTSTRAP_PAGE_CACHE, and DRUPAL_BOOTSTRAP_DATABASE. index.php itself relies on this when it calls drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL). Since DRUPAL_BOOTSTRAP_FULL is the last phase in the list, it depends on all previous phases.
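Here is a minimal sketch of that pattern: a queue of phases, a while loop, and static variables so that a nested call picks up where the outer one left off. It is a simplified illustration, not Drupal's actual code; the SKETCH_* constants, the shortened four-phase list, and sketch_bootstrap() are all invented for this example.

```php
<?php
// Hypothetical, shortened phase list; Drupal's real constants are DRUPAL_BOOTSTRAP_*.
define('SKETCH_CONFIGURATION', 0);
define('SKETCH_PAGE_CACHE', 1);
define('SKETCH_DATABASE', 2);
define('SKETCH_FULL', 3);

/**
 * Runs all phases up to and including $phase that have not run yet.
 *
 * Returns the phases started so far, in the order they started.
 */
function sketch_bootstrap($phase = NULL) {
  // Static variables keep their values between calls, including nested ones.
  static $pending = array(SKETCH_CONFIGURATION, SKETCH_PAGE_CACHE, SKETCH_DATABASE, SKETCH_FULL);
  static $reached = -1;
  static $started = array();

  if (isset($phase)) {
    while ($pending && $phase > $reached) {
      // Take the phase off the queue first so a nested call cannot start it again.
      $current = array_shift($pending);
      $reached = $current;
      $started[] = $current;
      if ($current === SKETCH_PAGE_CACHE) {
        // Mid-phase dependency: the cache needs the database, so request it now.
        sketch_bootstrap(SKETCH_DATABASE);
      }
    }
  }
  return $started;
}

// Asking for the last phase runs every earlier phase on the way there.
print implode(', ', sketch_bootstrap(SKETCH_FULL)) . "\n";
// Asking again does nothing new, because the queue is already empty.
print count(sketch_bootstrap(SKETCH_FULL)) . "\n";
```

The nested call made during the page cache step runs the database phase early, and the outer loop then skips straight to the final phase, so each phase runs exactly once no matter who asks for it.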
Each phase of the bootstrap process has a lot of code behind it. While I could write an article for each, there's so much more Drupal to cover. We haven't even served a page yet! That's what we'll do next when we cover menu_execute_active_handler().