What is a good localhost server to use

Configure a minimal Apache server

Title: Configuring Apache Minimally
Author: Christian Folini
Tutorial no: 2
Last update: 10/28/2019
Release date: 11/06/2010
Difficulty: easy
Duration: 1/2 h

What are we doing?

We configure a minimal Apache web server and talk to it with curl, the TRACE method and ab.

Why are we doing this?

A secure server is one that permits only as much as is really needed. Ideally, you build a server starting from a minimal system and add features one at a time. This also helps with understanding, because only then do you know what is actually configured. A minimal system is useful for troubleshooting too: if the error does not occur in the minimal system, you switch the features on one by one and look for the error after each step. As soon as it appears, you know it is tied to the configuration directive that was enabled last.

Requirements

Step 1: Create a minimal configuration

Our web server is installed on the file system together with its default configuration file. That default configuration is very extensive and difficult to understand. But at least everything is still in a single file. In the Apache packages of the various Linux distributions, the default configuration is not only very complicated, it is also fragmented into separate files that are spread over several directories. This can make it difficult to get a good idea of what is actually going on. For the sake of simplicity, we replace this extensive configuration file with the following, greatly simplified configuration.
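A configuration matching the walkthrough in Step 2 could look roughly like the sketch below. It is an illustration, not the author's verbatim file. The directive names and most values come from the description in Step 2. The admin address `root@localhost`, the `logs/httpd.pid` path, the module file names under `modules/`, and both log format strings are assumptions. So is the output file name `httpd.conf_minimal`, which is written to the current directory so that nothing existing is overwritten.

```shell
# Target file name is an assumption; override CONF_FILE to write elsewhere.
CONF_FILE="${CONF_FILE:-./httpd.conf_minimal}"

cat > "$CONF_FILE" <<'EOF'
ServerName              localhost
ServerAdmin             root@localhost
ServerRoot              /apache
User                    www-data
Group                   www-data
PidFile                 logs/httpd.pid

ServerTokens            Prod
UseCanonicalName        On
TraceEnable             Off

Timeout                 10
MaxRequestWorkers       100

Listen                  127.0.0.1:80

LoadModule              mpm_event_module        modules/mod_mpm_event.so
LoadModule              unixd_module            modules/mod_unixd.so
LoadModule              log_config_module       modules/mod_log_config.so
LoadModule              authn_core_module       modules/mod_authn_core.so
LoadModule              authz_core_module       modules/mod_authz_core.so

# Assumed format strings: microsecond timestamp, module and level,
# client address, request log ID, message.
ErrorLogFormat          "[%{cu}t] [%-m:%-l] %-a %-L %M"
LogFormat               "%h %l %u [%{%Y-%m-%d %H:%M:%S}t.%{usec_frac}t] \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

LogLevel                debug
ErrorLog                logs/error.log
CustomLog               logs/access.log combined

DocumentRoot            /apache/htdocs

<Directory />
    Require all denied
    Options SymLinksIfOwnerMatch
</Directory>

<VirtualHost 127.0.0.1:80>
    <Directory /apache/htdocs>
        Require all granted
        Options None
    </Directory>
</VirtualHost>
EOF
```

Once the result looks right, the file can be copied over the server's own configuration file.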

Step 2: Understand the configuration

Let's go through this configuration step by step.

We set ServerName to localhost because we are still working in a lab setup. In production, the fully qualified host name of the service must be entered here. Put simply, this is what people colloquially call the URL.

The server needs an email address for the administrator, mainly so that it can be displayed on error pages. It is set with ServerAdmin.

The ServerRoot directory is the main or root directory of the server. It is the symlink we set up as a trick in Tutorial 1. This pays off now: by repointing the symlink we can try out differently compiled Apache versions side by side without changing anything in the configuration file.

With User and Group we tell the server which user and group to run as. This matters because we want to avoid running the server as root. More precisely, the master or parent process does run as root, but the actual server or child processes and their threads run under the name set here. The user www-data is the usual name on a Debian / Ubuntu system. Other distributions use different names. Please make sure that the user name you choose and the associated group actually exist on the system.
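Whether the chosen account exists can be checked from the shell before starting the server. This is a small sketch: `RUN_USER` and `user_exists` are names made up for this example, and the check only covers the user and its primary group, not an arbitrary group name.

```shell
# Account name taken from the text; change it for other distributions.
RUN_USER="${RUN_USER:-www-data}"

user_exists() {
    # id exits with a non-zero status when the account is unknown.
    id -u "$1" >/dev/null 2>&1
}

if user_exists "$RUN_USER"; then
    echo "user $RUN_USER exists, primary group: $(id -gn "$RUN_USER")"
else
    echo "user $RUN_USER not found; choose a different User and Group"
fi
```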

PidFile specifies the file to which Apache writes its process ID. The path chosen here matches the default. It is listed so that you do not have to look it up in the documentation later.

ServerTokens defines how the server identifies itself. The value Prod selects production tokens. This means that the server announces itself only as Apache, without version number or loaded modules, and is thus a bit more discreet. Let us have no illusions: the server version can be determined with little effort over the network, but we still do not need to send it along with every response.

UseCanonicalName tells the server which host name and which port to use when it has to construct a link to itself. With the value On we specify that the ServerName is used. The alternative would be to use the Host header sent by the client, which we do not want in our setup.

The TraceEnable directive prevents certain reconnaissance attacks on our setup. The HTTP method TRACE instructs the web server to echo back the request it received, one to one. This makes it possible to find out whether a proxy server sits in between and whether it modified the request. Nothing would be lost in our simple setup, but in a corporate network you would rather keep this information confidential. So, to be on the safe side, we switch TraceEnable off by default.

Timeout denotes, roughly speaking, the maximum time in seconds that may be spent processing a request. In fact it is a bit more complicated, but the details need not concern us for the moment. The default of 60 seconds is very high. We reduce it to 10 seconds.

MaxRequestWorkers is the maximum number of threads that work in parallel to answer requests. The default is again rather high. We set it to 100. If we ever reach this value in production, we are handling a lot of traffic.

By default, the Apache server listens on every available address. For our tests we restrict it to the IPv4 localhost address and listen on the standard HTTP port 80. Several Listen directives one after the other are perfectly possible; for now one is enough.

Now we load five modules:

  • mpm_event_module: process model "event"
  • unixd_module: Access to Unix usernames and groups
  • log_config_module: Free definition of the access log
  • authn_core_module: basic module for authentication
  • authz_core_module: basic module for authorization

In Tutorial 1 we compiled all the modules shipped with Apache. Here we load only the most important ones. mpm_event_module and unixd_module are required to run the server. When compiling in the first tutorial we chose the event process model, which we now activate by loading the module. Worth noting: with Apache 2.4, even a setting as fundamental as the server's process model can be chosen in the configuration. We need the unixd module to run the server under the user name we defined, as described above.

The logging module log_config_module lets us define the access log format freely, which we make use of below. Finally, there are the two modules authn_core_module and authz_core_module. The first part of each name refers to authentication (authn) and authorization (authz). Core means that these modules form the basis for these functions.

When it comes to access protection, people often speak of AAA, that is, authentication, authorization and access control. Authentication means checking the identity of a user. Authorization means determining the access rights of a previously authenticated user. Access control, finally, means deciding whether an authenticated user is granted access, given the access rights just determined. We lay the foundation for this mechanism by loading these two modules. All other modules with the abbreviations authn and authz, of which there are many, depend on them. For the moment we actually need only the authorization module, but by loading the authentication module we prepare for future extensions.

With ErrorLogFormat we modify the format of the error log file. We extend the usual log format slightly by defining the timestamp very precisely. In the resulting entries the date is written in reverse, from year to day, followed by the time with microsecond precision. Writing the date this way has the advantage that timestamps sort cleanly in the log file; the microseconds tell us precisely when an entry was written and allow some conclusions about how long processing took in the various modules. The next part of the format serves the same purpose: it names the logging module and the log level, i.e. the severity of the message. This is followed by the IP address of the client, then a unique ID for the request, which can be used to correlate requests in later tutorials, and finally the actual message.
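The benefit of writing the date from year to day can be shown with a small sketch. The three timestamps are made-up sample values, not real log entries. Because the most significant unit comes first, a plain lexical sort puts them in chronological order.

```shell
# Made-up sample timestamps: year first, microseconds at the end.
stamps='2015-10-03 05:55:08.218163
2015-09-24 06:52:45.991532
2015-10-03 05:55:08.001200'

# Plain lexical sort; LC_ALL=C avoids locale-dependent collation.
sorted=$(printf '%s\n' "$stamps" | LC_ALL=C sort)
printf '%s\n' "$sorted"
```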

With LogFormat we define a format for the access log file. We call it combined. This common format includes the client IP address, timestamp, method, path, HTTP version, HTTP status code, response size, referer and the name of the browser (user agent). We choose a rather complicated construction for the timestamp. The reason is that we want timestamps in the same format in the error log and in the access log. While the error log offers a simple shorthand for this, in the access log format we have to assemble the timestamp laboriously.

We set the LogLevel for the error log file to debug, a very verbose level. It is too chatty for production, but it makes perfect sense in the lab. Apache is generally not very chatty, so the volume of data is usually manageable.

With ErrorLog we assign the error log file the path logs/error.log. This path is relative to the ServerRoot directory.

We now use the LogFormat combined defined above for our access log file, named logs/access.log.

The web server delivers files. It either finds them on a disk partition or generates them with the help of an installed application. We stick with the simple case and use DocumentRoot to tell the server where to find the files. /apache/htdocs is an absolute path below the ServerRoot. A relative path would also work here, but we prefer to be explicit. Concretely, DocumentRoot means that the URL path / is mapped to the operating system path /apache/htdocs.

Next comes a Directory block. With this block we prevent files outside our designated DocumentRoot from being delivered. For the path / we forbid all access with the directive Require all denied. This entry references the authentication (all), makes a statement about authorization (Require) and defines the access: denied, that is, no access for anyone; at least not to the directory /.

We set the Options directive to SymLinksIfOwnerMatch. With Options we can specify which special features apply when serving the directory /. Actually none at all, so in production we would write Options None. But in our case DocumentRoot points to a symbolic link, and it is only followed if we instruct the server with SymLinksIfOwnerMatch to allow symlinks below /, at least where the ownership matches. For security reasons, it is better to avoid symlinks when serving files on production systems. But on our test system, convenience still comes first.

Now we open a VirtualHost. It corresponds to the Listen directive defined above. Together with the Directory block just defined, it establishes that our web server permits no access by default. On IP address 127.0.0.1, port 80, however, we do want to allow access, and that is defined inside this block.

Specifically, we allow access to our DocumentRoot. The key directive here is Require all granted, with which, in contrast to the directory /, we allow full access. Unlike above, symlinks are not followed below this path, and there are no other special features either: Options None.

Step 3: Start the server

That completes the description of our minimal server. It would be possible to define an even leaner server. But it would not be as comfortable to work with as ours, nor would it be any more secure. A certain amount of basic security is advisable, however. If we set up a service in the lab, it should be possible to move it to a production environment with targeted adjustments. Trying to secure a service from scratch shortly before going live is an illusion.

Let's start the server in the foreground again, as in Tutorial 1, and not as a daemon:
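Starting the server in the foreground might look like the sketch below. It assumes that the installation from the previous tutorial lives under /apache, that the minimal configuration has replaced the default configuration file there, and that sudo is available. The names `APACHE_ROOT` and `start_foreground` are made up for this example. The flag -X is httpd's debug mode, which runs a single worker and does not detach from the terminal.

```shell
# Assumed install prefix; override APACHE_ROOT if yours differs.
APACHE_ROOT="${APACHE_ROOT:-/apache}"

start_foreground() {
    # -X: debug mode, one worker, stays attached to the terminal.
    # Root is needed because the configuration binds to port 80.
    (cd "$APACHE_ROOT" && sudo ./bin/httpd -X)
}

# Invoke start_foreground to launch the server; stop it with Ctrl-C.
```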

Step 4: Talk to the server with curl

We could now talk to the server with a browser again. But working from the shell is cleaner at first and makes it easier to understand what is happening:
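Such a call could look as follows. The URL is an assumption: it presumes that an index.html exists in the DocumentRoot, as the later steps of this tutorial suggest. `URL` and `fetch_page` are names chosen for this sketch.

```shell
# Assumed URL of the test page on the local server.
URL="${URL:-http://localhost/index.html}"

fetch_page() {
    # Plain GET; curl prints the response body to standard output.
    curl "$URL"
}

# Invoke fetch_page while the server from Step 3 is running.
```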

This gives the following result.

So we made an HTTP call and received a response from our minimally configured server that met our expectations.

Step 5: Examine the request and response

So that is what happens with an HTTP request. But what exactly does the server reply? To find out, we call curl again, this time with the verbose option.
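A verbose call might look like the following sketch, again assuming the test page URL used in the previous step. `fetch_verbose` is a name made up here.

```shell
# Assumed URL of the test page on the local server.
URL="${URL:-http://localhost/index.html}"

fetch_verbose() {
    # --verbose adds curl's connection notes and the request and
    # response header lines (on stderr) to the body (on stdout).
    curl --verbose "$URL"
}

# Invoke fetch_verbose while the server is running.
```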

The lines marked with * are curl's messages about setting up and tearing down the connection. They do not represent network traffic. Lines starting with > show the request, and lines starting with < show the response.

An HTTP call actually consists of four parts:

  • Request line and request header
  • Request body (optional and missing here for a GET request)
  • Response header
  • Response body

The first parts need not interest us here. The response headers are the interesting part. This is where the web server describes its answer. The actual answer, the response body, then follows after a blank line.
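The split between response headers and response body can be made visible with a small sketch. The response text is a hand-written sample shaped after the headers discussed in this step (status line, Server, Content-Length), not captured server output. `sample_response` is a name made up here. The awk one-liners rely only on the blank line that separates the two parts.

```shell
# Hand-written sample: header lines end in CR NL, then a blank line, then the body.
sample_response() {
    printf '%s\r\n' 'HTTP/1.1 200 OK' 'Server: Apache' 'Content-Length: 45' ''
    printf '<html><body><h1>It works!</h1></body></html>\n'
}

# Everything before the first blank line is the response header ...
headers=$(sample_response | tr -d '\r' | awk 'NF == 0 { exit } { print }')
# ... and everything after it is the response body.
body=$(sample_response | tr -d '\r' | awk 'seen { print } NF == 0 { seen = 1 }')

printf 'headers:\n%s\n\nbody:\n%s\n' "$headers" "$body"
```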

What do the headers say one after the other?

First comes the status line with the protocol including its version, then the status code. 200 OK is the normal response of a web server. On the next line we see the server's date and time. The following line begins with an asterisk, *, which marks it as a line belonging to curl. The message has to do with curl's handling of HTTP pipelining, which need not concern us further. Then comes the Server line, on which our web server identifies itself as Apache. This is the tersest possible identification. We defined it with ServerTokens Prod.

The server then reports when the file underlying the response was last modified, i.e. the Unix modification timestamp. ETag and Accept-Ranges need not interest us for the moment. More interesting is Content-Length. It states how many bytes to expect in the response body. In our case that is 45 bytes.

Incidentally, the order of these headers is characteristic of a web server. NginX uses a different order and, for example, places the Server header before the date. Apache can therefore be identified even if the Server line tries to mislead us.

Step 6: Examine the response a little more closely

It is possible to look a little deeper into the communication with curl. This is done with the command line parameter --trace-ascii:
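A call with this parameter might look like the sketch below, assuming the same test page URL as before. `fetch_trace` is a name made up here.

```shell
# Assumed URL of the test page on the local server.
URL="${URL:-http://localhost/index.html}"

fetch_trace() {
    # --trace-ascii expects a file name; "-" stands for standard output.
    curl "$URL" --trace-ascii -
}

# Invoke fetch_trace while the server is running.
```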

The --trace-ascii parameter takes a file as its argument and writes an ASCII dump of the communication to it. "-" works as a shortcut for STDOUT, so that we can display the dump directly.

Compared to verbose, trace-ascii gives more detail on the number of bytes transferred in the request and response phases. In the example above, the request headers comprised 83 bytes. In the response, the bytes are then listed per header line and as a total for the response body: 45 bytes. This may all sound like hair-splitting. But it can be crucial when something is missing and you are not quite sure what was delivered where and in which order. For example, it is noticeable that 2 bytes are added to each header line. These are the CR (carriage return) and NL (new line) with which the HTTP protocol terminates header lines. The response body is different: only what is actually in the file is returned, which here is evidently just an NL without a CR. On the third line from the bottom (0000: html ...) the greater-than sign is followed by a period. This stands for the NL character of the response, which, like other escape sequences, is rendered as a period.
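The two extra bytes per header line can be checked with a quick count. The sketch uses a sample header line and a sample one-line HTML body, assumed here to stand in for the 45-byte response discussed in this step; wc -c counts bytes.

```shell
# A header line as sent on the wire: text plus CR and NL.
header_bytes=$(printf 'Server: Apache\r\n' | wc -c | tr -d ' ')
# The same text without the line terminator.
text_bytes=$(printf 'Server: Apache' | wc -c | tr -d ' ')
# Assumed body: one line of HTML ending in a single NL, no CR.
body_bytes=$(printf '<html><body><h1>It works!</h1></body></html>\n' | wc -c | tr -d ' ')

echo "header line: $text_bytes + 2 = $header_bytes bytes"
echo "body: $body_bytes bytes"
```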

Step 7: Work with the TRACE method

Above I described the TraceEnable directive. We switched it off to be on the safe side. It can be very useful for troubleshooting, though. So let's try it out and set the option to On:

We restart the server and issue the following curl request.
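A TRACE request with curl might look like the sketch below. It assumes that TraceEnable has been set to On in the configuration, that the server has been restarted, and that the test page URL from the earlier steps is used. `send_trace` is a name made up here.

```shell
# Assumed URL of the test page on the local server.
URL="${URL:-http://localhost/index.html}"

send_trace() {
    # --request overrides the HTTP method; TRACE asks the server to echo
    # the request back in the response body.
    curl --verbose --request TRACE "$URL"
}

# Invoke send_trace while the server is running with TraceEnable On.
```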

So we call the familiar URL with the HTTP method TRACE (instead of GET). As a result we expect the following:

In the body the server repeats the information about the request it received, as intended. In fact, the lines here are identical. So we can confirm that nothing happened to the request en route. Had we passed through one or more proxy servers, however, there would definitely be additional header lines here, which we as the client would then be able to see as well. We will get to know more powerful troubleshooting tools later. Still, we should not ignore the TRACE method entirely.

Do not forget to turn TraceEnable off again.

Step 8: Put the server to the test with "ab"

That's it for the simple server for now. But just for fun, we can put it through its paces a little more. We stage a small load test with ab, short for Apache bench. This is a very simple load testing program that is always at hand and delivers quick first results on performance. I like to run it before and after a configuration change to get an idea of whether performance has changed. ab is not very powerful, and a local call does not yield clean results either. But it does give a first impression.
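A run with a concurrency of 1 and 1000 requests, as used in this step, might look like the sketch below. It assumes that ab was built alongside httpd and sits in bin/ under /apache, and that the test page URL from the earlier steps is used. `APACHE_ROOT` and `run_benchmark` are names made up here.

```shell
# Assumed install prefix and test page URL.
APACHE_ROOT="${APACHE_ROOT:-/apache}"
URL="${URL:-http://localhost/index.html}"

run_benchmark() {
    # -c: number of requests in flight at once, -n: total number of requests.
    "$APACHE_ROOT/bin/ab" -c 1 -n 1000 "$URL"
}

# Invoke run_benchmark while the server is running.
```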

We start with a concurrency of 1, meaning that only one request is in flight at a time. In total we send 1000 requests to the familiar URL. Here is the output from ab:

What interests us in particular is the number of errors (Failed requests) and the number of requests per second (Requests per second). A value above one thousand is a good start. Especially since we are still working with a single process and not with a parallelized daemon (which is also why the concurrency level is set to 1).

Step 9: View directives and modules

At the end of this tutorial, we look at the various directives known to an Apache started with our configuration file. The loaded modules extend the server's set of directives. The configuration parameters made available this way are well documented on the project website. Still, in special cases it can be helpful to have an overview of the directives provided by the loaded modules. The directives are listed with the command line flag -L.
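Listing the directives might look like the sketch below, assuming the installation under /apache from the previous tutorial. `APACHE_ROOT` and `list_directives` are names made up here.

```shell
# Assumed install prefix; override APACHE_ROOT if yours differs.
APACHE_ROOT="${APACHE_ROOT:-/apache}"

list_directives() {
    # -L prints the known directives, each with a short description.
    (cd "$APACHE_ROOT" && ./bin/httpd -L)
}

# Invoke list_directives; piping the result into a pager helps,
# because the list is long.
```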

The directives follow the order in which the modules are loaded. Each directive comes with a brief description of its function.

With this list we can now find out whether all loaded modules are actually needed, that is, referenced in the configuration. In more complicated configurations with numerous loaded modules, you can end up unsure whether you really use all of them.

You can read the modules from the configuration file, extract the directives of each module from the output of httpd -L, and then look in the configuration file again to see whether any of those directives is used. This nested query is a nice finger exercise that I highly recommend. I solved it as follows:
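One possible way to write such a nested query is sketched below; it is not the author's own solution. It assumes the installation under /apache with the configuration in conf/httpd.conf, and that httpd -L prints each directive as `Name (source_file.c)`, with the source file named after the module (and without the mpm_ prefix for the process model). The matching is deliberately loose. `directives_of` and `check_modules` are names made up here.

```shell
# Assumed locations; override them if your layout differs.
APACHE_ROOT="${APACHE_ROOT:-/apache}"
CONF="${CONF:-$APACHE_ROOT/conf/httpd.conf}"

# Read "httpd -L" output on stdin and print the directives of one module.
directives_of() {
    name=${1#mpm_}
    grep -E "\((mod_)?$name\.c\)" | cut -d' ' -f1 | tr -d '<'
}

# For every LoadModule line, report whether the configuration uses at
# least one directive of that module.
check_modules() {
    listing=$(cd "$APACHE_ROOT" && ./bin/httpd -L)
    grep '^LoadModule' "$CONF" | awk '{ print $2 }' | sed 's/_module$//' |
    while read -r mod; do
        pattern=$(printf '%s\n' "$listing" | directives_of "$mod" | xargs | tr ' ' '|')
        if [ -n "$pattern" ] && grep -Eq "$pattern" "$CONF"; then
            echo "Module $mod: OK"
        else
            echo "Module $mod: not used"
        fi
    done
}

# Invoke check_modules on a machine where httpd is installed.
```

Because the pattern matches directive names anywhere on a line, a name such as User also matches inside User-Agent; for a quick overview that is usually good enough.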

So authn_core is not used. That is correct; as described above, it is loaded for future use. The other modules appear to be necessary.

That's it for this tutorial. We now have a suitable web server that is easy to work with. We will build on it in the following tutorials.

License / copy / reuse


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Changelog

  • October 28, 2019: Changes in the German text made in accordance with the English text
  • February 25, 2017: New wording of a sentence in the introduction, AllowOverride completely removed
  • August 25, 2016: Line breaks adjusted
  • January 16, 2016: License notice added
  • September 14, 2015: Spelling (Benjamin Affolter)
  • September 24, 2015: Update to Apache 2.4, extension by --trace-ascii, httpd -L
  • September 21, 2015: html -> markdown
  • July 24, 2013: Misspellings
  • July 9, 2013: Step 3: "cd /apache" added in front; output of the ab call adapted
  • July 2, 2013: Corrected spelling errors in configuration (DefaultType vs. ContentType), updated verbose curl call
  • May 31, 2013: Spelling errors corrected
  • April 9, 2013: Mime module added, PidFile defined
  • February 27, 2011: Corrected spelling errors
  • February 25, 2011: Spelling errors corrected
  • January 20, 2011: Revised
  • November 6, 2010: Revised
  • November 4th and 5th, 2010: Created