
.. index::
   double: HOSTLIST; subsystem

.. _HOSTLIST-Subsystem:

HOSTLIST — HELLO bootstrapping and gossip
=========================================

Peers in the GNUnet overlay network need address information so that
they can connect with other peers. GNUnet uses so-called HELLO messages
to store and exchange peer addresses. GNUnet provides several methods
for peers to obtain this information:

-  out-of-band exchange of HELLO messages (manually, for example using
   ``gnunet-peerinfo``)

-  HELLO messages shipped with GNUnet (automatic with distribution)

-  UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)

-  topology gossiping (learning from other peers we are already
   connected to), and

-  the HOSTLIST daemon covered in this section, which is particularly
   relevant for bootstrapping new peers.

New peers have no existing connections (and thus cannot learn from
gossip among peers), may not have other peers in their LAN and might be
started with an outdated set of HELLO messages from the distribution. In
this case, getting new peers to connect to the network requires either
manual effort or the use of a HOSTLIST to obtain HELLOs.

.. _HELLOs:

HELLOs
------

The basic information peers require to connect to other peers is
contained in so-called HELLO messages, which you can think of as
business cards. Besides the identity of the peer (based on its
cryptographic public key), a HELLO message may contain address
information that specifies ways to contact the peer. By obtaining HELLO
messages, a peer can learn how to contact other peers.

.. _Overview-for-the-HOSTLIST-subsystem:

Overview for the HOSTLIST subsystem
-----------------------------------

The HOSTLIST subsystem provides a way to distribute and obtain contact
information to connect to other peers using a simple HTTP GET request.
Its implementation is split into three parts: the main file for the
daemon itself (``gnunet-daemon-hostlist.c``), the HTTP client used to
download peer information (``hostlist-client.c``) and the server
component used to provide this information to other peers
(``hostlist-server.c``). The server is a small HTTP server (based on GNU
libmicrohttpd) which provides a list of HELLOs known to the local peer
for download. The client component is an HTTP client (based on libcurl)
which can download hostlists from one or more websites. The hostlist
format is a binary blob containing a sequence of HELLO messages. Note
that any HTTP server can, in principle, serve a hostlist; the built-in
hostlist server simply makes it convenient to offer this service.

.. _Features:

Features
^^^^^^^^

The HOSTLIST daemon can:

-  provide HELLO messages with validated addresses obtained from
   PEERINFO for download by other peers

-  download HELLO messages and forward these messages to the TRANSPORT
   subsystem for validation

-  advertise the URL of this peer's hostlist to other peers via
   gossip

-  automatically learn about hostlist servers from the gossip of other
   peers

.. _HOSTLIST-_002d-Limitations:

HOSTLIST - Limitations
^^^^^^^^^^^^^^^^^^^^^^

The HOSTLIST daemon does not:

-  verify the cryptographic information in the HELLO messages

-  verify the address information in the HELLO messages

.. _Interacting-with-the-HOSTLIST-daemon:

Interacting with the HOSTLIST daemon
------------------------------------

The HOSTLIST subsystem is currently implemented as a daemon, so there is
no need for the user to interact with it; consequently, there is no
command line tool and no API to communicate with the daemon. In the
future, we can envision changing this to allow users to manually trigger
the download of a hostlist.

Since there is no command line interface to interact with HOSTLIST, the
only way to interact with it is to use STATISTICS to obtain or modify
information about the status of HOSTLIST:

::

   $ gnunet-statistics -s hostlist

In particular, HOSTLIST includes a **persistent** value in statistics
that specifies when the hostlist server might be queried next. As this
value increases exponentially during runtime, developers may want to
reset or manually adjust it. Note that HOSTLIST (but not STATISTICS)
needs to be shut down if changes to this value are to have any effect on
the daemon (as HOSTLIST does not monitor STATISTICS for changes to the
download frequency).

.. _Hostlist-security-address-validation:

Hostlist security: address validation
-------------------------------------

Since information obtained from other parties cannot be trusted without
validation, we have to distinguish between *validated* and *not
validated* addresses. Before using (and so trusting) information from
other parties, this information has to be double-checked (validated).
Address validation is not done by HOSTLIST but by the TRANSPORT service.

The HOSTLIST component is functionally located between the PEERINFO and
the TRANSPORT subsystems. When acting as a server, the daemon obtains
*validated* peer information (HELLO messages) from the PEERINFO service
and provides it to other peers. When acting as a client, it contacts the
HOSTLIST servers specified in the configuration, downloads the
(unvalidated) list of HELLO messages and forwards this information to
the TRANSPORT service to validate the addresses.

.. _The-HOSTLIST-daemon:

:index:`The HOSTLIST daemon <double: daemon; HOSTLIST>`
The HOSTLIST daemon
-------------------

The hostlist daemon is the main component of the HOSTLIST subsystem. It
is started by the ARM service and (if configured) starts the HOSTLIST
client and server components.

If the daemon provides a hostlist itself, it can advertise its own
hostlist to other peers. To do so, it sends a
``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT`` message to other peers
when they connect to this peer on the CORE level. This hostlist
advertisement message contains the URL to access the HOSTLIST HTTP
server of the sender. The daemon may also subscribe to this type of
message from the CORE service and then forward such messages to the
HOSTLIST client. The client then uses all available URLs to download
peer information when necessary.

When starting, the HOSTLIST daemon first connects to the CORE subsystem
and, if hostlist learning is enabled, registers a CORE handler to
receive this kind of message. Next, it starts (if configured) the client
and server. It passes them pointers in which the client and server store
their CORE connect, disconnect and receive handlers, so that the daemon
can notify them about CORE events.

To clean up on shutdown, the daemon runs a cleanup task that shuts down
all subsystems and disconnects from CORE.

.. _The-HOSTLIST-server:

:index:`The HOSTLIST server <single: HOSTLIST; server>`
The HOSTLIST server
-------------------

The server provides a way for other peers to obtain HELLOs. It is
essentially a small web server that other peers can connect to in order
to download a list of HELLOs using standard HTTP; it may also advertise
the URL of the hostlist to other peers connecting on CORE level.

.. _The-HTTP-Server:

The HTTP Server
^^^^^^^^^^^^^^^

During startup, the server starts a web server listening on the port
specified with the ``HTTPPORT`` value (default 8080). In addition, it
connects to the PEERINFO service to obtain peer information. The
HOSTLIST server uses the ``GNUNET_PEERINFO_iterate`` function to request
HELLO information for all peers and adds their information to a new
hostlist if they are suitable (expired addresses and HELLOs without
addresses are both not suitable) and the maximum size for a hostlist is
not exceeded (``MAX_BYTES_PER_HOSTLISTS`` = 500000). When PEERINFO
finishes (with a last NULL callback), the server destroys the previous
hostlist response available for download on the web server and replaces
it with the updated hostlist. The hostlist format is a plain sequence of
HELLO messages (as obtained from PEERINFO) without any special
tokenization. Since each HELLO message contains a size field, the
response can easily be split into separate HELLO messages by the client.
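
The framing described above can be illustrated with a short Python
sketch. It assumes the usual GNUnet message header layout (a 16-bit size
followed by a 16-bit type, both in network byte order, where the size
includes the header itself). The helper names and the message type used
in examples are hypothetical and not taken from the GNUnet sources.
Skipping a message that no longer fits is likewise an assumption about
how the size limit is applied.

.. code-block:: python

   import struct

   HEADER = struct.Struct("!HH")        # size, type (network byte order)
   MAX_BYTES_PER_HOSTLISTS = 500000     # limit quoted in the text


   def make_message(msg_type, payload):
       """Frame a payload; the size field covers header plus payload."""
       return HEADER.pack(HEADER.size + len(payload), msg_type) + payload


   def build_hostlist(messages, max_bytes=MAX_BYTES_PER_HOSTLISTS):
       """Concatenate framed messages, skipping any that would not fit."""
       blob = bytearray()
       for message in messages:
           if len(blob) + len(message) > max_bytes:
               continue  # assumption: skip instead of aborting
           blob += message
       return bytes(blob)


   def split_hostlist(blob):
       """Split a complete blob into messages using only the size field."""
       messages = []
       offset = 0
       while offset < len(blob):
           if len(blob) - offset < HEADER.size:
               raise ValueError("truncated header")
           size, _ = HEADER.unpack_from(blob, offset)
           if size < HEADER.size or offset + size > len(blob):
               raise ValueError("invalid size field")
           messages.append(blob[offset:offset + size])
           offset += size
       return messages

Because no separator is needed between messages, a hostlist built this
way can be stored as a static file and served by any HTTP server.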

A HOSTLIST client connecting to the HOSTLIST server will receive the
hostlist as an HTTP response with the status code ``HTTP 200 OK``, after
which the server closes the connection. The connection will be closed
immediately if no hostlist is available.

.. _Advertising-the-URL:

Advertising the URL
^^^^^^^^^^^^^^^^^^^

The server also advertises the URL to download the hostlist to other
peers if hostlist advertisement is enabled. When a new peer connects and
has hostlist learning enabled, the server sends a
``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT`` message to this peer
using the CORE service.

.. _The-HOSTLIST-client:

The HOSTLIST client
-------------------

The client provides the functionality to download the list of HELLOs
from a set of URLs. It performs a standard HTTP request to the URLs that
were configured or learned from advertisement messages received from
other peers. When a HELLO is downloaded, the HOSTLIST client forwards
the HELLO to the TRANSPORT service for validation.

The client supports two modes of operation:

-  download of HELLOs (bootstrapping)

-  learning of URLs

.. _Bootstrapping:

Bootstrapping
^^^^^^^^^^^^^

For bootstrapping, the client schedules a task to download the hostlist
from the set of known URLs. Downloads are only performed if the number
of current connections is smaller than a minimum number of connections
(at the moment 4). The interval between downloads increases
exponentially; however, the exponential growth is limited once the
interval becomes longer than an hour. At that point, the interval is
capped at (number of connections \* 1h).
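
The scheduling rule can be sketched as follows. This is an illustration
of the rule as stated above, not the client's actual code. The function
names, the one-second starting delay and the guard that keeps the cap
from becoming zero when there are no connections are all assumptions.

.. code-block:: python

   MIN_CONNECTIONS = 4      # threshold quoted in the text
   ONE_HOUR = 3600.0        # seconds


   def should_download(connections):
       """Download only while we have fewer connections than the minimum."""
       return connections < MIN_CONNECTIONS


   def next_delay(current, connections):
       """Double the delay, capped at (number of connections * 1h)."""
       # Assumption: the very first delay is one second.
       doubled = 1.0 if current <= 0 else current * 2
       # Assumption: treat zero connections like one, so the cap is never 0.
       cap = ONE_HOUR * max(1, connections)
       return min(doubled, cap)

With these assumptions, a peer without connections retries after 1s, 2s,
4s and so on until the delay reaches one hour, while a peer with two
connections may back off up to two hours.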

Once the decision has been taken to download HELLOs, the daemon chooses
a random URL from the list of known URLs. URLs can be set in the
configuration or be learned from advertisement messages. The client uses
an HTTP client library (libcurl) to initiate the download using the
libcurl multi interface. Libcurl passes the data to the
``callback_download`` function, which stores the data in a buffer if
space is available and the maximum size for a hostlist download is not
exceeded (``MAX_BYTES_PER_HOSTLISTS`` = 500000). When a full HELLO has
been downloaded, the HOSTLIST client offers this HELLO message to the
TRANSPORT service for validation. When the download has finished or
failed, statistical information about the quality of this URL is
updated.
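
Since data arrives in arbitrary chunks, a message may be split across
several callbacks. The following Python sketch shows one way to buffer
chunks and hand over each message as soon as it is complete. It is not a
transcription of ``callback_download``: the class and the ``offer``
callback are hypothetical, and the header layout (16-bit size, 16-bit
type, network byte order) is the usual GNUnet message header assumed for
illustration.

.. code-block:: python

   import struct

   HEADER = struct.Struct("!HH")        # size, type (network byte order)
   MAX_BYTES_PER_HOSTLISTS = 500000     # limit quoted in the text


   class DownloadBuffer:
       """Collect chunks and emit each message once it is complete."""

       def __init__(self, offer, max_bytes=MAX_BYTES_PER_HOSTLISTS):
           self.offer = offer           # called once per complete message
           self.max_bytes = max_bytes
           self.received = 0
           self.pending = bytearray()

       def feed(self, chunk):
           """Consume a chunk; return False if the download should stop."""
           self.received += len(chunk)
           if self.received > self.max_bytes:
               return False
           self.pending += chunk
           while len(self.pending) >= HEADER.size:
               size, _ = HEADER.unpack_from(self.pending)
               if size < HEADER.size:
                   return False         # malformed size field
               if len(self.pending) < size:
                   break                # wait for the rest of the message
               self.offer(bytes(self.pending[:size]))
               del self.pending[:size]
           return True

In the real client, the role of ``offer`` is played by handing the HELLO
to the TRANSPORT service for validation.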

.. _Learning:

:index:`Learning <single: HOSTLIST; learning>`
Learning
^^^^^^^^

The client also manages hostlist advertisements from other peers. The
HOSTLIST daemon forwards ``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT``
messages to the client subsystem, which extracts the URL from the
message. Next, a test of the newly obtained URL is performed by
triggering a download from the new URL. If the URL works correctly, it
is added to the list of working URLs.

The size of the list of URLs is restricted, so if an additional server
is added and the list is full, the URL with the worst quality ranking
(determined, for example, by successful downloads and the number of
HELLOs obtained) is discarded. During shutdown, the list of URLs is
saved to a file for persistence; it is loaded again on startup. URLs
from the configuration file are never discarded.
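
The replacement policy can be sketched like this. The class, its method
names, the example URLs and in particular the scoring formula are
hypothetical; the text only says that quality depends on factors such as
successful downloads and the number of HELLOs. Persistence to a file is
left out.

.. code-block:: python

   class HostlistURLs:
       """Bounded list of learned URLs plus protected configured URLs."""

       def __init__(self, configured, max_learned):
           self.configured = list(configured)   # never discarded
           self.max_learned = max_learned
           self.learned = {}                    # url -> quality score

       def add_learned(self, url, quality=0):
           """Add a tested URL, evicting the worst one if the list is full."""
           if self.max_learned <= 0:
               return
           if url in self.configured or url in self.learned:
               return
           if len(self.learned) >= self.max_learned:
               worst = min(self.learned, key=self.learned.get)
               del self.learned[worst]
           self.learned[url] = quality

       def record_download(self, url, success, hellos):
           """Update the quality score (hypothetical formula)."""
           if url in self.learned:
               self.learned[url] += (1 if success else 0) + hellos

       def all_urls(self):
           """Return configured URLs first, then learned ones."""
           return self.configured + list(self.learned)

Keeping configured URLs in a separate container is one simple way to
guarantee that eviction never touches them.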

.. _Usage:

Usage
-----

To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES
section for the ARM services. This is done in the default configuration.

For more information on how to configure the HOSTLIST subsystem, see the
sections "Configuring the hostlist to bootstrap" and "Configuring your
peer to provide a hostlist" in the installation handbook.
