.. _tutorial:

==========================
Tutorial
==========================

---------------------
Installing pycounters
---------------------

PyCounters is pure Python. All you need to do is run easy_install (or pip): ::

    easy_install pycounters


Of course, you can always check out the code from Bitbucket at https://bitbucket.org/bleskes/pycounters

---------------------
Introduction
---------------------

PyCounters is a library to help you collect interesting metrics from production code. As a case study for
this tutorial, we will use a simple Python-based server (taken from the `python docs
<http://docs.python.org/library/socketserver.html#socketserver-tcpserver-example>`_): ::

    import SocketServer

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        """
        The RequestHandler class for our server.

        It is instantiated once per connection to the server, and must
        override the handle() method to implement communication to the
        client.
        """

        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data
            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999

        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()


----------------------
Step 1 - Adding Events
----------------------

For this basic server, we will add events to report the following metrics:

* Number of requests per second
* Average time for handling a request

Both of these metrics are tied to the handle() method of the MyTCPHandler class in the example.
The number of requests per second the server serves is exactly the number of times handle() is called per second.
The average time for handling a request is exactly the average execution time of handle().

Both of these metrics are measured by decorating handle() with the :ref:`shortcut <shortcuts>` decorators
:meth:`frequency <pycounters.shortcuts.frequency>` and :meth:`time <pycounters.shortcuts.time>`: ::

    import SocketServer
    from pycounters import shortcuts

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        ...

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data
            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())


.. note::
    * Every decorator is given a name ("requests_time" and "requests_frequency"). These names will come back
      in the report generated by PyCounters. More on this in the next section.
    * The shortcut decorators actually do two things: they report events and add counters for them. For now,
      that is fine, but you might want to separate the two. More on this later in the tutorial.
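
To make these two metrics concrete, the following stand-alone sketch computes them by hand using only the standard
library. It is not how PyCounters is implemented; ``timed_and_counted``, ``average_time`` and ``stats`` are
hypothetical names used purely for illustration: ::

    import functools
    import time

    # Hypothetical stand-in for a counter store (PyCounters keeps its own registry).
    stats = {}

    def timed_and_counted(name):
        """Record the number of calls and the total run time of a function under `name`."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                start = time.time()
                try:
                    return func(*args, **kwargs)
                finally:
                    # Update the statistics even if the function raised.
                    entry = stats.setdefault(name, {"calls": 0, "total_time": 0.0})
                    entry["calls"] += 1
                    entry["total_time"] += time.time() - start
            return wrapper
        return decorator

    def average_time(name):
        """Average execution time, in seconds, of all calls recorded under `name`."""
        entry = stats[name]
        return entry["total_time"] / entry["calls"]

    @timed_and_counted("requests")
    def handle(data):
        return data.upper()

    handle("hello")
    handle("world")

After the two calls, ``stats["requests"]["calls"]`` is 2 and ``average_time("requests")`` gives the mean duration.
Dividing the call count by the elapsed wall-clock time would give a requests-per-second figure.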

------------------------
Step 2 - Reporting
------------------------

Now that the metrics are being collected, they need to be reported. This is the job of the :ref:`reporters <reporters>`. In this example,
we'll save a report every 5 minutes to a JSON file at /tmp/server.counters.json (check out the :ref:`reporters` section for other options).
To do so, create an instance of :class:`JSONFileReporter <pycounters.reporters.JSONFileReporter>` when the server starts: ::

    import SocketServer
    from pycounters import shortcuts, reporters, start_auto_reporting, register_reporter

    ....

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999
        JSONFile = "/tmp/server.counters.json"

        reporter = reporters.JSONFileReporter(output_file=JSONFile)
        register_reporter(reporter)

        start_auto_reporting()


        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()

.. note::
    To make PyCounters periodically output a report, you must call start_auto_reporting().

By default, auto reports are generated every 5 minutes (change that with the seconds parameter of start_auto_reporting()). After five minutes
the reporter will save its report. Here is an example of the contents of /tmp/server.counters.json: ::

    {"requests_time": 0.00039249658584594727, "requests_frequency": 0.014266581369872909}
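
Because the report is plain JSON, it can be inspected with the standard library. The sketch below assumes the report
is a flat JSON object mapping counter names to values, as in the example above; ``read_report`` and ``format_report``
are hypothetical helpers, not part of PyCounters: ::

    import json

    def read_report(path):
        """Load a JSON report file and return it as a dict of counter name -> value."""
        with open(path) as report_file:
            return json.load(report_file)

    def format_report(report):
        """Return one 'name = value' line per counter, sorted by counter name."""
        return ["%s = %s" % (name, report[name]) for name in sorted(report)]

Once the server has written its first report, ``format_report(read_report("/tmp/server.counters.json"))`` returns
one line per counter.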



----------------------------------------------------------
Step 3 - Counters and reporting events without a decorator
----------------------------------------------------------

Average request time and request frequency were both nicely measured by decorating MyTCPHandler::handle(). Some metrics
do not fit as nicely into the decorator model.

The server in our example receives a string from a client and returns it upper-cased. Say we want to measure the
average number of characters the server processes per request. To achieve this we can use another shortcut function,
:meth:`value <pycounters.shortcuts.value>`: ::

    import SocketServer
    from pycounters import shortcuts

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        ...

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data

            # measure the average length of data
            shortcuts.value("requests_data_len",len(self.data))

            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())



Until now, the shortcut decorators and functions were perfect for what we wanted to do. Naturally, this is not always
the case. Before going on, it is handy to explain more about these shortcuts and how PyCounters works (see
:ref:`moving_parts` for more about this).

PyCounters is built of three main building blocks:

* *Events* - to report values and occurrences in your code (in the example: incoming requests, the time it took to
  process them and the number of bytes processed).
* *Counters* - to capture events and analyse them (in the example: measuring requests per second, averaging request
  processing time and averaging the number of bytes processed per request).
* *Reporters* - to periodically generate a report of all active counters.

PyCounters' shortcuts both report events and create a counter to analyse them. Every shortcut has a default counter
type but you can override it (see :ref:`shortcuts`). For example, say we wanted to measure the *total* number of bytes
the server has processed rather than the average. To achieve this, the "requests_data_len" counter needs to be changed
to :class:`TotalCounter <pycounters.counters.TotalCounter>`. The easiest way to achieve this is to add a parameter
to the shortcut: ``shortcuts.value("requests_data_len", len(self.data), auto_add_counter=TotalCounter)`` (don't forget to
change your imports too). However, we will take a different route.

PyCounters' event reporting is very lightweight. It does practically nothing if no counter is defined to capture those
events. Because of this, it is a good idea to report all important events throughout the code and choose later exactly
what you want analyzed. To do this we must separate event reporting from the definition of counters.

.. Note::
    When you create a counter, it will by default listen to one event, *named exactly as the counter's name*.
    However, if the events parameter is passed to a counter at initialization, it will listen *only* to the specified
    events.

.. Note::
  This approach also means you can analyze things differently on a single thread by installing thread-specific
  counters, for example to trace a specific request more heavily because of a debug flag. Thread-specific counters are
  not currently available but will be in the future.
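
The relationship between events and counters described above can be modelled in a few lines. The following is a toy
model under simplifying assumptions (single-threaded, value events only), not PyCounters' actual code; every name in
it (``ToyCounter``, ``ToyTotal``, ``ToyAverage``, ``toy_register``, ``toy_report_value``, ``toy_report``) is
hypothetical: ::

    class ToyCounter(object):
        """Listens to `events`, or by default to the single event named like the counter."""
        def __init__(self, name, events=None):
            self.name = name
            self.events = [name] if events is None else list(events)

    class ToyTotal(ToyCounter):
        """Sums every value it receives."""
        def __init__(self, name, events=None):
            super(ToyTotal, self).__init__(name, events)
            self.total = 0

        def add(self, value):
            self.total += value

        def get_value(self):
            return self.total

    class ToyAverage(ToyCounter):
        """Averages every value it receives."""
        def __init__(self, name, events=None):
            super(ToyAverage, self).__init__(name, events)
            self.total = 0
            self.count = 0

        def add(self, value):
            self.total += value
            self.count += 1

        def get_value(self):
            return self.total / float(self.count) if self.count else None

    registered = []

    def toy_register(counter):
        registered.append(counter)

    def toy_report_value(event_name, value):
        """Forward the value to interested counters; with none registered this does no work."""
        for counter in registered:
            if event_name in counter.events:
                counter.add(value)

    def toy_report():
        """Collect the current value of every registered counter."""
        return dict((counter.name, counter.get_value()) for counter in registered)

    toy_report_value("requests_data_len", 5)      # no counter registered yet: ignored
    toy_register(ToyTotal("requests_data_len"))   # listens to the event with its own name
    toy_register(ToyAverage("avg_len", events=["requests_data_len"]))
    toy_report_value("requests_data_len", 4)
    toy_report_value("requests_data_len", 6)

Here the first reported value is silently dropped because nothing listens yet, while the later two values reach both
counters, one through its default event name and one through its explicit ``events`` list.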

Reporting an event without defining a counter is done by using one of the functions described under
:ref:`event_reporting` . Since we want to report a value, we will use :meth:`pycounters.report_value`: ::

    import SocketServer
    from pycounters import shortcuts,reporters,report_value

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        ...

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data

            # report the length of the data; counters decide how to analyse it
            report_value("requests_data_len", len(self.data))

            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())


To add the :class:`TotalCounter <pycounters.counters.TotalCounter>` counter, we change the initialization part of the
code: ::

    import SocketServer
    from pycounters import shortcuts, reporters, report_value,counters, register_counter, start_auto_reporting, register_reporter

    ....

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999
        JSONFile = "/tmp/server.counters.json"

        data_len_counter = counters.TotalCounter("requests_data_len") # create the counter
        register_counter(data_len_counter) # register it, so it will start processing events

        reporter = reporters.JSONFileReporter(output_file=JSONFile)
        register_reporter(reporter)

        start_auto_reporting()


        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()


---------------------------
Step 4 - A complete example
---------------------------

Here is the complete code with all the changes so far (also available at the PyCounters
`repository <https://bitbucket.org/bleskes/pycounters>`_ ): ::

    import SocketServer
    from pycounters import shortcuts, reporters, register_counter, counters, report_value, register_reporter, start_auto_reporting

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        """
        The RequestHandler class for our server.

        It is instantiated once per connection to the server, and must
        override the handle() method to implement communication to the
        client.
        """

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data

            # report the length of the data; counters decide how to analyse it
            report_value("requests_data_len", len(self.data))

            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999
        JSONFile = "/tmp/server.counters.json"

        data_len_counter = counters.TotalCounter("requests_data_len") # create the counter
        register_counter(data_len_counter) # register it, so it will start processing events

        reporter = reporters.JSONFileReporter(output_file=JSONFile)
        register_reporter(reporter)

        start_auto_reporting()


        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()

---------------------------------------
Step 5 - More about Events and Counters
---------------------------------------

In the above example, the MyTCPHandler::handle method is decorated with two shortcut decorators:
:meth:`frequency <pycounters.shortcuts.frequency>` and :meth:`time <pycounters.shortcuts.time>`. This is the easiest way
to set up PyCounters to measure things, but it has some downsides. First, every shortcut decorator fires its own events.
That means that for every execution of the handle method, four events are sent, which is inefficient. Second, and more
importantly, it means that counter definitions are spread around the code.

In bigger projects it is better to separate event throwing from counting. For example, we can decorate the handle function with
:meth:`report_start_end <pycounters.report_start_end>`: ::

    import pycounters

    @pycounters.report_start_end("request")
    def handle(self):
        # self.request is the TCP socket connected to the client
        ...


Then define two counters to analyze different statistics about this function: ::

    avg_req_time = counters.AverageTimeCounter("requests_time",events=["request"])
    register_counter(avg_req_time)

    req_per_sec = counters.FrequencyCounter("requests_frequency",events=["request"])
    register_counter(req_per_sec)

.. note:: Multiple counters with different names can be set up to analyze the same event using the events argument in their constructor.

Doing things this way has a couple of advantages:

    * It is conceptually cleaner - you report what happened and measure multiple aspects of it
    * It is more flexible - you can easily analyse more things about your code by simply adding counters.
    * You can decide at runtime what to measure (by changing registered counters)
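
The idea of reporting one start/end event pair and deriving several statistics from it can be sketched without
PyCounters. The code below is only an illustration under simplifying assumptions (single-threaded, no nested calls);
``toy_report_start_end``, ``emit``, ``listeners``, ``ToyCallCount`` and ``ToyAverageTime`` are hypothetical names and
do not reflect the library's internals: ::

    import functools
    import time

    # Hypothetical registry of callables taking (event_name, phase, timestamp).
    listeners = []

    def emit(event_name, phase):
        now = time.time()
        for listener in list(listeners):
            listener(event_name, phase, now)

    def toy_report_start_end(event_name):
        """Emit a 'start' event before the call and an 'end' event after it."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                emit(event_name, "start")
                try:
                    return func(*args, **kwargs)
                finally:
                    emit(event_name, "end")
            return wrapper
        return decorator

    class ToyCallCount(object):
        """Counts completed calls by counting 'end' events."""
        def __init__(self, event_name):
            self.event_name = event_name
            self.calls = 0

        def __call__(self, event_name, phase, timestamp):
            if event_name == self.event_name and phase == "end":
                self.calls += 1

    class ToyAverageTime(object):
        """Averages the time between matching 'start' and 'end' events."""
        def __init__(self, event_name):
            self.event_name = event_name
            self.started = None
            self.total = 0.0
            self.calls = 0

        def __call__(self, event_name, phase, timestamp):
            if event_name != self.event_name:
                return
            if phase == "start":
                self.started = timestamp
            elif self.started is not None:
                self.total += timestamp - self.started
                self.calls += 1
                self.started = None

        def average(self):
            return self.total / self.calls if self.calls else None

    @toy_report_start_end("request")
    def handle(data):
        return data.upper()

    handle("ignored")                  # nothing is listening yet
    call_count = ToyCallCount("request")
    listeners.append(call_count)
    handle("first")
    average_time = ToyAverageTime("request")
    listeners.append(average_time)     # a second statistic, added at runtime
    handle("second")

The decorated function never changes; what gets measured is decided entirely by which listeners are registered at the
time of the call.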


-----------------------------------------------------
Step 6 - Another example of using Events and Counters
-----------------------------------------------------

In this example we will create a few counters listening to the same event. Let's say we want the maximum,
minimum, average and sum of the request data length over a 15-minute window. To achieve this, we need to
create four counters, all of them listening to the 'requests_data_len' event: ::

    import SocketServer
    from pycounters import shortcuts, reporters, register_counter, counters, report_value, register_reporter, start_auto_reporting

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        """
        The RequestHandler class for our server.

        It is instantiated once per connection to the server, and must
        override the handle() method to implement communication to the
        client.
        """

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data

            # report the length of the data; counters decide how to analyse it
            report_value("requests_data_len", len(self.data))

            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999
        JSONFile = "/tmp/server.counters.json"

        # create the average window counter (900 seconds = 15 minutes)
        data_len_avg_counter = counters.AverageWindowCounter(
            "requests_data_len_avg", events=["requests_data_len"], window_size=900)
        register_counter(data_len_avg_counter) # register it, so it will start processing events

        # create the window sum counter
        data_len_total_counter = counters.WindowCounter(
            "requests_data_len_total", events=["requests_data_len"], window_size=900)
        register_counter(data_len_total_counter)

        # create the max window counter
        data_len_max_counter = counters.MaxWindowCounter(
            "requests_data_len_max", events=["requests_data_len"], window_size=900)
        register_counter(data_len_max_counter)

        # create the min window counter
        data_len_min_counter = counters.MinWindowCounter(
            "requests_data_len_min", events=["requests_data_len"], window_size=900)
        register_counter(data_len_min_counter)

        reporter = reporters.JSONFileReporter(output_file=JSONFile)
        register_reporter(reporter)

        start_auto_reporting()


        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()


You can change the size of the window by passing a different window_size parameter when creating a counter.
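
For intuition, a time-window statistic can be computed by keeping timestamped samples and discarding those older
than the window. The sketch below is one simple way to do that and is not taken from PyCounters; ``ToyWindow`` and
its methods are hypothetical, and the explicit ``now`` argument exists only to make the example deterministic: ::

    import collections
    import time

    class ToyWindow(object):
        """Keeps (timestamp, value) samples that are at most `window_size` seconds old."""
        def __init__(self, window_size=300.0):
            self.window_size = window_size
            self.samples = collections.deque()

        def _trim(self, now):
            # Samples are appended in time order, so expired ones are always on the left.
            while self.samples and self.samples[0][0] < now - self.window_size:
                self.samples.popleft()

        def add(self, value, now=None):
            now = time.time() if now is None else now
            self.samples.append((now, value))
            self._trim(now)

        def summary(self, now=None):
            """Return min, max, sum and average of the samples still inside the window."""
            now = time.time() if now is None else now
            self._trim(now)
            values = [value for _, value in self.samples]
            if not values:
                return None
            return {
                "min": min(values),
                "max": max(values),
                "sum": sum(values),
                "avg": sum(values) / float(len(values)),
            }

    window = ToyWindow(window_size=900)
    window.add(10, now=0)
    window.add(20, now=600)
    window.add(30, now=1000)   # the sample from t=0 is now older than 900 s and is dropped

With these samples, ``window.summary(now=1000)`` only considers the values 20 and 30, and a summary taken long after
the last sample returns ``None`` because the window is empty.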

------------------------
Step 7 - Utilities
------------------------

In the example so far, we've written the collected metrics to a JSON file. Using that JSON file, we can easily build
simple tools to report the metrics further. The :ref:`pycounters_utils` package contains a set of utilities to help
build such tools.

At the moment, PyCounters comes with a utility to help write `munin <http://munin-monitoring.org/>`_ plugins.
Here is an example of a munin plugin that takes the JSON report produced in this tutorial and presents it in a way
munin understands: ::

    #!/usr/bin/python

    from pycounters.utils.munin import Plugin

    config = [
        {
            "id" : "requests_per_sec",
            "global" : {
                # graph global options: http://munin-monitoring.org/wiki/protocol-config
                "title" : "Request Frequency",
                "category" : "PyCounters example"
            },
            "data" : [
                {
                    "counter" : "requests_frequency",
                    "label"   : "requests per second",
                    "draw"    : "LINE2",
                }
            ]
        },
        {
            "id" : "requests_time",
            "global" : {
                "title" : "Request Average Handling Time",
                "category" : "PyCounters example"
            },
            "data" : [
                {
                    "counter" : "requests_time",
                    "label"   : "Average time per request",
                    "draw"    : "LINE2",
                }
            ]
        },
        {
            "id" : "requests_total_data",
            "global" : {
                "title" : "Total data processed",
                "category" : "PyCounters example"
            },
            "data" : [
                {
                    "counter" : "requests_data_len",
                    "label"   : "total bytes",
                    "draw"    : "LINE2",
                }
            ]
        }

    ]

    p = Plugin("/tmp/server.counters.json",config) # initialize the plugin

    p.process_cmd() # process munin command and output requested data or config


Try it out (after the server has run for more than 5 minutes and a report has been written to the JSON file) by
saving the plugin as ``munin_plugin`` and running ``python munin_plugin config`` and ``python munin_plugin``.
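
For readers unfamiliar with munin, a plugin prints graph options and per-field attributes when called with
``config``, and ``<field>.value <number>`` lines when called without arguments. The sketch below shows roughly how one
entry of the configuration above could translate into such lines. It is an assumption-laden illustration: the mapping
of the ``global`` keys to ``graph_`` options and the helper names ``munin_config_lines`` and ``munin_value_lines``
are hypothetical and not taken from ``pycounters.utils.munin``: ::

    def munin_config_lines(graph):
        """Lines for the 'config' command: graph-wide options, then field attributes."""
        options = graph["global"]
        lines = ["graph_%s %s" % (key, options[key]) for key in sorted(options)]
        for field in graph["data"]:
            name = field["counter"]
            for attribute in sorted(field):
                if attribute != "counter":
                    lines.append("%s.%s %s" % (name, attribute, field[attribute]))
        return lines

    def munin_value_lines(graph, report):
        """Lines for a plain run: the current value of every field in the graph."""
        return ["%s.value %s" % (field["counter"], report[field["counter"]])
                for field in graph["data"]]

    graph = {
        "id": "requests_per_sec",
        "global": {"title": "Request Frequency", "category": "PyCounters example"},
        "data": [
            {"counter": "requests_frequency", "label": "requests per second", "draw": "LINE2"},
        ],
    }

Given a report such as ``{"requests_frequency": 0.5}``, ``munin_value_lines(graph, report)`` yields the single line
``requests_frequency.value 0.5``.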

-----------------------------
Step 8 - Multiprocess support
-----------------------------

Some applications (like a web server) do not run in a single process. Still, you want to collect global metrics like the
ones discussed earlier in this tutorial.

PyCounters supports aggregating information from multiple running processes. To do so, call
:meth:`pycounters.configure_multi_process_collection` in every process you want to aggregate data from. The parameters
to this method tell PyCounters which port to use for aggregation and, if running on multiple servers, which server
to collect data on.
