Tuesday, February 20, 2018

Error handling within a flask application

Any website should somehow be able to deal with unexpected errors. In this blog post, I will describe how I handle errors within my Flask application, which follows this great tutorial.

In Flask, it is quite easy to register error handlers which re-direct to a custom error page like this
from flask import render_template

@app.errorhandler(404)
def page_not_found(e):
    return render_template('404.html'), 404
However, rather than going through the list of possible errors and creating a route and page for each, one can also create a general error handler which handles all errors. To do that you have to register a general error handler in your app like this
from werkzeug.exceptions import default_exceptions

def init_app(app):
    ''' Function to register error_handler in app '''
    for exception in default_exceptions:
        app.register_error_handler(exception, error_handler)

    app.register_error_handler(Exception, error_handler) 
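Calling the init_app(app) function registers the error_handler() function for every HTTP error code in the default_exceptions dictionary of the werkzeug package, which should cover all HTTP errors except for some exotic ones. The last line additionally registers the handler for the generic Exception class, so that uncaught non-HTTP exceptions end up in the same place. Now we just have to write the error_handler() function.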
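from flask import current_app, render_template, request
from flask_login import current_user  # assuming Flask-Login handles authentication
from markupsafe import Markup
from werkzeug.exceptions import HTTPException

def error_handler(error):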
    ''' Catches all errors in the default_exceptions list '''
    msg = "Request resulted in {}".format(error)
    if current_user.is_authenticated:
        current_app.logger.error(msg, exc_info=error)
    else:
        current_app.logger.warning(msg, exc_info=error)

    if isinstance(error, HTTPException):
        description = error.get_description(request.environ)
        code = error.code
        name = error.name
    else:
        description = ("We encountered an error "
                       "while trying to fulfill your request")
        code = 500
        name = 'Internal Server Error'

    templates_to_try = ['errors/error{}.html'.format(code), 'errors/generic_error.html']
    return render_template(templates_to_try,
                           code=code,
                           name=Markup(name),
                           description=Markup(description),
                           error=error), code
This function first checks whether the error has been caused by an authenticated user, and if so it writes an error message to the logfile. If the user was not authenticated, it only writes a warning message. The reason for this distinction is that robots cause all sorts of 404 errors in my app, which I don't care much about, but if a user causes an exception, I definitely want to know about it.

In the next step, the handler extracts all the information it can get from the error and then renders a template. The render_template function goes through the templates_to_try list until it finds an existing template. This way you can provide custom pages for certain error codes, and if no custom template exists for a code, the handler just renders the general error page, which in my case looks like this
{% extends "master.html" %}

{% block title %}Error{% endblock %}

{% block description %}
<meta name="description" content="Error message.">
{% endblock %}

{% block body %}

    <div id="navbar_wrapper">
        <div id="site_content">
            <div class="container">
                <div class="col-xs-12 col-sm-9 col-md-10 col-lg-10">
                    <br>
                    <h1>{{ code }}:{{ name }}</h1>
                    <p>{{description}}</p>
                    <p>The administrator has been notified. Sorry for the inconvenience!</p>
                    <button class="btn btn-primary" onclick="history.back()">Go Back</button>
                </div>
            </div>
        </div>
    </div>

{% endblock %}
The advantage of such a custom error page is that you can add a back button, to get the users back on your page if you haven't alienated them yet with the error.
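To check that such a catch-all handler behaves as expected, one can exercise it with Flask's test client. The following is only a minimal sketch and not the code from my app: it returns JSON instead of rendering a template so that it runs without any template files, and the names json_error_handler, create_app and the /boom route are made up for illustration.

```python
from flask import Flask, jsonify
from werkzeug.exceptions import HTTPException, default_exceptions


def json_error_handler(error):
    """Hypothetical stand-in for error_handler(): returns JSON, no template."""
    if isinstance(error, HTTPException):
        code, name = error.code, error.name
    else:
        # Any non-HTTP exception is reported as a generic server error
        code, name = 500, "Internal Server Error"
    return jsonify(code=code, name=name), code


def create_app():
    app = Flask(__name__)
    # default_exceptions maps HTTP status codes to exception classes
    for status_code in default_exceptions:
        app.register_error_handler(status_code, json_error_handler)
    app.register_error_handler(Exception, json_error_handler)

    @app.route("/boom")
    def boom():
        # Illustrative route that always fails with a non-HTTP exception
        raise RuntimeError("unexpected failure")

    return app


app = create_app()
client = app.test_client()
missing = client.get("/does-not-exist")  # no such route: handled as 404
broken = client.get("/boom")             # RuntimeError: handled as 500
print(missing.status_code, missing.get_json())
print(broken.status_code, broken.get_json())
```

Both requests end up in the same handler, which is the point of registering it for all codes and for Exception.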

I put the entire example on Github. Let me know if you have any comments or questions.
cheers
Florian


Friday, February 16, 2018

Understanding the super() function in Python

In this blog post, I will explain how the super() function in Python works. super() is a built-in Python function which can be used within a class to gain access to methods of a parent class that have been overridden in the child class.

So let's look at an example. Assume we want to build a dictionary class which has all properties of dict, but additionally writes to a logger. This can be done by defining a class which inherits from the dict class and overrides its methods with new methods which do the same as the old ones, but additionally contain the logging call. The new dict class would look like this
import logging

class LoggingDict(dict):
    def __setitem__(self, key, value):
        logging.info("Setting key %s to %s", key, value)
        super().__setitem__(key, value)
    def __getitem__(self, key):
        logging.info("Access key %s", key)
        # return the value, otherwise d[key] would always give None
        return super().__getitem__(key)
Here we override the __getitem__ and __setitem__ methods with new ones, but we only want to add the logging functionality and otherwise keep the behavior of the original methods. Note that we do not need super() to do this since we could get the same result with
import logging

class LoggingDict(dict):
    def __setitem__(self, key, value):
        logging.info("Setting key %s to %s", key, value)
        dict.__setitem__(self, key, value)
    def __getitem__(self, key):
        logging.info("Access key %s", key)
        # return the value, otherwise d[key] would always give None
        return dict.__getitem__(self, key)
The advantage of super() is that, should you decide that you want to inherit from a different class, you would only need to change the first line of the class definition, while the explicit use of the parent class requires you to go through the entire class and change the parent class name everywhere, which can become quite cumbersome for large classes. So super() makes your code more maintainable.
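Here is a short, self-contained usage sketch of the LoggingDict idea. The logging.basicConfig call and the example key are only there for illustration. Note that __getitem__ has to return the value it gets from the parent class, otherwise every lookup would give None.

```python
import logging

# Show INFO messages on the console (illustrative configuration)
logging.basicConfig(level=logging.INFO)


class LoggingDict(dict):
    """dict that logs every item assignment and item access."""

    def __setitem__(self, key, value):
        logging.info("Setting key %s to %s", key, value)
        super().__setitem__(key, value)

    def __getitem__(self, key):
        logging.info("Access key %s", key)
        # Pass the parent's result on to the caller
        return super().__getitem__(key)


d = LoggingDict()
d["answer"] = 42       # logged by __setitem__
value = d["answer"]    # logged by __getitem__
print(value)
```

Keep in mind that built-in dict methods such as get() or update() do not go through __getitem__ and __setitem__, so they are not logged by this class.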

However, the functionality of super() gets more complicated if you inherit from multiple classes and if the function you refer to is present in more than one of these parent classes. Since the parent class is not explicitly declared, which parent class is addressed by super()?

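The super() function looks at the ordered list of all classes an object inherits from and walks through it, starting after the current class, until it finds the first class which defines the requested method. The ordered list is known as the Method Resolution Order (MRO). You can see the MRO of any class through its __mro__ attribute (the output below is from Python 2; Python 3 prints class instead of type)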
>>> dict.__mro__
(<type 'dict'>, <type 'object'>)
The use of the MRO in the super() function can lead to very different results in the case of multiple inheritance, compared to the explicit declaration of the parent class. Let's go through another example where we define a Bird class which represents the parent class for the Parrot class and the Hummingbird class:
class Bird(object): 
    def __init__(self): 
        print("Bird init") 

class Parrot(Bird):
    def __init__(self):
        print("Parrot init")
        Bird.__init__(self) 

class Hummingbird(Bird):
    def __init__(self): 
        print("Hummingbird init")
        super(Hummingbird, self).__init__()
Here we used the explicit declaration of the parent class in the Parrot class, while in the Hummingbird class we use super(). From this, I will now construct an example where the Parrot and Hummingbird classes will behave differently because of the super() function.

Let's create a FlyingBird class which handles all properties of flying birds. Non-flying birds like ostriches would not inherit from this class:
class FlyingBird(Bird):
    def __init__(self):
        print("FlyingBird init")
        super(FlyingBird, self).__init__()
Now we create child classes of Parrot and Hummingbird, which represent specific species of these animals. Remember, Hummingbird uses super(), Parrot does not:
class Cockatoo(Parrot, FlyingBird):
    def __init__(self):
        print("Cockatoo init")
        super(Cockatoo, self).__init__()

class BeeHummingbird(Hummingbird, FlyingBird):
    def __init__(self):
        print("BeeHummingbird init")
        super(BeeHummingbird, self).__init__()
If we now create an instance of Cockatoo we will find that it does not call the __init__ function of the FlyingBird class
>>> Cockatoo() 
Cockatoo init 
Parrot init 
Bird init 
while creating a BeeHummingbird instance does
>>> BeeHummingbird()
BeeHummingbird init 
Hummingbird init 
FlyingBird init 
Bird init 
To understand the order of calls you might want to look at the MRO
>>> print(BeeHummingbird.__mro__)
(<class 'BeeHummingbird'>, <class 'Hummingbird'>, <class 'FlyingBird'>,
<class 'Bird'>, <type 'object'>)
This is an example where not using super() causes a bug in the class initialization, since all our Cockatoo instances will miss the initialization done by the FlyingBird class. It clearly demonstrates that the use of super() goes beyond just avoiding explicit declarations of a parent class within another class.
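To reproduce both call chains in a single script, the sketch below records the __init__ calls in a list instead of printing them. The calls list and the init_chain() helper are my additions for illustration, and the code uses the Python 3 form of super().

```python
calls = []  # hypothetical helper: records which __init__ methods ran


class Bird:
    def __init__(self):
        calls.append("Bird")


class FlyingBird(Bird):
    def __init__(self):
        calls.append("FlyingBird")
        super().__init__()


class Parrot(Bird):
    def __init__(self):
        calls.append("Parrot")
        Bird.__init__(self)  # explicit parent: jumps straight to Bird


class Hummingbird(Bird):
    def __init__(self):
        calls.append("Hummingbird")
        super().__init__()  # continues with the next class in the MRO


class Cockatoo(Parrot, FlyingBird):
    def __init__(self):
        calls.append("Cockatoo")
        super().__init__()


class BeeHummingbird(Hummingbird, FlyingBird):
    def __init__(self):
        calls.append("BeeHummingbird")
        super().__init__()


def init_chain(cls):
    """Instantiate cls and return the names of the __init__ methods called."""
    calls.clear()
    cls()
    return list(calls)


cockatoo_chain = init_chain(Cockatoo)
bee_chain = init_chain(BeeHummingbird)
cockatoo_mro = [c.__name__ for c in Cockatoo.__mro__]
print(cockatoo_chain)
print(bee_chain)
print(cockatoo_mro)
```

FlyingBird is part of the MRO of Cockatoo, but it never shows up in the call chain, because Parrot calls Bird.__init__ directly instead of handing over to the next class in the MRO.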

Just as a side note before we finish, the syntax for the super() function has changed between Python 2 and Python 3. While Python 2 requires the arguments to be given explicitly, as in the Bird examples above, Python 3 can infer them (the explicit form still works there), which shortens the syntax from (Python 2)
super(class, self).method(args)
to
super().method(args)
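As a quick check that both spellings do the same thing in Python 3, here is a small sketch. The class names and the greet() method are made up for this example.

```python
class Base:
    def greet(self):
        return "hello from Base"


class ExplicitChild(Base):
    def greet(self):
        # Explicit arguments: works in Python 2 (new-style classes) and Python 3
        return super(ExplicitChild, self).greet()


class ImplicitChild(Base):
    def greet(self):
        # No arguments: Python 3 only
        return super().greet()


explicit_result = ExplicitChild().greet()
implicit_result = ImplicitChild().greet()
print(explicit_result == implicit_result)
```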
I hope that was useful. Let me know if you have any comments/questions. Note that there are very useful discussions of this topic on Stack Overflow and in this blog post.
cheers
Florian

Wednesday, February 7, 2018

Python best coding practices: Seven tips for better code

The Python Software Foundation makes detailed recommendations about naming conventions and styles in the PEP 8 style guide. Here is a short summary of that fairly long document, picking seven points I found useful.

The suggested coding and naming conventions sometimes make the code more robust, but often are just for the purpose of making the code more readable. Keep in mind that usually, readability is the most critical aspect of a business level codebase, since any improvement first requires understanding the existing code.

1. A good Python coding practice is to use the floor division operator // whenever you need an integer result from a division. For example, I used to index arrays and lists like this
nums[int(N/2)]
but this can be done much faster with the floor operator
nums[N//2]
You can test that with
import time
time0 = time.time()
for i in range(0, 10000):
    5//2
print(time.time()-time0)
which in my case gave 0.0006051 seconds, while my old approach
import time
time0 = time.time()
for i in range(0, 10000):
    int(5/2)
print(time.time()-time0)
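takes 0.002234 seconds. Note that most of the difference in this particular test comes from the fact that the literal expression 5//2 is pre-calculated by the compiler (constant folding), while the int() call has to be executed in every iteration. With variables instead of literals the floor division operator is usually still faster, since it avoids a float division followed by a function call. It is also exact for large integers, where int(N/2) can lose precision. Keep in mind that the two differ for negative numbers: // rounds down, while int() truncates towards zero. Read more about it here.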
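A fairer comparison uses a variable instead of a literal and the timeit module instead of time.time(). The sketch below does that and also shows the two cases where the results differ. The value of N and the number of repetitions are arbitrary choices for illustration, and the timings will depend on your machine.

```python
import timeit

N = 10**17 + 3  # arbitrary large odd integer, chosen for illustration

# Time both variants with a variable, so nothing can be pre-calculated
floor_time = timeit.timeit("N // 2", globals={"N": N}, number=100_000)
int_time = timeit.timeit("int(N / 2)", globals={"N": N}, number=100_000)
print("N // 2    :", floor_time)
print("int(N / 2):", int_time)

# Large integers: N / 2 goes through a float and loses precision
exact = N // 2
via_float = int(N / 2)
print(exact, via_float)

# Negative numbers: // rounds down, int() truncates towards zero
neg_floor = -7 // 2
neg_trunc = int(-7 / 2)
print(neg_floor, neg_trunc)
```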

2. Next, how can we test whether a variable x is None or not? The following four versions could be used
# bad
if x:

# better
if x != None:

# better
if not x is None:

# best
if x is not None:
The recommended case is the last version. Case 1 is dangerous. While None is mapped to false in a boolean context, many other values also convert to false (like 0, empty strings or empty lists). Therefore, if it is really None which you want to test for, you should write it explicitly. So while avoiding case 1 makes your code more robust, avoiding cases 2 and 3 is mostly a matter of following coding conventions in Python.

The difference between the unequal (!=) and 'is not' operators is subtle, but important. The != operator tests whether the variable has a different value than None, while the 'is not' operator tests whether the two names refer to different objects. For most objects this makes no difference, but a class can define its own comparison methods and then != may give a surprising answer, while 'is not' cannot be overridden. The best practice is therefore to use the 'is not' operator.
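The following sketch illustrates both points: several values other than None are false in a boolean context, and a class with a custom __eq__ can fool the != comparison. The AlwaysEqual class is a made-up example.

```python
class AlwaysEqual:
    """Hypothetical class which claims to be equal to everything."""

    def __eq__(self, other):
        return True


candidates = [None, 0, "", [], "0"]
truthiness = [bool(v) for v in candidates]   # only the string "0" is true
is_none = [v is None for v in candidates]    # only None itself passes
print(truthiness)
print(is_none)

obj = AlwaysEqual()
ne_check = obj != None           # uses __eq__, so obj looks equal to None
identity_check = obj is not None  # compares identities, cannot be overridden
print(ne_check, identity_check)
```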

Following from the example above there are also recommendations for how to test boolean values. If x is True or False, test for it like this
if x:
but not 
if x == True: 
or
if x is True:

3. Another case which makes your code more readable is the use of startswith() and endswith() instead of string slicing. For example use
if foo.startswith('bar'):
instead of
if foo[:3] == 'bar':
Both versions give the same result, and both versions should be robust, but the first version is deemed more readable.
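One practical reason why the slicing version is more fragile is that the slice length has to match the length of the prefix by hand. A small sketch with a made-up filename:

```python
filename = "report_2018.csv"  # illustrative value

prefix_ok = filename.startswith("report")
slice_ok = filename[:6] == "report"      # correct, but 6 must be counted by hand
wrong_slice = filename[:5] == "report"   # off by one: compares "repor"

# startswith() and endswith() also accept a tuple of alternatives
is_data_file = filename.endswith((".csv", ".tsv"))
print(prefix_ok, slice_ok, wrong_slice, is_data_file)
```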

4. To check for types use the isinstance() function
if isinstance(x, int):
rather than the 'is' or == operator
if type(x) == int:

if type(x) is int:
Comparing the result of the type() function directly can easily give you the wrong answer. Take the following example
class MyDict(dict): 
  pass
x = MyDict()
print("type = ", type(x))
print(type(x) == dict)
print(isinstance(x, dict))
which in Python 2 gives the output (Python 3 prints the first line without the tuple parentheses)
('type = ', <class '__main__.MyDict'>)
False
True
Even though the MyDict class behaves just like a dictionary, since it inherited all its properties, a direct comparison of type(x) with dict does not see it that way, while the isinstance() function recognizes that MyDict is a subclass of dict.

5. When using try-except statements always catch exceptions by name
try:
    x = a / b
except ZeroDivisionError:
    # do something, e.g. fall back to a default value
    pass
You can use a general except statement to catch all possible exceptions, for example to ensure that the user gets clean feedback. However, general except statements tend to lead to sloppy code, since you never have to think about which exceptions can actually occur. If you do use a general except statement, at least catch the exception and write it to a log file.
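Both variants could look like the sketch below. The function names safe_ratio() and run_task() are invented for this example, and returning None as a fallback is just one possible choice.

```python
import logging

logger = logging.getLogger(__name__)


def safe_ratio(a, b):
    """Catch the one exception we expect, by name."""
    try:
        return a / b
    except ZeroDivisionError:
        logger.warning("Division by zero for a=%s, returning None", a)
        return None


def run_task(task):
    """General handler: acceptable only because the exception is logged."""
    try:
        return task()
    except Exception:
        # logger.exception() writes the message plus the full traceback
        logger.exception("Task failed unexpectedly")
        return None


ratio_ok = safe_ratio(6, 3)
ratio_failed = safe_ratio(1, 0)
task_ok = run_task(lambda: "done")
task_failed = run_task(lambda: int("not a number"))
print(ratio_ok, ratio_failed, task_ok, task_failed)
```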

6. To break up long lines the preferred method is to use parentheses and to line up the broken lines
if first_condition and second_condition and third_condition and fourth_condition:
should be
if (first_condition and second_condition and
    third_condition and fourth_condition):

7. Finally, you should follow naming conventions:
  • Class and Exception names should use capitalized words without underscore like GroupName() or ExceptionName
  • Function, module, variable and method names should be lowercase, with words separated by underscores like process_text() and global_var_name
  • Constants are written in all capital letters with underscores separating words like MAX_OVERFLOW and TOTAL
A good way to enforce PEP style standards in your project is to use a tool like pylint. Pylint can easily be hooked into a git project, preventing commits which do not follow these standards. In my experience, this is a very good idea when working in a big team with very different experience levels.
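To see the three naming rules side by side, here is a small sketch. All names in it (MAX_RETRIES, DownloadError, FileDownloader, build_url, the example URL) are invented purely to illustrate the conventions.

```python
MAX_RETRIES = 3  # constant: capital letters with underscores


class DownloadError(Exception):
    """Exception name: capitalized words without underscores."""


class FileDownloader:
    """Class name: capitalized words without underscores."""

    def __init__(self, base_url):
        # variable names: lowercase with underscores
        self.base_url = base_url
        self.retry_count = 0

    def can_retry(self):
        # method names: lowercase with underscores
        return self.retry_count < MAX_RETRIES

    def build_url(self, file_name):
        return "{}/{}".format(self.base_url.rstrip("/"), file_name)


downloader = FileDownloader("https://example.com/files/")
full_url = downloader.build_url("data.csv")
print(full_url, downloader.can_retry())
```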

I hope this summary was useful. Let me know if you have any questions/comments below.
cheers
Florian