Wednesday, February 7, 2018

Python best coding practices: Seven tips for better code

Python's official style guide, PEP 8, makes detailed recommendations about naming conventions and coding style. Here is a short summary of that fairly long document, picking seven points I found useful.

The suggested coding and naming conventions sometimes make the code more robust, but often they just make it more readable. Keep in mind that readability is usually the most critical aspect of a business-level codebase, since any improvement first requires understanding the existing code.

1. A good Python coding practice is to use the floor division operator // whenever you need an integer result from a division. For example, I used to index arrays and lists like this
nums[int(N/2)]
but this can be done faster, and without a detour through a float, with the floor division operator
nums[N//2]
You can test that with
import time
time0 = time.time()
for i in range(0, 10000):
    5//2
print(time.time()-time0)
which in my case took 0.0006051 seconds, while my old approach
import time
time0 = time.time()
for i in range(0, 10000):
    int(5/2)
print(time.time()-time0)
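takes 0.002234 seconds. The floor division operator is faster here because the operands are literal constants, so CPython can evaluate 5//2 once at compile time, while int(5/2) still has to call the int() function on every iteration. With variables, // still avoids the conversion to a float and the extra function call, and unlike int(N/2) it stays exact for very large integers.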
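The two expressions are also not always interchangeable. Here is a minimal sketch of the comparison using the standard timeit module and a variable operand; the values of n and big and the repetition count are arbitrary choices for illustration, and the timings will differ from machine to machine.

```python
import timeit

# Time both expressions with a variable operand, so that nothing can be
# evaluated ahead of time by the compiler.
floor_time = timeit.timeit("n // 2", globals={"n": 5}, number=100_000)
int_time = timeit.timeit("int(n / 2)", globals={"n": 5}, number=100_000)
print(f"n // 2:     {floor_time:.4f} s")
print(f"int(n / 2): {int_time:.4f} s")

# For large integers, n / 2 goes through a float and loses precision,
# while n // 2 stays in exact integer arithmetic.
big = 10**17 + 3
exact = big // 2
via_float = int(big / 2)
print(exact, via_float)

# For negative numbers the results differ as well:
# // rounds down, int() truncates toward zero.
floored = -7 // 2
truncated = int(-7 / 2)
print(floored, truncated)
```

So for indexing with non-negative integers both versions give the same index, but // is the safer habit.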

2. Next, how should we test whether a variable x is not None? The following four versions could be used
# bad
if x:

# better
if x != None:

# better
if not x is None:

# best
if x is not None:
The recommended version is the last one. Version 1 is dangerous: while None evaluates to False in a boolean context, so do many other values, such as 0, empty strings and empty lists. Therefore, if it is really None which you want to test for, you should write that explicitly. So while avoiding version 1 makes your code more robust, avoiding versions 2 and 3 is mostly a matter of following Python coding conventions.

The difference between the unequal (!=) and 'is not' operators is subtle, but important. The != operator tests whether the variable has a different value than None, and a class can customize that comparison through its __eq__ and __ne__ methods. The 'is not' operator tests whether the two names refer to different objects, which cannot be customized. For most objects it makes no difference in the case above, but since there is only one None object, the best practice is to use the 'is not' operator.
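To make the difference between these versions concrete, here is a small sketch. The functions describe() and describe_truthy() and the class AlwaysEqual are made up for this illustration.

```python
def describe(x=None):
    """Only None counts as missing."""
    if x is not None:
        return "value"
    return "missing"


def describe_truthy(x=None):
    """Every falsy value counts as missing, which is rarely intended."""
    if x:
        return "value"
    return "missing"


class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


# 0 and "" are legitimate values, but the truthiness test rejects them.
print(describe(0), describe_truthy(0))
print(describe(""), describe_truthy(""))

# A custom __eq__ can fool != None, but it cannot fool 'is not None'.
obj = AlwaysEqual()
print(obj != None, obj is not None)
```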

Following from the example above there are also recommendations for how to test boolean values. If x is True or False, test for it like this
if x:
but not 
if x == True: 
or
if x is True:
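The three tests only agree when x is a real bool. A short sketch with a few truthy values, chosen arbitrarily for illustration, shows where they diverge:

```python
values = [True, 1, "yes", [0]]

# All four values are truthy, so 'if x:' would accept each of them.
truthy = [bool(x) for x in values]

# '== True' only holds for True and for numbers equal to 1.
equal_true = [x == True for x in values]

# 'is True' only holds for the bool object True itself.
is_true = [x is True for x in values]

print(truthy)
print(equal_true)
print(is_true)
```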

3. Another case which makes your code more readable is the use of startswith() and endswith() instead of string slicing. For example use
if foo.startswith('bar'):
instead of
if foo[:3] == 'bar':
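Both versions give the same result, but the first version is deemed more readable. It is also less error-prone, since with slicing you have to keep the slice length in sync with the length of the string you compare against.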
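The following sketch shows how the slicing version can go wrong, and that both methods also accept a tuple of candidates. The strings are placeholders picked for this example.

```python
foo = "barista"

# A slice length that does not match the prefix silently gives False.
wrong_slice = foo[:4] == "bar"
right_prefix = foo.startswith("bar")
print(wrong_slice, right_prefix)

# startswith() and endswith() accept a tuple of alternatives.
filename = "report.csv"
is_table = filename.endswith((".csv", ".tsv"))
print(is_table)
```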

4. To check for types use the isinstance() function
if isinstance(x, int):
rather than comparing type(x) with the 'is' or == operator
if type(x) == int:

if type(x) is int:
Comparing the result of type() directly can easily give you the wrong answer. Take the following example
class MyDict(dict): 
  pass
x = MyDict()
print("type = ", type(x))
print(type(x) == dict)
print(isinstance(x, dict))
which gives the output
type =  <class '__main__.MyDict'>
False
True
Even though the MyDict class behaves just like a dictionary, since it inherited all its properties, the type() comparison does not see it that way, because it only accepts an exact match of the class. The isinstance() function takes inheritance into account and recognizes that a MyDict is also a dict.
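The same applies to subclasses from the standard library, and isinstance() can check several types at once. A minimal sketch, where count_keys() is a hypothetical helper invented for this example:

```python
from collections import OrderedDict


def count_keys(mapping):
    """Accept dict and any subclass of dict."""
    if not isinstance(mapping, dict):
        raise TypeError("expected a dict or a subclass of dict")
    return len(mapping)


ordered = OrderedDict(a=1, b=2)

# OrderedDict is a subclass of dict, so isinstance() accepts it,
# while the exact type comparison rejects it.
print(count_keys(ordered))
print(type(ordered) == dict)

# A tuple of types checks for any of them.
print(isinstance(3.5, (int, float)))

# Note that bool is a subclass of int.
print(isinstance(True, int))
```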

5. When using try-except statements always catch exceptions by name
try:
    x = a/b
except ZeroDivisionError:
    # do something, e.g. fall back to a default value
    x = None
You can use a general try-except statement to catch all possible exceptions, for example to ensure that the user gets clean feedback. But general except clauses usually lead to sloppy code, since you have not thought about which exceptions could actually happen, and they can hide real bugs. If you do use a general except clause, catch the exception and write it to a log file.
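Both patterns can be sketched with the standard logging module. The function names safe_ratio() and run_task() are made up for this illustration, and the logger is left unconfigured here.

```python
import logging

logger = logging.getLogger(__name__)
# To write to a file you could call, for example,
# logging.basicConfig(filename="app.log"); the file name is a placeholder.


def safe_ratio(a, b):
    """Catch only the exception we expect and know how to handle."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


def run_task(task):
    """General handler: log the full traceback instead of hiding it."""
    try:
        return task()
    except Exception:
        logger.exception("task failed")
        return None


print(safe_ratio(1, 2))
print(safe_ratio(1, 0))
print(run_task(lambda: 1 / 0))
```

Catching Exception rather than using a bare except: still lets KeyboardInterrupt and SystemExit pass through.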

6. To break up long lines, the preferred method is to wrap the expression in parentheses and to line up the continuation lines
if first_condition and second_condition and third_condition and fourth_condition:
should be
if (first_condition and second_condition and
    third_condition and fourth_condition):

7. Finally, you should follow naming conventions:
  • Class and exception names should use capitalized words without underscores, like GroupName or ExceptionName
  • Function, module, variable and method names should be lowercase, with words separated by underscores like process_text() and global_var_name
  • Constants are written in all capital letters with underscores separating words like MAX_OVERFLOW and TOTAL
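Put together, the three conventions look like this. All names in this sketch are invented for illustration.

```python
# Constant: all capital letters with underscores.
MAX_RETRIES = 3


# Exception name: capitalized words, no underscores.
class RetryLimitError(Exception):
    pass


# Class name: capitalized words, no underscores.
class DownloadJob:
    def __init__(self, url):
        # Variable and attribute names: lowercase with underscores.
        self.url = url
        self.attempt_count = 0

    # Method name: lowercase with underscores.
    def record_attempt(self):
        self.attempt_count += 1
        if self.attempt_count > MAX_RETRIES:
            raise RetryLimitError(f"gave up on {self.url}")


job = DownloadJob("https://example.com/data")
job.record_attempt()
print(job.attempt_count)
```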
A good way to enforce PEP 8 standards in your project is to use a tool like pylint. Pylint can easily be hooked into a git project, for example as a pre-commit hook, preventing commits which do not follow these standards. In my experience, this is a very good idea when working in a big team with very different experience levels.

I hope this summary was useful. Let me know if you have any questions/comments below.
cheers
Florian
