Archive for the ‘python’ Category

UTC now → timestamp → UTC time

January 25, 2015 2 comments

Problem
Say you have an application and you want to store the date/time of an event. Later you want to see this date/time.

Solution
Take the current UTC date/time and convert it to a Unix timestamp (the number of seconds since 1970-01-01 00:00:00 UTC), truncated to an integer. Store this value. Later, read it back and convert it to a UTC date/time.

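>>> import datetime
>>>
>>> # Use an aware datetime: timestamp() would treat a naive utcnow() value as local time.
>>> utcnow = datetime.datetime.now(datetime.timezone.utc)
>>> utcnow
datetime.datetime(2015, 1, 25, 18, 10, 41, 803198, tzinfo=datetime.timezone.utc)
>>> ts = int(utcnow.timestamp())
>>> ts
1422209441
>>> datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
datetime.datetime(2015, 1, 25, 18, 10, 41, tzinfo=datetime.timezone.utc)
>>>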
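
The round trip above can be wrapped in two small helpers. This is only a minimal sketch: the function names `to_timestamp` and `from_timestamp` are made up for illustration and do not come from any library. The first helper rejects naive datetimes, because Python would silently interpret them as local time.

```python
import datetime

UTC = datetime.timezone.utc


def to_timestamp(dt):
    """Convert an aware datetime to an integer Unix timestamp (whole seconds)."""
    if dt.tzinfo is None:
        # A naive datetime would be silently interpreted as local time.
        raise ValueError("expected a timezone-aware datetime")
    return int(dt.timestamp())


def from_timestamp(ts):
    """Convert an integer Unix timestamp back to an aware UTC datetime."""
    return datetime.datetime.fromtimestamp(ts, tz=UTC)


event_time = datetime.datetime(2015, 1, 25, 18, 10, 41, 803198, tzinfo=UTC)
stored = to_timestamp(event_time)    # this integer is what you store
restored = from_timestamp(stored)    # the microseconds were dropped by int()
```

Since the stored value is truncated to whole seconds, `restored` equals `event_time` without its microseconds.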
Categories: python

[mongodb] get a random document from a collection

January 24, 2015

Problem
From a MongoDB collection, you want to get a random document.

Solution

import random

def get_random_doc():
    # coll refers to your collection
    # count_documents({}) replaces coll.count(), which was removed in PyMongo 4
    count = coll.count_documents({})
    return coll.find()[random.randrange(count)]

PyMongo documentation on cursors: here.
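
On MongoDB 3.2 or newer, the server can pick the random document itself with the `$sample` aggregation stage, which avoids the separate count. A minimal sketch, assuming `coll` is a PyMongo collection (passing it in as a parameter is a choice made for this example):

```python
def get_random_doc(coll):
    """Return one random document from coll, or None if the collection is empty."""
    # $sample asks the server for `size` randomly chosen documents.
    pipeline = [{"$sample": {"size": 1}}]
    for doc in coll.aggregate(pipeline):
        return doc
    return None
```

Unlike the index-based version, this one returns None for an empty collection instead of raising an error.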

Categories: python

opening gzipped JSON files

January 7, 2015

Problem
I have a project where the input JSON file is almost 7 MB. I keep this project in my Dropbox folder, so that 7 MB text file seems to be a waste. Any way to reduce its size?

Solution
I zipped it up with gzip: “gzip -9 input.json”. This command produced a 1.3 MB “input.json.gz” file and deleted the original. Good. But how do you open it in Python?

Normal way (without gzip):

import json

with open("input.json") as f:
    d = json.load(f)

Compressed way (with gzip):

import json
import gzip

# "rt" opens the archive in text mode, so json.load() can read it directly
with gzip.open("input.json.gz", "rt", encoding="utf-8") as f:
    d = json.load(f)

I didn’t notice any performance penalty. The application that first reads this JSON file starts as fast as before.
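
The compression step can also be done from Python instead of the command line. A minimal sketch using only the standard library; the helper names `save_json_gz` and `load_json_gz` are invented for this example:

```python
import gzip
import json


def save_json_gz(obj, path):
    """Serialize obj as JSON and write it gzip-compressed to path."""
    # compresslevel=9 corresponds to "gzip -9" (smallest output, slowest).
    with gzip.open(path, "wt", encoding="utf-8", compresslevel=9) as f:
        json.dump(obj, f)


def load_json_gz(path):
    """Read a gzip-compressed JSON file and return the decoded object."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

Unlike the `gzip` command, `save_json_gz` does not delete anything; it just writes the compressed file.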

Categories: python

pyvenv: create virtual environments for Python 3.4+

January 1, 2015

For creating virtual environments, I’ve used virtualenvwrapper so far. However, Python 3.4 contains the command pyvenv that does the same thing. Since it also installs pip in the virtual environment, it can replace virtualenvwrapper.

I like to store my virtual environments in a dedicated folder, separated from the project directory. virtualenvwrapper, by default, stores the virtual environments in the ~/.virtualenvs folder. Since I got used to this folder, I will continue to keep my virtual environments there.

pyvenv
Say we have our project folder here: ~/python/webapps/flasky_project. Create a virtual environment for it the following way:

pyvenv ~/.virtualenvs/flasky_project

It will create a Python 3 virtual environment.
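
The same environment can also be created from Python through the standard `venv` module, which is what the `pyvenv` script wraps. This matters on newer interpreters, because the `pyvenv` script was deprecated in Python 3.6 and removed in Python 3.8. A minimal sketch; the function name `create_env` is made up for this example:

```python
import os
import venv


def create_env(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return its absolute path."""
    env_dir = os.path.abspath(os.path.expanduser(env_dir))
    # with_pip=True bootstraps pip into the new environment (needs ensurepip).
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

Calling `create_env("~/.virtualenvs/flasky_project")` mirrors the pyvenv command above. From the shell, `python3 -m venv ~/.virtualenvs/flasky_project` does the same.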

virtualenv / virtualenvwrapper
For the sake of completeness, here is how to create virtual environments with virtualenv and virtualenvwrapper:

# blog post: http://goo.gl/oEdtT3

# virtualenvwrapper for Python 3 or Python 2
mkvirtualenv -p `which python3` myenv3
mkvirtualenv -p `which python2` myenv2

# virtualenv for Python 3 or Python 2
virtualenv -p python3 myproject3
virtualenv -p python2 myproject2

# When the env. is created, activate it
# and launch the command python within.
# Verify if it's the correct version.

TL;DR
If you use Python 3.4+ and you need a virtual environment, use the command “pyvenv”.

2014 in review

December 30, 2014

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here's an excerpt:

The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 190,000 times in 2014. If it were an exhibit at the Louvre Museum, it would take about 8 days for that many people to see it.

Click here to see the complete report.

Categories: python

XML to dict / XML to JSON

December 29, 2014

Problem
You have an XML file and you want to convert it to dict or JSON.

Well, if you have a dict, you can convert it to JSON with “json.dump()”, so the real question is: how do you convert an XML file to a dictionary?

Solution
There is an excellent library for this purpose called xmltodict. Its usage is very simple:

import xmltodict

# It doesn't work with Python 3! Read on for the solution!
def convert(xml_file, xml_attribs=True):
    with open(xml_file) as f:
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return d

This worked well under Python 2.7 but I got an error under Python 3. I checked the project’s documentation and it claimed to be Python 3 compatible. What the hell?

The error message was this:

Traceback (most recent call last):
  File "/home/jabba/Dropbox/python/lib/jabbapylib2/apps/xmltodict.py", line 247, in parse
    parser.ParseFile(xml_input)
TypeError: read() did not return a bytes object (type=str)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./xml2json.py", line 27, in <module>
    print(convert(sys.argv[1]))
  File "./xml2json.py", line 17, in convert
    d = xmltodict.parse(f, xml_attribs=xml_attribs)
  File "/home/jabba/Dropbox/python/lib/jabbapylib2/apps/xmltodict.py", line 249, in parse
    parser.Parse(xml_input, True)
TypeError: '_io.TextIOWrapper' does not support the buffer interface

I even filed an issue ticket :)

After some debugging I found a hint here: you need to open the XML file in binary mode!

XML to dict (Python 2 & 3)
So the correct version that works with Python 3 too is this:

import xmltodict

def convert(xml_file, xml_attribs=True):
    with open(xml_file, "rb") as f:    # notice the "rb" mode
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return d

XML to JSON (Python 2 & 3)
If you want JSON output:

import json
import xmltodict

def convert(xml_file, xml_attribs=True):
    with open(xml_file, "rb") as f:    # notice the "rb" mode
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return json.dumps(d, indent=4)
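
To see what the conversion produces, here is a small sketch that parses an in-memory document instead of a file. The sample XML and the function name `xml_bytes_to_json` are invented for this example, and it assumes that xmltodict is installed. The input is a bytes object, in line with the binary-mode fix above:

```python
import json

import xmltodict

SAMPLE_XML = b"""<?xml version="1.0"?>
<library name="demo">
  <book id="1">Dune</book>
  <book id="2">Emma</book>
</library>"""


def xml_bytes_to_json(xml_bytes, xml_attribs=True):
    """Parse XML given as bytes and return it as an indented JSON string."""
    d = xmltodict.parse(xml_bytes, xml_attribs=xml_attribs)
    return json.dumps(d, indent=4)


# Repeated <book> elements end up in a list under the "book" key.
print(xml_bytes_to_json(SAMPLE_XML, xml_attribs=False))
```

With the default `xml_attribs=True`, the attributes are kept as keys prefixed with `@` (for example `@name`); with `xml_attribs=False` they are dropped.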
Categories: python

catch the output of pprint in a string

December 29, 2014

Problem
If you have a large nested data structure (e.g. a list or dictionary), the pprint module is very useful to nicely format the output.

However, “pprint.pprint” prints directly to stdout. What if you want to store the nicely formatted output in a string?

Solution
Use “pprint.pformat” instead.

Example:

>>> import pprint
>>> d = {"one": 1, "two": 2}
>>> pprint.pprint(d)
{'one': 1, 'two': 2}
>>> s = pprint.pformat(d)
>>> s
"{'one': 1, 'two': 2}"

Well, this is a small example, so the real pretty formatting is not visible, but you get the point :)
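
With a nested structure and a narrow width, the string returned by `pformat` spans several lines, which makes the formatting visible. A minimal sketch; the sample data is invented:

```python
import pprint

data = {
    "name": "example",
    "tags": ["alpha", "beta", "gamma"],
    "nested": {"one": 1, "two": 2, "three": [1, 2, 3]},
}

# width=30 forces line breaks; the default width is 80 characters.
s = pprint.pformat(data, width=30)

# s is an ordinary string, so it can be logged, written to a file, etc.
print(s)
```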

Categories: python

fancy text tables

December 28, 2014

Problem
Instead of simply printing some data on the screen, I wanted to put it in a nicely formatted ASCII table.

Solution
After some research I found a nice package for this purpose: python-tabulate. It supports both Python 2 and Python 3 (yes, from now on it’s also important for me).

Its usage is very simple. Here is a snippet that creates random usernames and passwords:

from tabulate import tabulate

# get_username_1(), get_username_2(), get_password_1() and get_password_2()
# are helper functions that are not defined in this snippet.
table = []
headers = ["Username #1", "Username #2", "Password #1", "Password #2"]
for _ in range(10):
    name1 = get_username_1()
    name2 = get_username_2()
    pass1 = get_password_1(8)
    pass2 = get_password_2(12)
    table.append([name1, name2, pass1, pass2])
    # print("{:15}{:15}{:15}{:15}".format(name1, name2, pass1, pass2))    # this is the past :)
print(tabulate(table, headers=headers, tablefmt="psql"))

Output:

+---------------+---------------+---------------+---------------+
| Username #1   | Username #2   | Password #1   | Password #2   |
|---------------+---------------+---------------+---------------|
| Adarah        | hasana        | ygyQsF6u      | uTzPqZMDNJ6x  |
| Alary         | begahi        | YqW4aY7q      | ipZuX0sX2RFg  |
| Solita        | otomot        | Xwliu9yi      | IjeFibVFaoZq  |
| Casony        | rikari        | fw6dk5gt      | zbAXO8gd33Lh  |
| Anne          | asakou        | MXsXpz43      | aYNiJTwojULG  |
| Joby          | mgomam        | vZjiCuyT      | qc3Q9caAenJw  |
| Kallita       | aremon        | j1ZD1QU9      | AIEsykmYodfy  |
| Cara          | iumina        | 75UzkKgK      | lK92GdAxn441  |
| Fuscie        | goomio        | uof2C7ct      | HFgVlAZ9PSmv  |
| Dean          | utinon        | gycncz9f      | 61oJzUGdDVKf  |
+---------------+---------------+---------------+---------------+

The module supports various formatting styles. For more examples, check out the official page.
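
The snippet above calls `get_username_1()`, `get_username_2()`, `get_password_1()` and `get_password_2()`, which are not shown in this post. To run it, you can drop in stand-ins such as the ones below. These are hypothetical replacements written for this example, not the original helpers, and the generated values are not meant for real credentials:

```python
import random
import string

CONSONANTS = "bcdfghjklmnprstvz"
VOWELS = "aeiou"


def get_username_1(length=6):
    """Hypothetical stand-in: capitalized name of alternating consonants and vowels."""
    letters = [random.choice(CONSONANTS if i % 2 == 0 else VOWELS) for i in range(length)]
    return "".join(letters).capitalize()


def get_username_2(length=6):
    """Hypothetical stand-in: lowercase name of alternating vowels and consonants."""
    letters = [random.choice(VOWELS if i % 2 == 0 else CONSONANTS) for i in range(length)]
    return "".join(letters)


def get_password_1(length):
    """Hypothetical stand-in: random letters and digits (not cryptographically secure)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))


# In this sketch both password helpers behave the same way.
get_password_2 = get_password_1
```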

Update (20191029)

The project has moved to https://github.com/astanin/python-tabulate .

Categories: bash, python

Static HTML file browser for Dropbox

December 2, 2014 3 comments

Two of my students worked on a project that creates static HTML files for a public Dropbox folder (find it on GitHub). I use it in production; check it out here.

If you created your Dropbox account before October 4, 2012, then you are lucky and you have a Public folder. Accounts opened after this date have no Public folder :(

So, if you have a Public folder and you want to share the content of a folder recursively, then you can try this script. It produces a file and directory list that is similar to an Apache directory listing.

Screenshot

Authors and Contributors

Categories: python

make a script run under Python 2.x and 3.x too

November 3, 2014

Problem
I installed Manjaro Linux on one of my laptops, just to try something new. I’ve been using it for a week and I like it so far :) On my older laptop it runs smoother than Ubuntu.

Anyway, Manjaro has switched to Python 3.x as the default, so “python” points to Python 3. I use Ubuntu on my other machines where Python 2 is the default. I would like to modify my scripts (at least some of them) to run on both systems.

For instance, in Python 2.x you call “raw_input”, while this function was renamed to “input” in Python 3.x.

Solution
Well, since January 2014 I have started all my new scripts with this line:

from __future__ import (absolute_import, division,
                        print_function, unicode_literals)

It ensures a nice transition from Python 2 to Python 3.

To solve the “raw_input” problem, you can add these lines:

import sys

if sys.version_info >= (3, 0):
    raw_input = input

You can continue using “raw_input”, but if it’s executed with Python 3.x, “raw_input” will point to the “input” function.
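
A variant of the same idea that does not check the version number: try the Python 2 name first and fall back to the Python 3 one. This is only a sketch; `read_line` and `ask` are names invented for this example:

```python
try:
    read_line = raw_input  # Python 2: raw_input() returns the typed line as text
except NameError:
    read_line = input      # Python 3: raw_input() no longer exists, input() replaces it


def ask(prompt):
    """Show prompt, read one line from standard input and strip surrounding whitespace."""
    return read_line(prompt).strip()
```

The rest of the script then calls `ask()` (or `read_line()`) and never touches “raw_input” or “input” directly.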

Of course, the ideal solution would be to switch to Python 3, but I’m not ready for that yet :)

Update (20141228)
At the moment I’m updating my jabbapylib library. The new version will be released soon :) Since it’s a library, it should work with both Python 2 and Python 3. When I write a new script, I tend to use Python 3 these days, but a library is different: it should support both Python 2.x and 3.x. The most widely used solution is the Six compatibility library, which is a joy to use. For instance, to solve the raw_input issue, just add this import:

from six.moves import input

Then, just like in Python 3, call the function “input()” to read from the standard input. For more info, read the official docs.

Categories: python