
Comprehensions and generators

In this section, we will explore a few simple strategies to speed up Python loops using comprehensions and generators. In Python, comprehensions and generator expressions are fairly well-optimized operations and should usually be preferred to explicit for-loops. Another reason to use these constructs is readability; even if the speedup over a standard loop is modest, the comprehension and generator syntax is more compact and (most of the time) more intuitive.

In the following example, we can see that both the list comprehension and generator expressions are faster than an explicit loop when combined with the sum function:

    def loop():
        res = []
        for i in range(100000):
            res.append(i * i)
        return sum(res)

    def comprehension():
        return sum([i * i for i in range(100000)])

    def generator():
        return sum(i * i for i in range(100000))

    %timeit loop()
    100 loops, best of 3: 16.1 ms per loop
    %timeit comprehension()
    100 loops, best of 3: 10.1 ms per loop
    %timeit generator()
    100 loops, best of 3: 12.4 ms per loop
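
Outside an IPython session, a similar comparison can be sketched with the standard library timeit module. The helper name best_time and the constant N below are illustrative choices, not part of the original example, and the absolute timings will differ from machine to machine:

```python
import timeit

N = 100_000  # same problem size as the example above


def loop():
    res = []
    for i in range(N):
        res.append(i * i)
    return sum(res)


def comprehension():
    return sum([i * i for i in range(N)])


def generator():
    return sum(i * i for i in range(N))


def best_time(func, repeat=3, number=10):
    """Return the best average time per call, in seconds (hypothetical helper)."""
    # timeit.repeat accepts a callable and returns one total time per repetition.
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


for func in (loop, comprehension, generator):
    print(f"{func.__name__}: {best_time(func) * 1000:.2f} ms per call")
```

Taking the minimum over several repetitions mirrors the "best of 3" figure reported by %timeit.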

Just as with lists, a dict comprehension can be used to build dictionaries slightly more efficiently and compactly, as shown in the following code:

    def loop():
        res = {}
        for i in range(100000):
            res[i] = i
        return res

    def comprehension():
        return {i: i for i in range(100000)}

    %timeit loop()
    100 loops, best of 3: 13.2 ms per loop
    %timeit comprehension()
    100 loops, best of 3: 12.8 ms per loop
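
Since the timing difference here is small, the main benefit is compactness. As a small illustration, a dict comprehension can also carry a condition, replacing a loop with a nested if. The word list and function names below are made up for this sketch:

```python
words = ["loop", "map", "filter", "generator", "comprehension"]


def lengths_loop(items, min_len):
    # Explicit loop: keep only the words with at least min_len characters.
    res = {}
    for word in items:
        if len(word) >= min_len:
            res[word] = len(word)
    return res


def lengths_comprehension(items, min_len):
    # Same logic as above, written as a single dict comprehension.
    return {word: len(word) for word in items if len(word) >= min_len}


print(lengths_comprehension(words, 6))
```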

Efficient looping (especially in terms of memory) can be implemented using iterators and functions such as filter and map. As an example, consider the problem of applying a series of operations to a list using list comprehensions and then taking the maximum value:

    def map_comprehension(numbers):
        a = [n * 2 for n in numbers]
        b = [n ** 2 for n in a]
        c = [n ** 0.33 for n in b]
        return max(c)

The problem with this approach is that every list comprehension allocates a new list, increasing memory usage. Instead of using list comprehensions, we can employ generators. Generators are objects that, when iterated upon, compute one value at a time on the fly and return the result.

For example, the map function takes a function and an iterable as its arguments and returns a lazy iterator (in Python 3, a map object that behaves much like a generator) that applies the function to every element of the collection. The important point is that the operation happens only while we are iterating, not when map is invoked.
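
The lazy behavior can be observed directly with a small sketch. The calls list and the double function are illustrative names introduced only to record when the function actually runs:

```python
calls = []  # records every argument that double() receives


def double(n):
    calls.append(n)
    return n * 2


lazy = map(double, [1, 2, 3])
calls_before = list(calls)     # nothing has been computed yet

first = next(lazy)             # computes only the first element
calls_after_one = list(calls)

rest = list(lazy)              # consumes the remaining elements

print(calls_before, first, calls_after_one, rest)
```

Creating the map object does not call double at all; each call happens only when the next value is requested.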

We can rewrite the previous function using map and by creating intermediate generators, rather than lists, thus saving memory by computing the values on the fly:

    def map_normal(numbers):
        a = map(lambda n: n * 2, numbers)
        b = map(lambda n: n ** 2, a)
        c = map(lambda n: n ** 0.33, b)
        return max(c)
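
The same pipeline can also be written with generator expressions, which avoid the lambda functions while remaining lazy. The function name map_generator_expr is a hypothetical addition for this sketch; the list-based version is repeated only so that the two results can be compared:

```python
def map_comprehension(numbers):
    # Eager version: three full intermediate lists are allocated.
    a = [n * 2 for n in numbers]
    b = [n ** 2 for n in a]
    c = [n ** 0.33 for n in b]
    return max(c)


def map_generator_expr(numbers):
    # Lazy version: parentheses create generator expressions, so each
    # value flows through all three stages one at a time.
    a = (n * 2 for n in numbers)
    b = (n ** 2 for n in a)
    c = (n ** 0.33 for n in b)
    return max(c)


numbers = range(1000)
print(map_comprehension(numbers) == map_generator_expr(numbers))
```

Note that a generator can be consumed only once; if an intermediate result is needed again, it has to be recreated or stored in a list.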

We can profile the memory of the two solutions using the memory_profiler extension from an IPython session. The extension provides a small utility, %memit, that will help us evaluate the memory usage of a Python statement in a way similar to %timeit, as illustrated in the following snippet:

    %load_ext memory_profiler
    numbers = range(1000000)
    %memit map_comprehension(numbers)
    peak memory: 166.33 MiB, increment: 102.54 MiB
    %memit map_normal(numbers)
    peak memory: 71.04 MiB, increment: 0.00 MiB

As you can see, the first version increases memory usage by 102.54 MiB, while the second version shows an increment of 0.00 MiB, that is, too small to register at the precision reported by %memit. For the interested reader, more functions that return lazy iterators can be found in the itertools module, which provides a set of utilities designed to handle common iteration patterns.
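
As a brief taste of itertools, the following sketch combines a few of its functions (count, islice, chain, and takewhile). The variable names are illustrative, and none of these calls builds an intermediate list until the result is explicitly requested:

```python
import itertools

# count(1) yields 1, 2, 3, ... without end; islice takes a finite prefix.
squares = (n * n for n in itertools.count(1))
first_five = list(itertools.islice(squares, 5))

# chain iterates over several iterables in sequence without concatenating them.
total = sum(itertools.chain(range(3), range(10, 13)))

# takewhile stops at the first element that fails the predicate.
small_squares = list(
    itertools.takewhile(lambda s: s < 50, (n * n for n in itertools.count(1)))
)

print(first_five, total, small_squares)
```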
