Python Performance Tips (Part 2)

I had planned to keep translating https://wiki.python.org/moin/PythonSpeed/PerformanceTips, but I have too much going on and really don't have the time, so here is a brief summary instead.

String concatenation

Avoid:

s = ""
for substring in list:
    s += substring

Prefer:

s = "".join(list)
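
The difference can be measured with timeit. The following is a minimal sketch, assuming Python 3; the sample data and the function names concat_with_loop and concat_with_join are made up for illustration. The numbers depend on the interpreter and version, and CPython can optimize += in some cases, so the gap may be smaller than expected.

```python
import timeit

# Hypothetical sample data: 10,000 short substrings.
parts = ["chunk%d" % i for i in range(10000)]


def concat_with_loop(substrings):
    """Build the result by repeated += (the pattern to avoid)."""
    s = ""
    for substring in substrings:
        s += substring
    return s


def concat_with_join(substrings):
    """Build the result with a single join call."""
    return "".join(substrings)


if __name__ == "__main__":
    # Time 100 runs of each approach; absolute numbers vary by machine.
    loop_time = timeit.timeit(lambda: concat_with_loop(parts), number=100)
    join_time = timeit.timeit(lambda: concat_with_join(parts), number=100)
    print("+= loop: %.4f s" % loop_time)
    print("join:    %.4f s" % join_time)
```

Both functions return the same string, so only the timing differs.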

When building a string from a large list, avoid:

s = ""
for x in somelist:
    s += some_function(x)

Use:

slist = [some_function(elt) for elt in somelist]
s = "".join(slist)

Avoid:

out = "<html>" + head + prologue + query + tail + "</html>"

Use:

out = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)

Of course, this version is easier to read:

out = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>" % locals()
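
To check that the formatting variants produce the same output, here is a small sketch assuming Python 3. The function build_page and its sample arguments are hypothetical. It also shows an explicit dictionary, which is an alternative to locals() when you want to control exactly which names are available to the template.

```python
def build_page(head, prologue, query, tail):
    """Return the same page built three ways with %-formatting."""
    # Positional placeholders, filled from a tuple.
    positional = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)

    # Named placeholders, filled from an explicit dictionary.
    template = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>"
    named = template % {
        "head": head,
        "prologue": prologue,
        "query": query,
        "tail": tail,
    }

    # Named placeholders, filled from the function's local variables.
    # Extra names in locals() are simply ignored by the template.
    from_locals = template % locals()

    return positional, named, from_locals


if __name__ == "__main__":
    for page in build_page("<h1>", "intro", "?q=1", "</h1>"):
        print(page)
```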

Loops

Want to convert every string in a list to uppercase?

newlist = []
for word in oldlist:
    newlist.append(word.upper())

Try this instead:

newlist = map(str.upper, oldlist)

Or use a generator expression:

iterator = (s.upper() for s in oldlist)
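
The variants can be compared side by side. This is a sketch assuming Python 3, where map returns a lazy iterator rather than a list, so it is wrapped in list() here; the sample oldlist is made up. A list comprehension is included as a further option that the text above does not mention.

```python
oldlist = ["alpha", "beta", "gamma"]  # hypothetical sample data

# Explicit loop with append.
loop_result = []
for word in oldlist:
    loop_result.append(word.upper())

# map: returns a list in Python 2 but a lazy iterator in Python 3,
# so list() is needed here to materialize the values.
map_result = list(map(str.upper, oldlist))

# List comprehension: builds the list directly.
comp_result = [s.upper() for s in oldlist]

# Generator expression: nothing is computed until it is consumed,
# and it can be consumed only once.
iterator = (s.upper() for s in oldlist)
gen_result = list(iterator)

if __name__ == "__main__":
    print(loop_result, map_result, comp_result, gen_result)
```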

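Avoiding dots

Suppose we want to uppercase every item in a list and collect the results in a new list. Looking up the methods once, outside the loop, avoids repeating the attribute lookup on every iteration: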

upper = str.upper
newlist = []
append = newlist.append
for word in oldlist:
    append(upper(word))
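
Whether this pays off is best checked by measurement. Below is a minimal timing sketch, assuming Python 3; the function names with_dots and without_dots and the sample data are hypothetical. The gain is usually modest and varies between interpreter versions.

```python
import timeit

# Hypothetical sample data.
oldlist = ["word%d" % i for i in range(10000)]


def with_dots(words):
    """Look up .append and .upper on every iteration."""
    newlist = []
    for word in words:
        newlist.append(word.upper())
    return newlist


def without_dots(words):
    """Look up the methods once, before the loop."""
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in words:
        append(upper(word))
    return newlist


if __name__ == "__main__":
    print("with dots:    %.4f s"
          % timeit.timeit(lambda: with_dots(oldlist), number=100))
    print("without dots: %.4f s"
          % timeit.timeit(lambda: without_dots(oldlist), number=100))
```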

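Local variables

Python accesses local variables faster than global variables. For example, inside a function, upper, append, and newlist are all locals: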

def func():
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in oldlist:
        append(upper(word))
    return newlist
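
The same idea applies to any global that is read inside a hot loop. The sketch below assumes Python 3; GLOBAL_FACTOR and both function names are hypothetical. It copies a global into a local before the loop so the two versions can be timed against each other. Newer CPython releases may narrow the gap.

```python
import timeit

GLOBAL_FACTOR = 3  # hypothetical module-level constant


def scale_with_global(values):
    """Read the global name on every iteration."""
    total = 0
    for v in values:
        total += v * GLOBAL_FACTOR
    return total


def scale_with_local(values):
    """Copy the global into a local once, then use the local."""
    factor = GLOBAL_FACTOR
    total = 0
    for v in values:
        total += v * factor
    return total


if __name__ == "__main__":
    data = list(range(100000))
    print("global: %.4f s"
          % timeit.timeit(lambda: scale_with_global(data), number=20))
    print("local:  %.4f s"
          % timeit.timeit(lambda: scale_with_local(data), number=20))
```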

Initializing dictionary elements

Suppose you are building a dictionary in which each entry maps a word to its count:

wdict = {}
for word in words:
    if word not in wdict:
        wdict[word] = 0
    wdict[word] += 1

In that case, try try-except, which skips the membership test once a word has been seen:

wdict = {}
for word in words:
    try:
        wdict[word] += 1
    except KeyError:
        wdict[word] = 1

Or take yet another approach:

wdict = {}
get = wdict.get
for word in words:
    wdict[word] = get(word, 0) + 1
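
The standard library also offers collections.defaultdict and collections.Counter for this pattern. They are not part of the summary above, so treat the following as an alternative sketch (Python 3, with a made-up words list) rather than a claim about which approach is fastest.

```python
from collections import Counter, defaultdict

# Hypothetical sample input.
words = "the quick brown fox jumps over the lazy dog the end".split()

# The dict.get approach from the text.
wdict = {}
get = wdict.get
for word in words:
    wdict[word] = get(word, 0) + 1

# defaultdict(int): missing keys start at 0 automatically.
ddict = defaultdict(int)
for word in words:
    ddict[word] += 1

# Counter: counts every item of an iterable in one call.
counter = Counter(words)

if __name__ == "__main__":
    print(wdict)
    print(dict(ddict))
    print(dict(counter))
```

All three produce the same word-to-count mapping.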

Imports at the top of the module

def doit1():
    import string ###### import statement inside function
    string.lower('Python')

for num in range(100000):
    doit1()

import string ###### import statement outside function
def doit2():
    string.lower('Python')

for num in range(100000):
    doit2()

doit2 is somewhat faster, because the import statement is not executed again on every call.
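
string.lower exists only in Python 2, so the snippets above will not run on Python 3. The sketch below repeats the comparison for Python 3, swapping in math.sqrt as a stand-in; the function names are hypothetical.

```python
import math
import timeit


def sqrt_import_inside(x):
    """Run the import statement on every call."""
    import math  # repeated on each call; the module itself is cached
    return math.sqrt(x)


def sqrt_import_outside(x):
    """Rely on the module-level import."""
    return math.sqrt(x)


if __name__ == "__main__":
    print("import inside:  %.4f s"
          % timeit.timeit(lambda: sqrt_import_inside(16.0), number=100000))
    print("import outside: %.4f s"
          % timeit.timeit(lambda: sqrt_import_outside(16.0), number=100000))
```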

Data aggregation

import time
x = 0
def doit1(i):
    global x
    x = x + i

list = range(100000)
t = time.time()
for i in list:
    doit1(i)

print "%.3f" % (time.time()-t)

vs

import time
x = 0
def doit2(list):
    global x
    for i in list:
        x = x + i

list = range(100000)
t = time.time()
doit2(list)
print "%.3f" % (time.time()-t)

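Compare the two timings: the second approach is somewhat faster, because it makes one function call for the whole list instead of one call per element.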
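
The two snippets above use the Python 2 print statement. A Python 3 sketch of the same comparison follows; it passes the running total explicitly instead of using a global, and the function names are hypothetical.

```python
import time


def add_one(total, i):
    """Add a single value; called once per element."""
    return total + i


def sum_per_call(values):
    """One function call per element."""
    total = 0
    for i in values:
        total = add_one(total, i)
    return total


def sum_in_one_call(values):
    """The whole loop runs inside a single call."""
    total = 0
    for i in values:
        total = total + i
    return total


if __name__ == "__main__":
    values = range(100000)
    for func in (sum_per_call, sum_in_one_call):
        start = time.perf_counter()
        func(values)
        print("%s: %.3f s" % (func.__name__, time.perf_counter() - start))
```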

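Use xrange instead of range

Yes, xrange is somewhat faster. In Python 2, range builds the whole list up front, while xrange produces its values lazily. Python 3 has no xrange; its range is already lazy.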
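
The memory side of this can be seen in Python 3 by comparing a range object with the list it would expand to. This is a sketch assuming Python 3; the list stands in for what Python 2's range() would have built, and sys.getsizeof reports only the container itself, not the integers it refers to.

```python
import sys

n = 100000

lazy = range(n)         # values are produced on demand
eager = list(range(n))  # fully materialized, like Python 2's range()

lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)

if __name__ == "__main__":
    print("range object: %d bytes" % lazy_size)
    print("list:         %d bytes" % eager_size)
```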

Make good use of cProfile

It is a good tool for diagnosing performance problems.
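
As a starting point, here is a minimal sketch of profiling a single call with cProfile and pstats, assuming Python 3. The helper profile_call and the sample function build_upper_list are hypothetical names chosen for this example.

```python
import cProfile
import io
import pstats


def build_upper_list(words):
    """Sample workload: uppercase every word."""
    newlist = []
    for word in words:
        newlist.append(word.upper())
    return newlist


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    # Collect the report in memory instead of printing to stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries
    return result, stream.getvalue()


if __name__ == "__main__":
    words = ["word%d" % i for i in range(100000)]
    _, report = profile_call(build_upper_list, words)
    print(report)
```

A whole script can also be profiled from the command line with python -m cProfile -s cumulative followed by the script name.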