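Efficiency of numeric comparisons in Python
Efficiency comparison of different Python implementations
1. Extracting multiple values from inner containers
2. Removing trailing (or leading) characters from a string
3. Use a set for in tests
Summary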
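Efficiency of numeric comparisons in Python
Python numeric comparison operators: >, <, ==, !=, >= and <=
Python has six numeric comparison operators: >, <, ==, !=, >= and <=. How do they compare in speed, and is any one of them the fastest? This article measures them with timeit.
The program is as follows: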
import timeit

def func1():
    for i in range(100000):
        if i > 0:
            k = 2

def func2():
    for i in range(100000):
        if i < 0:
            k = 2

def func3():
    for i in range(100000):
        if i == 0:
            k = 2

def func4():
    for i in range(100000):
        if i != 0:
            k = 2

def func5():
    for i in range(100000):
        if i >= 0:
            k = 2

def func6():
    for i in range(100000):
        if i <= 0:
            k = 2

if __name__ == '__main__':
    func1()  # one untimed call before the measurements start
    func = [func1, func2, func3, func4, func5, func6]
    op = [">", "<", "==", "!=", ">=", "<="]
    for j in range(6):
        v = 0
        timer = timeit.Timer(func[j])
        v += timer.timeit(number=1000)  # total seconds for 1000 calls
        print(op[j], ":", v)
With only an if statement (no else), the results are as follows (total seconds for 1000 calls):
> | 3.2038074 |
< | 2.7034741 |
== | 2.6940471000000006 |
!= | 3.285996800000001 |
>= | 3.205210300000001 |
<= | 2.6961838999999994 |
With an else clause added:
> | 3.2270024 |
< | 3.2400326 |
== | 3.2511219999999996 |
!= | 3.1877201999999993 |
>= | 3.2120345000000015 |
<= | 3.2339978999999985 |
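In the if-only test, >, != and >= are true for almost every i, so the assignment runs on nearly every iteration, while <, == and <= almost never run it; that accounts for the gap of roughly 0.5 s. Once an else clause is added, all six take about the same time, so the operators themselves cost about the same. In general, taking the first (if) branch is slightly cheaper, and the second (else) branch takes slightly longer.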
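To separate the cost of the comparison itself from the cost of the branch body, one option is to time the bare expressions. The following is a minimal sketch, not the original benchmark: the helper name time_operators, the operand values and the repeat counts are all assumptions chosen for illustration.

```python
import timeit

OPERATORS = [">", "<", "==", "!=", ">=", "<="]

def time_operators(number=100_000, repeat=5):
    """Return the best-of-`repeat` time in seconds for each bare comparison."""
    results = {}
    for op in OPERATORS:
        # The statement contains only the comparison, so no branch body
        # (such as `k = 2`) is included in the measurement.
        stmt = f"i {op} j"
        timings = timeit.repeat(stmt, setup="i = 5; j = 0",
                                number=number, repeat=repeat)
        results[op] = min(timings)
    return results

if __name__ == "__main__":
    for op, seconds in time_operators().items():
        print(f"{op:>2} : {seconds:.6f}")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; the absolute numbers will still vary by machine and interpreter version.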
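Efficiency comparison of different Python implementations
1. Extracting multiple values from inner containers
To get the maximum (or minimum) value at each index position of the inner lists in a nested list, there are two approaches: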
import time
import random

a = [[random.randint(0, 1000) for i in range(10)] for j in range(100000)]

def method_x(a):
    """One generator expression per index position."""
    begin = time.time()
    b = min(i[0] for i in a)
    c = min(i[1] for i in a)
    d = min(i[2] for i in a)
    e = min(i[3] for i in a)
    f = min(i[4] for i in a)
    g = min(i[5] for i in a)
    h = min(i[6] for i in a)
    i = min(i[7] for i in a)
    j = min(i[8] for i in a)
    k = min(i[9] for i in a)
    print(time.time()-begin)

def method_y(a):
    """Compute the value for every index in a single pass."""
    begin = time.time()
    # Start from the first row: a fixed starting value such as 100 would give
    # a wrong minimum whenever every value in a column is larger than it.
    b, c, d, e, f, g, h, i, j, k = a[0]
    for t in a:
        b = min(t[0], b)
        c = min(t[1], c)
        d = min(t[2], d)
        e = min(t[3], e)
        f = min(t[4], f)
        g = min(t[5], g)
        h = min(t[6], h)
        i = min(t[7], i)
        j = min(t[8], j)
        k = min(t[9], k)
    print(time.time()-begin)
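A third variant, not part of the original comparison, transposes the rows with zip(*rows) and calls min() once per column. This is only a sketch: the function name column_mins is hypothetical, and no timing is claimed for it relative to method_x and method_y.

```python
import random

def column_mins(rows):
    """Return a list with the minimum of each index position across all rows."""
    # zip(*rows) yields one tuple per column: (row0[0], row1[0], ...), and so on.
    return [min(column) for column in zip(*rows)]

if __name__ == "__main__":
    rows = [[random.randint(0, 1000) for _ in range(10)] for _ in range(1000)]
    print(column_mins(rows))
```

Note that zip(*rows) passes every row as a separate argument, so all rows are referenced at once; for very large inputs that overhead is worth measuring before adopting this form.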
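Results:
>>> method_x(a*10)
1.1728243827819824
>>> method_y(a*10)
2.1234960556030273
In this run, the ten generator expressions finish in roughly half the time of the single explicit loop.
2. Removing trailing (or leading) characters from a string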
To strip trailing characters from many strings, rstrip() is the usual choice, but it is slower than direct slicing.
import random
import time

# a: 100,000 strings of 11 to 12 characters; b: 100,000 strings of 9 characters
a = [f'{random.randint(10,100)}xxxyyyzzz' for i in range(100000)]
b = [f'{random.randint(100000,110000)}xyz' for i in range(100000)]

def test1(a, str_cut):  # replace
    b = time.time()
    c = [i.replace(str_cut, '') for i in a]
    print(time.time()-b)

def test2(a, str_cut):  # rstrip()
    b = time.time()
    c = [i.rstrip(str_cut) for i in a]
    print(time.time()-b)

def test3(a, str_cut):  # slicing
    b = time.time()
    x = len(str_cut)
    c = [i[:-x] for i in a]
    print(time.time()-b)
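One caveat before comparing timings: the three calls are not interchangeable in general. rstrip() treats its argument as a set of characters to strip, replace() removes every occurrence, and slicing removes a fixed number of characters whether or not they match. They agree on the test data above only because the kept part consists of digits. The sketch below illustrates the difference; strip_suffix is a hypothetical helper, and on Python 3.9 and later the built-in str.removesuffix() does the same job.

```python
def strip_suffix(s, suffix):
    """Remove `suffix` from the end of `s` only if `s` actually ends with it."""
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s

if __name__ == "__main__":
    s = "abczxyz"
    print(s.rstrip("xyz"))                 # strips all trailing x/y/z characters
    print(strip_suffix(s, "xyz"))          # strips exactly one trailing "xyz"
    print("xyz-1-xyz".replace("xyz", ""))  # removes every "xyz", not just the last
    print("12abc"[:-3])                    # slicing never checks what it removes
```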
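Comparing the results: when the part being removed is longer than the part being kept, rstrip() performs close to replace(); when the part being removed is shorter than the part being kept, rstrip() approaches direct slicing.
>>> test1(a*10, 'xxxyyyzzz')
0.2882061004638672
>>> test2(a*10, 'xxxyyyzzz')
0.2662053108215332
>>> test3(a*10, 'xxxyyyzzz')
0.16613411903381348
>>> test1(b*10, 'xyz')
0.2721879482269287
>>> test2(b*10, 'xyz')
0.1911303997039795
>>> test3(b*10, 'xyz')
0.1501011848449707
3. Use a set for in tests
I wrote two versions of a program with the same logic, yet their running times differed greatly. Tracing through them step by step, I found that the container type on the right-hand side of the in test was different.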
import time

a = range(0, 100000)
b = list(a)
c = set(a)

def test(a):
    t = time.time()
    c = 0
    for i in range(0, 100000, 13):
        if i in a:  # cost depends on the container type passed in
            c += 1
    print(c)
    print(time.time()-t)
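Timing the list b against the range object a shows a huge gap (like a set, a range answers in for integers without scanning its elements):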
>>> test(b)
7693
5.649996280670166
>>> test(a)
7693
0.0019681453704833984
Converting the list to a set once, before the loop, improves the running time considerably:
def test(a):
    t = time.time()
    c = 0
    a = set(a)  # one-time conversion, included in the measured time
    for i in range(0, 100000, 13):
        if i in a:
            c += 1
    print(c)
    print(time.time()-t)
>>> test(b)
7693
0.005988359451293945
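The gap follows from how each container answers in: a list is scanned element by element, while a set uses a hash lookup. A minimal sketch of the comparison using timeit is shown below; the helper name count_hits and the smaller size (10,000 instead of 100,000, to keep the run short) are assumptions, and the absolute times will differ by machine.

```python
import timeit

N = 10_000  # smaller than the 100,000 used above so the list case stays quick

def count_hits(container, stop=N, step=13):
    """Count how many multiples of `step` below `stop` are in `container`."""
    hits = 0
    for i in range(0, stop, step):
        if i in container:
            hits += 1
    return hits

if __name__ == "__main__":
    as_list = list(range(N))
    as_set = set(as_list)
    for name, container in (("list", as_list), ("set", as_set)):
        seconds = timeit.timeit(lambda: count_hits(container), number=10)
        print(f"{name:>4}: {seconds:.4f} s for 10 runs")
```

Building the set costs one pass over the list, so the conversion pays off when the same container is queried many times, as in the loop above.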
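4. Built-in max() is slow
def getmax(a, b):
    if a >= b:
        return a
    return b
Define a function that returns the larger of two values, then use the random module to build a data_list of length 100 ahead of time for testing (random itself is expensive, so generating values inside the timed loop would blur the comparison).
def main1():
    # data_list is assumed to be a pre-built list of 100 (a, b) pairs
    t = time.time()
    for a, b in data_list*10000:
        max(a, b)
    print(time.time()-t)

def main2():
    t = time.time()
    for a, b in data_list*10000:
        getmax(a, b)
    print(time.time()-t)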
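The custom function is noticeably faster than the built-in max(): in this run it takes about 37% less time (roughly 1.6 times as fast).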
>>> main1()
0.2231442928314209
>>> main2()
0.14011740684509277
The same holds when computing the maximum of three numbers.
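For reference, a self-contained version of this comparison might look like the sketch below. The original data_list is not shown, so its construction here (100 pairs of random integers between 0 and 1000) is an assumption, as are the helper name getmax3 and the repeat counts.

```python
import random
import timeit

def getmax(a, b):
    if a >= b:
        return a
    return b

def getmax3(a, b, c):
    """Maximum of three values, built from the two-argument version."""
    return getmax(getmax(a, b), c)

# Assumed test data: built once, outside the timed code.
data_list = [(random.randint(0, 1000), random.randint(0, 1000)) for _ in range(100)]

def run_builtin():
    for a, b in data_list:
        max(a, b)

def run_custom():
    for a, b in data_list:
        getmax(a, b)

if __name__ == "__main__":
    print("max()   :", timeit.timeit(run_builtin, number=10_000))
    print("getmax():", timeit.timeit(run_custom, number=10_000))
```

The ratio between the two timings may differ across interpreter versions, so it is worth rerunning on the target environment before relying on it.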
Summary
The above is based on personal experience; I hope it serves as a useful reference.