numpy - How do I calculate percentiles with Python/NumPy?

I'm looking for something similar to Excel's percentile function.


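You might be interested in the SciPy Stats package. It has the percentile function you're after, along with many other statistical tools.

percentile() is available in numpy too.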


import numpy as np
a = np.array([1, 2, 3, 4, 5])
p = np.percentile(a, 50)  # 50th percentile, i.e. the median
print(p)
# 3.0
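Since the question mentions Excel: np.percentile interpolates linearly between neighbouring values by default, which as far as I know is the same scheme Excel's PERCENTILE / PERCENTILE.INC uses. It also accepts a sequence of percentiles, so you can get several in one call. A small sketch (the variable names are only for illustration):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# Pass a list of percentiles to get all of them in one call.
quartiles = np.percentile(a, [25, 50, 75])
print(quartiles)
```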

Here is a pure-Python implementation of the percentile function:


## {{{ http://code.activestate.com/recipes/511478/ (r1)
import math
import functools

def percentile(N, percent, key=lambda x: x):
    """
    Find the percentile of a list of values.

    @parameter N - is a list of values. Note N MUST BE already sorted.
    @parameter percent - a float value from 0.0 to 1.0.
    @parameter key - optional key function to compute value from each element of N.

    @return - the percentile of the values
    """
    if not N:
        return None
    k = (len(N) - 1) * percent   # fractional index into the sorted list
    f = math.floor(k)
    c = math.ceil(k)
    if f == c:
        return key(N[int(k)])
    # interpolate linearly between the two neighbouring elements
    d0 = key(N[int(f)]) * (c - k)
    d1 = key(N[int(c)]) * (k - f)
    return d0 + d1

# median is 50th percentile.
median = functools.partial(percentile, percent=0.5)
## end of http://code.activestate.com/recipes/511478/ }}}
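Note that the recipe expects input that is already sorted and a fraction from 0.0 to 1.0, whereas np.percentile takes unsorted data and a 0-100 scale. Below is a minimal sketch of the same linear interpolation that sorts first and takes 0-100. The name percentile_linear is made up for this example and is not part of the recipe.

```python
import math

def percentile_linear(values, q):
    """Return the q-th percentile (q from 0 to 100) by linear interpolation."""
    if not values:
        return None
    ordered = sorted(values)            # sort here instead of requiring sorted input
    k = (len(ordered) - 1) * q / 100    # fractional index into the sorted data
    f = math.floor(k)
    c = math.ceil(k)
    if f == c:                          # k lands exactly on an element
        return ordered[f]
    # Weight the two neighbours by their distance from k.
    return ordered[f] * (c - k) + ordered[c] * (k - f)

data = [154, 400, 1124, 82, 94, 108]
p95 = percentile_linear(data, 95)
median = percentile_linear(data, 50)
print(p95)     # 943.0
print(median)  # 131.0
```

For this data the 95th percentile falls at index 4.75 of the sorted list, so the result is 0.25 * 400 + 0.75 * 1124.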


import numpy as np
a = [154, 400, 1124, 82, 94, 108]
print(np.percentile(a, 95))  # gives the 95th percentile

You can use a simpler function.


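import math

def percentile(N, P):
    """
    Find the percentile of a list of values.

    @parameter N - A list of values. N must be sorted.
    @parameter P - A float value from 0.0 to 1.0.

    @return - The percentile of the values.
    """
    # floor(x + 1) rounds halves up, as Python 2's round(x + 0.5) did;
    # Python 3's round() rounds halves to even and would give different results.
    n = int(math.floor(P * len(N) + 1))
    return N[min(n, len(N)) - 1]  # clamp so that P=1.0 returns the last element

# A = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
# B = (15, 20, 35, 40, 50)
#
# print(percentile(A, P=0.3))
# 4
# print(percentile(A, P=0.8))
# 9
# print(percentile(B, P=0.3))
# 20
# print(percentile(B, P=0.8))
# 50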

If you would rather get the list value just below that one (or 0 when there is none), use the following:


import math

def percentile(N, P):
    # floor(x + 1) rounds halves up, as Python 2's round(x + 0.5) did
    n = int(math.floor(P * len(N) + 1))
    if n > 1:
        return N[n-2]
    else:
        return 0

Check out the scipy.stats module:


 scipy.stats.scoreatpercentile

To compute percentiles using only Python, without numpy:


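import math

def percentile(data, percent):
    """Nearest-rank percentile; percent is on a 0-100 scale."""
    ordered = sorted(data)
    rank = int(math.ceil(len(ordered) * percent / 100))
    return ordered[max(rank, 1) - 1]  # clamp so that percent=0 returns the minimum

mylist = [154, 400, 1124, 82, 94, 108]  # sample data, for illustration only

p5 = percentile(mylist, 5)
p25 = percentile(mylist, 25)
p50 = percentile(mylist, 50)
p75 = percentile(mylist, 75)
p95 = percentile(mylist, 95)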
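If you are on Python 3.8 or later, the standard library's statistics.quantiles is another numpy-free option. This is only a sketch with made-up sample data. With method="inclusive" it interpolates linearly between the minimum and maximum, which should agree with np.percentile's default, whereas the function above picks the nearest rank without interpolating.

```python
import statistics

data = [154, 400, 1124, 82, 94, 108]

# n=100 yields the 99 cut points for percentiles 1..99;
# method="inclusive" treats the data's min and max as the 0th and 100th percentiles.
cuts = statistics.quantiles(data, n=100, method="inclusive")

p25 = cuts[25 - 1]   # the cut point for percentile i sits at index i - 1
p50 = cuts[50 - 1]
p95 = cuts[95 - 1]
print(p25, p50, p95)
```

Note that the default method is "exclusive", which gives different values on small samples.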

...