object - In Python, how do I determine the size of an object?

In C, you can get the size of an int or a char. I would like to know how to get the size of objects such as strings and integers in Python.


Just use the sys.getsizeof function defined in the sys module.

sys.getsizeof(object[, default])

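Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions, as it is implementation specific.

The default argument lets you define a value that is returned if the object type does not provide a means to retrieve the size and would otherwise cause a TypeError.

getsizeof calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.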
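The two behaviours described above (the default fallback and the garbage collector overhead) can be observed directly. This is a minimal sketch for CPython 3; the classes Sized and Unsized are hypothetical and exist only for illustration, and the exact overhead depends on the interpreter version and build.

```python
import sys


class Sized:
    """Hypothetical class that reports a fixed size of 100 bytes."""

    def __sizeof__(self):
        return 100


class Unsized:
    """Hypothetical class whose __sizeof__ raises TypeError."""

    def __sizeof__(self):
        raise TypeError("size not available")


# Extra bytes getsizeof adds on top of __sizeof__ for a GC-managed instance.
gc_overhead = sys.getsizeof(Sized()) - Sized().__sizeof__()

# An int is not tracked by the garbage collector, so nothing is added.
int_overhead = sys.getsizeof(1) - (1).__sizeof__()

# With a default, the TypeError is swallowed and the default comes back.
fallback = sys.getsizeof(Unsized(), -1)

print(gc_overhead, int_overhead, fallback)
```

Without the second argument, the call on Unsized() would propagate the TypeError instead of returning -1.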

Usage example, in Python 3.0:


>>> import sys
>>> x = 2
>>> sys.getsizeof(x)
14
>>> sys.getsizeof(sys.getsizeof)
32
>>> sys.getsizeof('this')
38
>>> sys.getsizeof('this also')
48

If you are on Python older than 2.6, sys.getsizeof is not available.

For numpy arrays, getsizeof does not work: for some reason it always returns 40 for me, as the following shows:


from pylab import *
from sys import getsizeof
A = rand(10)
B = rand(10000)

Then (in ipython):


In [64]: getsizeof(A)
Out[64]: 40

In [65]: getsizeof(B)
Out[65]: 40


In [66]: A.nbytes
Out[66]: 80

In [67]: B.nbytes
Out[67]: 80000


import sys

try:
    print(sys.getsizeof(object))
except AttributeError:
    print("sys.getsizeof exists only in Python >= 2.6")


Here is a quick script I wrote that lists the sizes of all variables:


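import sys

for i in dir():
    try:
        # numpy arrays report their data size through nbytes
        print(i, eval(i).nbytes)
    except AttributeError:
        # everything else: shallow size from sys.getsizeof
        print(i, sys.getsizeof(eval(i)))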
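The same idea can be written as a function that takes a namespace dictionary, which avoids eval. This is only a sketch for Python 3: the name variable_sizes is hypothetical, and it assumes that an integer nbytes attribute, where present, is the data size in bytes (numpy arrays and the built-in memoryview both expose one).

```python
import sys


def variable_sizes(namespace):
    """Map each public name in `namespace` to a size in bytes.

    Uses the object's `nbytes` attribute when it has one (numpy arrays
    and memoryview do), otherwise falls back to sys.getsizeof.
    """
    sizes = {}
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # skip private and dunder names
        nbytes = getattr(value, "nbytes", None)
        sizes[name] = nbytes if isinstance(nbytes, int) else sys.getsizeof(value)
    return sizes


example = {"view": memoryview(bytes(80)), "number": 2, "_hidden": "x"}
for name, size in sorted(variable_sizes(example).items()):
    print(name, size)
```

In an interactive session, variable_sizes(globals()) would cover the same names as the loop over dir().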

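How do I determine the size of an object in Python?

The answer "just use sys.getsizeof" is not a complete answer.

It works directly on built-in objects, but it does not account for what those objects contain, such as the numbers and strings held inside a container.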
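A short Python 3 sketch of that limitation: two lists with one element each report the same size from sys.getsizeof, no matter how large the element is. The variable names are illustrative only.

```python
import sys

small = ["x"]
large = ["x" * 1000000]

# Both lists hold a single reference, so the shallow sizes match ...
shallow_small = sys.getsizeof(small)
shallow_large = sys.getsizeof(large)

# ... even though the strings they point to differ by about a megabyte.
string_small = sys.getsizeof(small[0])
string_large = sys.getsizeof(large[0])

print(shallow_small, shallow_large)
print(string_small, string_large)
```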


Bytes  type        empty + scaling notes
24     int         NA
28     long        NA
37     str         + 1 byte per additional character
52     unicode     + 4 bytes per additional character
56     tuple       + 8 bytes per additional item
72     list        + 32 for first, 8 for each additional
232    set         sixth item increases to 744; 22nd, 2280; 86th, 8424
280    dict        sixth item increases to 1048; 22nd, 3352; 86th, 12568
64     class inst  has a __dict__ attr, same scaling as dict above
16     __slots__   class with slots has no dict, seems to store in
                   mutable tuple-like structure.
120    func def    doesn't include default args and other attrs
904    class def   has a proxy __dict__ structure for class attrs
104    old class   makes sense, less stuff, has real dict though.
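Figures like these depend on the interpreter version and build, so it can be worth measuring the scaling on your own system. The following is a rough Python 3 sketch; bytes_per_item is a hypothetical helper, and it only covers types whose size grows by a fixed step per item (str with ASCII text, bytes, tuple), not the stepwise growth of set and dict.

```python
import struct
import sys


def bytes_per_item(make, n=100):
    """Extra bytes reported when a container built by `make` grows from n to n + 1 items."""
    return sys.getsizeof(make(n + 1)) - sys.getsizeof(make(n))


pointer_size = struct.calcsize("P")  # 8 on a 64-bit build, 4 on a 32-bit one

print("str  :", bytes_per_item(lambda n: "a" * n))
print("bytes:", bytes_per_item(lambda n: b"a" * n))
print("tuple:", bytes_per_item(lambda n: (None,) * n), "pointer size:", pointer_size)
```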


To cover most of these types, I wrote this recursive function to estimate the size of most Python objects:


import sys
import numbers
import collections

def getsize(obj):
    # recursive function to dig out sizes of member objects:
    def inner(obj, _seen_ids = set()):
        obj_id = id(obj)
        if obj_id in _seen_ids:
            return 0
        _seen_ids.add(obj_id)
        size = sys.getsizeof(obj)
        if isinstance(obj, (basestring, numbers.Number, xrange)):
            pass # bypass remaining control flow and return
        elif isinstance(obj, (tuple, list, set, frozenset)):
            size += sum(inner(i) for i in obj)
        elif isinstance(obj, collections.Mapping) or hasattr(obj, 'iteritems'):
            size += sum(inner(k) + inner(v) for k, v in obj.iteritems())
        else:
            attr = getattr(obj, '__dict__', None)
            if attr is not None:
                size += inner(attr)
        return size
    return inner(obj)
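The function above relies on Python 2 names (basestring, xrange, iteritems, collections.Mapping). One possible Python 3 port is sketched below. It is an adaptation, not the author's code: it treats bytes and bytearray as leaf types as well, drops the iteritems duck-typing check in favour of collections.abc.Mapping, and keeps the seen-id set in an enclosing variable instead of a default argument.

```python
import sys
from collections.abc import Mapping
from numbers import Number


def getsize(obj):
    """Estimate the total size in bytes of obj and everything it contains."""
    seen_ids = set()  # fresh per call, so repeated calls do not interfere

    def inner(o):
        if id(o) in seen_ids:
            return 0  # already counted (shared reference or cycle)
        seen_ids.add(id(o))
        size = sys.getsizeof(o)
        if isinstance(o, (str, bytes, bytearray, Number, range)):
            pass  # leaf types: nothing further to follow
        elif isinstance(o, (tuple, list, set, frozenset)):
            size += sum(inner(i) for i in o)
        elif isinstance(o, Mapping):
            size += sum(inner(k) + inner(v) for k, v in o.items())
        else:
            attrs = getattr(o, "__dict__", None)
            if attrs is not None:
                size += inner(attrs)
        return size

    return inner(obj)


shared = "x" * 1000
print(getsize([shared, shared]))  # the shared string is counted once
print(getsize({"foo": "bar", "baz": "bar"}))
```

Like the original, this is recursive, so very deeply nested structures could hit the interpreter's recursion limit.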

I tested it rather casually (it really should be unit tested):


>>> getsize(['a', tuple('bcd'), Foo()])
344
>>> getsize(Foo())
16
>>> getsize(tuple('bcd'))
194
>>> getsize(['a', tuple('bcd'), Foo(), {'foo': 'bar', 'baz': 'bar'}])
752
>>> getsize({'foo': 'bar', 'baz': 'bar'})
400
>>> getsize({})
280
>>> getsize({'foo':'bar'})
360
>>> getsize('foo')
40
>>> class Bar():
...     def baz():
...         pass
>>> getsize(Bar())
352
>>> getsize(Bar().__dict__)
280
>>> sys.getsizeof(Bar())
72
>>> getsize(Bar.__dict__)
872
>>> sys.getsizeof(Bar.__dict__)
280

It breaks down somewhat on class definitions and function definitions, because it does not trace all of their attributes.
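For functions, the gap can be made visible by adding a few of the untracked attributes by hand. The sketch below targets Python 3; function_size is a hypothetical helper that follows only __code__, __defaults__ and __kwdefaults__, so it is still an underestimate.

```python
import sys


def function_size(func):
    """Shallow size of a function plus its code object and default values.

    Hypothetical helper: follows only __code__, __defaults__ and
    __kwdefaults__, and ignores closures, globals and annotations.
    """
    size = sys.getsizeof(func) + sys.getsizeof(func.__code__)
    for default in func.__defaults__ or ():
        size += sys.getsizeof(default)
    for default in (func.__kwdefaults__ or {}).values():
        size += sys.getsizeof(default)
    return size


def plain(x):
    return x


def with_default(x, table="y" * 10000):
    return x


# The shallow sizes are identical; only the helper sees the large default.
print(sys.getsizeof(plain), sys.getsizeof(with_default))
print(function_size(plain), function_size(with_default))
```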

...