Unable to cast Python instance to C++ type with TensorRT 8.4 when running INT8 calibration on an A100 GPU #3871
@rmccorm4 Hi, I wrote the get_batch() function following your instructions in issue https://github.com/NVIDIA/TensorRT/issues/688, but it still fails with RuntimeError: Unable to cast Python instance to C++ type (compile in debug mode for details). Could you please help me check what's wrong? Thank you very much!
+1
When I try to run INT8 quantization in Python, it always gives the following error during the calibration procedure:
[05/16/2024-18:22:28] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2904, GPU 74855 (MiB)
[05/16/2024-18:22:28] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2904, GPU 74863 (MiB)
[05/16/2024-18:22:28] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 2904, GPU 74839 (MiB)
[05/16/2024-18:22:28] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2904, GPU 74847 (MiB)
[05/16/2024-18:22:28] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +16, now: CPU 130, GPU 272 (MiB)
[05/16/2024-18:22:28] [TRT] [I] Starting Calibration.
[ERROR] Exception caught in get_batch(): Unable to cast Python instance to C++ type (compile in debug mode for details)
[05/16/2024-18:22:30] [TRT] [I] Post Processing Calibration data in 2.704e-06 seconds.
[05/16/2024-18:22:30] [TRT] [E] 1: Unexpected exception _Map_base::at
Failed to create the engine
How can I fix it? The get_batch() function in my calibrator instance is implemented like this:
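The poster's snippet did not survive in this page, but this error typically means get_batch() returned something pybind11 cannot convert to a list of device pointers, e.g. a numpy array, a pycuda DeviceAllocation object, or a bare pointer outside a list. TensorRT expects either None (data exhausted) or a list of plain Python ints, one device pointer per input name. Below is a minimal sketch of that contract; the class and attribute names are illustrative, and the host buffer stands in for a CUDA allocation (real code would subclass trt.IInt8EntropyCalibrator2 and use pycuda or cuda-python):

```python
import ctypes

class CalibratorSketch:
    """Illustrates the return-type contract of get_batch() in TensorRT's
    Python API: a list of plain int pointers, or None when done.

    Real code would inherit from trt.IInt8EntropyCalibrator2 and hold a
    CUDA device allocation instead of the host buffer used here.
    """

    def __init__(self, n_batches, batch_bytes):
        self.n_batches = n_batches
        self.i = 0
        # Host buffer standing in for a CUDA device allocation in this sketch.
        self.buf = ctypes.create_string_buffer(batch_bytes)

    def get_batch(self, names):
        if self.i >= self.n_batches:
            return None  # signals that calibration data is exhausted
        self.i += 1
        # Real code would copy the next batch to the device here, e.g.
        #   cuda.memcpy_htod(self.device_input, batch)
        # then return the device pointer(s) as ints, one per input name:
        #   return [int(self.device_input)]
        return [ctypes.addressof(self.buf)]
```

With pycuda, the crucial detail is wrapping the allocation in int() and in a list: `return [int(self.device_input)]`. Returning `self.device_input` or the numpy batch itself produces exactly the "Unable to cast Python instance to C++ type" exception shown in the log above.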