## python – Multi-core OpenCV denoising

I have ~40,000 JPEG images from the Kaggle melanoma classification competition. I created the following functions to denoise the images:

```python
import os
import concurrent.futures

import cv2
import tqdm

# Denoising functions
def denoise_single_image(img_path):
    img = cv2.imread(f'../data/jpeg/{img_path}')
    # signature is (src, dst, h, templateWindowSize, searchWindowSize),
    # so the filter strength h must come after the dst placeholder
    dst = cv2.fastNlMeansDenoising(img, None, 10, 7, 21)
    cv2.imwrite(f'../processed_data/jpeg/{img_path}', dst)
    print(f'{img_path} denoised.')

def denoise(data):
    img_list = os.listdir(f'../data/jpeg/{data}')
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # executor.map is lazy: consume it with list() so the work
        # actually runs and tqdm can show progress
        list(tqdm.tqdm(
            executor.map(denoise_single_image,
                         (f'{data}/{img_path}' for img_path in img_list)),
            total=len(img_list)))
```


The denoise function uses concurrent.futures to map the denoise_single_image() function over the full list of images.

ProcessPoolExecutor() was chosen on the assumption that denoising is a CPU-bound task rather than an I/O-bound one.
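One low-effort speedup for a CPU-bound map over tens of thousands of small tasks is the `chunksize` parameter of `ProcessPoolExecutor.map`, which batches items per worker and reduces inter-process overhead. A minimal sketch, where `square` is a hypothetical stand-in for `denoise_single_image`:

```python
# Sketch: chunksize batches work items sent to each worker process,
# cutting pickling/IPC overhead when there are many small tasks.
import concurrent.futures

def square(x):
    # placeholder for a CPU-bound per-item task
    return x * x

def run(values, chunksize=32):
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # map is lazy: wrap in list() so all tasks complete here
        return list(executor.map(square, values, chunksize=chunksize))

if __name__ == '__main__':
    print(run(range(10)))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

A good starting point is a chunksize that divides the ~40,000 images into a few batches per worker; the best value is workload-dependent and worth measuring.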

As it stands now, this function takes hours to run. With a CUDA-enabled Nvidia GPU with 6 GB of VRAM, I'd like to optimize this further.

Is there any way to improve the speed of this function?

- Multi-core processing documentation
- OpenCV documentation

## python – How can I use 2 numpy arrays as a dataset for a denoising autoencoder, and further split them into train and test sets

I have 2 numpy arrays, one with clean data of shape 4000 × 1000 × 25 (4000 arrays, each 1000×25) and one with noisy data of the same shape, to be used for a denoising autoencoder problem.

I want to pair them up and store them in a TensorFlow dataset, or use any other approach that lets me map each sample to its counterpart:

clean(i) -> De-noising Autoencoder -> noisy(i)

I also want to implement a train/test split in a way that preserves this mapping.
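One way to keep the clean/noisy pairing intact through a split is to shuffle a single index array and apply it to both arrays. A minimal sketch, assuming the arrays are the `clean` and `noisy` described above (the function name and `test_frac` parameter are illustrative):

```python
# Sketch: split paired arrays while preserving the clean(i) <-> noisy(i)
# mapping, by permuting one shared index array.
import numpy as np

def paired_split(clean, noisy, test_frac=0.2, seed=0):
    assert clean.shape == noisy.shape, "pairs must have matching shapes"
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(clean))          # one shuffle for both arrays
    n_test = int(len(clean) * test_frac)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ((clean[train_idx], noisy[train_idx]),
            (clean[test_idx], noisy[test_idx]))
```

The resulting pairs can then be wrapped for training, e.g. with `tf.data.Dataset.from_tensor_slices((noisy_train, clean_train))`, since `from_tensor_slices` keeps corresponding slices of a tuple together.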

I’m sorry if this is too vague, I’m new to ML and python.

## machine learning – How to do training & inference with denoising autoencoders?
