I have a facial recognition algorithm that compares two images A and B and returns the probability that they match.
I also have 50,000 images and would like to sort these images into groups.
Here is the immediate way I thought to do this:
- Start with image 0 and compare it to the remaining 49,999 images. Store the similarities in a table.
- Move on to image 1 and compare it to the remaining 49,998 images. Store the similarities in a table.
- …and so on, until every pair has been compared.
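The brute-force pass above could be sketched like this (a minimal sketch: `match_probability` is a hypothetical stand-in for the recognition algorithm, and the threshold value is an assumption):

```python
def match_probability(a, b):
    # Hypothetical wrapper around the facial recognition algorithm;
    # returns the probability that images a and b show the same face.
    return 0.0  # placeholder

def build_verified_list(images, threshold=0.9):
    # Compare every image to every later image (the upper triangle of
    # the pair matrix) and record which pairs exceed the threshold.
    matches = {i: [] for i in range(len(images))}
    for i in range(len(images)):
        for j in range(i + 1, len(images)):
            if match_probability(images[i], images[j]) >= threshold:
                matches[i].append(j)
                matches[j].append(i)
    return matches
```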
At the end of all this I'm left with a Verified_list of matching images, which I can basically feed into a graph to combine them. Verified_list is the same length as the number of images, and each entry contains references to all the images that the corresponding image matches (so the entry for image_1 lists every image that image_1 matches). The graph then combines the entries into connected groups; in my example, this would indicate that I have two groups.
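Combining the Verified_list entries into groups is exactly a connected-components computation on the match graph. A minimal breadth-first-search sketch, assuming Verified_list is a list of lists of indices (that shape is my assumption):

```python
from collections import deque

def group_images(verified_list):
    """verified_list[i] holds the indices of images that image i matches.
    Returns the connected components, i.e. the groups."""
    seen = set()
    groups = []
    for start in range(len(verified_list)):
        if start in seen:
            continue
        component, queue = [], deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in verified_list[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(component))
    return groups
```

For instance, with five images where 0, 1 and 2 all match each other and 3 matches 4, `group_images([[1, 2], [0, 2], [0, 1], [4], [3]])` yields the two groups `[[0, 1, 2], [3, 4]]`.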
Obviously, the amount of processing is huge; I think it comes to C(50000, 2) comparisons, a huge number!
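For reference, the pair count works out as follows:

```python
import math

# Number of unordered pairs among 50,000 images: n * (n - 1) / 2
pairs = math.comb(50000, 2)
print(pairs)  # 1,249,975,000 comparisons
```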
I would like to add logic to improve that. I can see roughly what needs to be done, but I don't know how to implement it. Basically, I just want to skip as many comparisons as possible without reducing the overall quality of the grouping.
Say I get to image 4. I can loop over Verified_list to check whether image_4 already matches image_1, but it seems that by the time I reach image 39,543, I'd have to check all 39,542 earlier Verified_list entries before knowing whether image_39543 has already been assigned to a group or not.
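That membership check doesn't have to scan all the earlier entries: a disjoint-set (union-find) structure answers "are these two images already in the same group?" in near-constant time. A small sketch (the class and its use here are my own illustration, not part of the original code):

```python
class DisjointSet:
    def __init__(self, n):
        # Each image starts in its own group.
        self.parent = list(range(n))

    def find(self, x):
        # Walk up to the group's representative, with path halving.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        # Merge the groups containing a and b.
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

ds = DisjointSet(5)
ds.union(0, 1)  # image_0 matches image_1
ds.union(1, 2)  # image_1 matches image_2
# Is image_4 already grouped with image_1? One lookup, no scanning:
print(ds.find(4) == ds.find(1))
```

Whenever a comparison says two images match, you `union` them; before comparing a pair, you can `find` both and skip the comparison if they are already in the same group.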
I get the feeling I'm overcomplicating this question… Is there a name for this kind of problem? Is there a better-known way to do it?
There is also the problem that the recognition algorithm is not 100% accurate. That is, it might say img_1 matches img_2 and img_5, but that img_5 matches only img_1 and not img_2.