Basically, MemuConvert is an image processing method whose aim is to reduce the labeling effort in object detection.
Why I Developed the Method?
I joined a competition with a team, and our task was to detect cars in data captured by a drone. We first tried the task with YOLOv3 and its own dataset, but this failed because the dataset was inadequate. We therefore understood that we had to train an AI from scratch.
In object detection there are three steps: collecting the data, labeling it, and training. The main problem was the large amount of labeled data needed. There are three solutions: training an AI with another AI, using services such as scale.ai, or using algorithms. We did not have another AI to do the training, and we did not want to spend money, because we wanted to make a difference in the competition. Therefore, we looked into algorithms, but the problem was the same: the process still depends on manpower, because a new threshold has to be defined for each image. For this reason, I had to fix the threshold.
There are two types of data that can be extracted from an image: colors and shapes. Each has its own problems. For instance, shape algorithms cannot find most objects, such as faces or cars, because those objects are sums of other elements: we call something a "car" if, simply, it has a body and tires. Since most objects are inferential results, shape algorithms cannot find the object as a whole; they find each component one by one. Sometimes they also find artificial features such as the skyline. Thus, the new method should be based on colors, but images usually contain undesired pixels in the desired colors.
In A1, the building and the taxi have the same color palette, so to a color algorithm they are the same object. Therefore, I had to get rid of those extra colors. The method proceeds through the following steps:
- To get a color from the original image for drawing bounding boxes.
- To divide the image into rectangles.
- Sponge test
- To find the dominant color of each rectangle.
- To rebuild each rectangle with its own dominant color.
- To create a new image from those rectangles.
- To transform the image into black and white (the desired color becomes white; undesired colors become black).
- To blur the image to get rid of noise.
- To draw the bounding boxes.
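Under the assumption of an RGB image given as a NumPy array, the steps above can be sketched roughly as follows. This is only an illustration, not the original implementation: the function names, the `k`, `tolerance`, and `blur_radius` parameters, and the restriction to a single bounding box are my own simplifying choices.

```python
import numpy as np

def dominant_color(block):
    # "Sponge test": the most frequent color in the rectangle wins.
    pixels = block.reshape(-1, block.shape[-1])
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    return colors[counts.argmax()]

def memu_convert(image, target_color, k=4, tolerance=40, blur_radius=1):
    """Divide the image into k x k rectangles, rebuild each with its dominant
    color, threshold to black/white, blur away noise, and return one bounding
    box (x_min, y_min, x_max, y_max), or None if nothing matches."""
    h, w, _ = image.shape
    bh, bw = h // k, w // k
    rebuilt = image.copy()
    for by in range(k):
        for bx in range(k):
            ys = slice(by * bh, (by + 1) * bh)
            xs = slice(bx * bw, (bx + 1) * bw)
            rebuilt[ys, xs] = dominant_color(image[ys, xs])
    # Black/white step: white where the rebuilt color is near the target color.
    dist = np.abs(rebuilt.astype(int) - np.asarray(target_color)).sum(axis=-1)
    mask = (dist <= tolerance).astype(float)
    # Simple box blur, then re-threshold to drop small stray rectangles.
    if blur_radius:
        pad = np.pad(mask, blur_radius, mode="edge")
        acc = np.zeros_like(mask)
        for dy in range(2 * blur_radius + 1):
            for dx in range(2 * blur_radius + 1):
                acc += pad[dy:dy + h, dx:dx + w]
        mask = acc / (2 * blur_radius + 1) ** 2 > 0.5
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max())
```

For example, a synthetic 16×16 blue image with a red 8×8 patch in the top-left corner yields a box around that patch when the target color is red.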
The sponge test is a physical analogy for what I do to each rectangle of the image. Imagine a sponge with a single red dot: if you squeeze that sponge, you get rid of the redness. In the same way, squeezing out the minority colors of a rectangle leaves only its dominant color, which makes it possible to rebuild the rectangle with that dominant color.
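A minimal way to express the sponge test in code (assuming NumPy and a rectangle given as an H×W×3 RGB array; `sponge_test` is my own illustrative name) is to count each distinct color and keep the most frequent one:

```python
import numpy as np

def sponge_test(block):
    """Squeeze out minority colors: return the most frequent color in the block."""
    pixels = block.reshape(-1, block.shape[-1])      # flatten to a list of colors
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    return tuple(colors[counts.argmax()])            # the dominant color
```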
Kimage: the number of rectangles the image is divided into for the sponge test.
Kxcolor: the dominant color of rectangle x.
In the beginning, I fixed Kimage to 1024, but for low-quality images this does not work, so I fixed Kimage to the x length (width) of the image. However, some images cannot be divided into equal rectangles, so some rectangles overlap each other. To get rid of the artifacts from those overlapping rectangles, the image is blurred.
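The blur step can be sketched as a plain box filter followed by re-thresholding; the function name, radius, and cutoff below are my own assumptions, used only to illustrate the idea that an isolated stray rectangle is averaged away while large regions survive.

```python
import numpy as np

def blur_and_rethreshold(mask, radius=1, cutoff=0.5):
    """Box-blur a binary mask, then re-threshold: small stray rectangles
    fall below the cutoff and disappear, while large regions remain."""
    h, w = mask.shape
    pad = np.pad(mask.astype(float), radius, mode="edge")
    acc = np.zeros((h, w))
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            acc += pad[dy:dy + h, dx:dx + w]
    return acc / (2 * radius + 1) ** 2 > cutoff
```

For instance, a mask containing a 4×4 white region and one isolated white pixel keeps the region but loses the isolated pixel after one pass.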
All of the images have the same threshold!
The method makes it possible to train an object detection AI quickly, because users do not have to change the threshold: it is a general algorithm that labels all of the images with a single threshold. However, as a consequence of being a general algorithm, it does not label every image precisely.