This machine-learning method could assist with robotic scene understanding, image editing, or online recommendation systems. Given a single pixel selected by the user, it identifies every pixel in the image that shows the same material.
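The selection step can be pictured with a small sketch. It is not the researchers' model: it assumes, purely for illustration, that every pixel already has a feature vector describing its material, and it marks a pixel as "same material" when its vector is close to that of the pixel the user picked. The names `select_same_material`, `toy_features`, and the 0.9 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_same_material(features, query, threshold=0.9):
    """Return a boolean mask of pixels whose features resemble the query pixel's.

    features:  grid (list of rows) of per-pixel feature vectors. ASSUMPTION:
               these come from some upstream model; the source article does
               not describe how the real system represents materials.
    query:     (row, col) of the pixel the user selected.
    threshold: hypothetical similarity cut-off, chosen only for this toy.
    """
    row, col = query
    query_vector = features[row][col]
    return [
        [cosine_similarity(vector, query_vector) >= threshold for vector in feature_row]
        for feature_row in features
    ]


# Toy 2x3 "image" with two made-up materials: vectors near [1, 0] and near [0, 1].
toy_features = [
    [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]],
    [[0.9, 0.0], [0.0, 1.0], [0.1, 1.0]],
]

# The user clicks the top-left pixel; the mask marks pixels with similar features.
mask = select_same_material(toy_features, query=(0, 0))
```

Comparing directions of feature vectors rather than raw colors is one simple way to picture why shape, size, and brightness need not matter, but the real system's robustness comes from its trained model, not from this toy rule.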
The method is accurate even when objects have varying shapes and sizes, and the machine-learning model the researchers developed isn’t tricked by shadows or lighting conditions that can make the same material appear different.
Although they trained their model using only “synthetic” data, which are created by a computer that modifies 3D scenes to produce many varying images, the system works effectively on real indoor and outdoor scenes it has never seen before. The approach can also be used for videos; once the user identifies a pixel in the first frame, the model can identify objects made from the same material throughout the rest of the video.
During experiments, the researchers found that their model could predict regions of an image that contained the same material more accurately than other methods. When they measured how closely the prediction matched ground truth, meaning the actual areas of the image that are made of the same material, their model agreed with it with about 92 percent accuracy.
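To make the comparison with ground truth concrete, here is a minimal sketch of one common way to score a predicted region: the fraction of pixels on which the predicted mask and the ground-truth mask agree. The article does not say exactly which metric the researchers used, so treating "accuracy" as per-pixel agreement is an assumption, and the masks below are toy values rather than data from the study.

```python
def pixel_accuracy(predicted, truth):
    """Fraction of pixels where two same-sized binary masks agree.

    ASSUMPTION: per-pixel agreement is used here for illustration only;
    the source does not specify the researchers' exact metric.
    """
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same number of rows")
    total = 0
    matches = 0
    for predicted_row, truth_row in zip(predicted, truth):
        if len(predicted_row) != len(truth_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(predicted_row, truth_row):
            total += 1
            if p == t:
                matches += 1
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return matches / total


# Toy 3x4 masks: 1 marks "same material as the selected pixel", 0 marks anything else.
predicted_mask = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
truth_mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

# The masks disagree on 2 of the 12 pixels.
score = pixel_accuracy(predicted_mask, truth_mask)
```

A plain agreement score like this can look high when the material covers only a small part of the image, which is one reason overlap-based measures such as intersection over union are often reported as well.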
In the future, they want to enhance the model so it can better capture fine details of the objects in an image, which would boost the accuracy of their approach.
Source: MIT News