Computer vision is software that lets a computer recognize objects in images (and in videos, which are simply sequences of images). For example, computer vision can determine that a picture shows a cat, like the one on this coloring page. How does it do this? With a virtual neural network: loosely like the one in our brain, only shaped like a pyramid with a pointed top. Imagine that a cat photo lies under the base of the pyramid. If you look at the photo very closely, you'll see that it consists of millions of dots of different colors. The neurons at the bottom of the pyramid read the colors of the dots below them and pass signals to neurons higher up. Those neurons pass signals to the next level, and so on, until the signals reach the single neuron at the very top. If this neuron activates, there is a cat in the photo. If it doesn't, there isn't.

Of course, at first our neural network will make a lot of mistakes. But if after each mistake we slightly adjust the connections between the neurons, and repeat this a million times (showing the network a million pictures of different cats, along with pictures that have no cat in them), it will learn to recognize cats in photographs very well.
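For the curious, here is a minimal sketch of that pyramid in Python. Everything in it is a made-up toy rather than a real vision system: the "photos" have only four dots, the pyramid has just two middle neurons and one top neuron, and the names (PyramidNet, train_step), the learning rate and the number of repetitions are assumptions chosen for illustration. A "cat" here simply means a picture whose first two dots are bright.

```python
import math
import random


def sigmoid(x):
    """Squash any number into the range 0..1 (how strongly a neuron fires)."""
    return 1.0 / (1.0 + math.exp(-x))


class PyramidNet:
    """Toy pyramid: 4 dots at the bottom -> 2 middle neurons -> 1 top neuron."""

    def __init__(self, n_pixels=4, n_middle=2, seed=0):
        rng = random.Random(seed)
        # Connections start out random, so the first guesses are poor.
        self.w_mid = [[rng.uniform(-1, 1) for _ in range(n_pixels)]
                      for _ in range(n_middle)]
        self.b_mid = [0.0] * n_middle
        self.w_top = [rng.uniform(-1, 1) for _ in range(n_middle)]
        self.b_top = 0.0

    def forward(self, pixels):
        """Pass signals from the dots up to the top neuron."""
        mid = [sigmoid(sum(w * p for w, p in zip(ws, pixels)) + b)
               for ws, b in zip(self.w_mid, self.b_mid)]
        top = sigmoid(sum(w * m for w, m in zip(self.w_top, mid)) + self.b_top)
        return mid, top

    def train_step(self, pixels, is_cat, lr=0.5):
        """Slightly adjust every connection to shrink the current mistake."""
        mid, top = self.forward(pixels)
        delta_top = top - is_cat  # how wrong the top neuron was
        for j, m in enumerate(mid):
            # Share of the mistake blamed on middle neuron j
            # (computed before its outgoing connection is changed).
            delta_mid = delta_top * self.w_top[j] * m * (1.0 - m)
            self.w_top[j] -= lr * delta_top * m
            for i, p in enumerate(pixels):
                self.w_mid[j][i] -= lr * delta_mid * p
            self.b_mid[j] -= lr * delta_mid
        self.b_top -= lr * delta_top


# Hypothetical training pictures: four dot brightnesses each, from 0 to 1.
cats = [[1.0, 1.0, 0.0, 0.0], [0.9, 0.8, 0.1, 0.2], [0.8, 1.0, 0.0, 0.1]]
others = [[0.0, 0.0, 1.0, 1.0], [0.1, 0.2, 0.9, 0.8], [0.2, 0.0, 1.0, 0.9]]
examples = [(img, 1.0) for img in cats] + [(img, 0.0) for img in others]


def mean_error(net):
    """Average distance between the top neuron's answer and the truth."""
    return sum(abs(net.forward(img)[1] - label)
               for img, label in examples) / len(examples)


net = PyramidNet()
error_before = mean_error(net)
for _ in range(1000):  # a real network needs far more pictures and passes
    for img, label in examples:
        net.train_step(img, label)
error_after = mean_error(net)

print("average mistake before training:", round(error_before, 3))
print("average mistake after training: ", round(error_after, 3))
```

The top neuron's output is a number between 0 and 1. Treating anything above 0.5 as "cat" corresponds to the neuron "activating" in the description above. Real networks work on the same principle, just with millions of dots and many more levels.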











