Computer Vision

Training computers to interpret and understand the visual world using digital images and videos

What is Computer Vision?

The goal of computer vision is not only to see, but also to process what is seen and provide useful results based on the observation. For example, a computer could reconstruct a 3D scene from 2D images, as is done in cars, and provide important data to the car and/or driver. Cars could be fitted with computer vision (CV) that identifies and distinguishes objects on and around the road, such as traffic lights, pedestrians and traffic signs, and acts accordingly.

The intelligent device could provide inputs to the driver or even make the car stop if there is a sudden obstacle on the road. When a human driver sees someone suddenly move into the path of the car, the driver must react instantly. In a split second, human vision has completed a complex task: identifying the object, processing the data and deciding what to do. The aim of computer vision is to enable computers to perform the same kinds of tasks as humans with the same efficiency.

Common Tools and Libraries


OpenCV has C++, Python, Java and MATLAB interfaces and supports Windows, among other operating systems. It is written natively in C++ and has a templated interface that works with STL containers.

Developer's Resource:


BoofCV is organized into several packages: image processing, features, geometric vision, calibration, recognition, visualize, and IO. It is a common alternative to OpenCV.

Developer's Resource:


SimpleCV is an open source framework that lets you work with the images or video streams that come from webcams, Kinects, FireWire and IP cameras, or mobile phones.

Developer's Resource:


Tesseract is an optical character recognition engine for various operating systems. It was written mostly in C, with later parts written in C++.

Developer's Resource:


TensorFlow gives you flexibility and control with features like the Keras Functional API and Model Subclassing API for creating complex topologies. For easy prototyping and fast debugging, use eager execution.
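To make "complex topologies" a little more concrete, here is a minimal sketch of a non-sequential model built with the Keras Functional API. It assumes TensorFlow 2.x is installed. The two-branch layout, layer sizes, input shape, and the `build_two_branch_classifier` name are all hypothetical choices made for illustration.

```python
import tensorflow as tf


def build_two_branch_classifier(input_shape=(28, 28, 1), num_classes=10):
    """Build a small image classifier whose two conv branches are merged."""
    inputs = tf.keras.Input(shape=input_shape)

    # Branch A looks at fine detail with 3x3 filters.
    branch_a = tf.keras.layers.Conv2D(
        8, 3, padding="same", activation="relu")(inputs)
    # Branch B looks at a wider neighbourhood with 5x5 filters.
    branch_b = tf.keras.layers.Conv2D(
        8, 5, padding="same", activation="relu")(inputs)

    # Merging branches is what a plain Sequential model cannot express.
    merged = tf.keras.layers.Concatenate()([branch_a, branch_b])
    pooled = tf.keras.layers.GlobalAveragePooling2D()(merged)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(pooled)

    return tf.keras.Model(inputs=inputs, outputs=outputs,
                          name="two_branch_classifier")


model = build_two_branch_classifier()

# With eager execution (the TensorFlow 2.x default), the model can be
# called directly on a tensor and the result inspected immediately.
probabilities = model(tf.zeros((2, 28, 28, 1)))
print(probabilities.shape)
```

The model is untrained, so the output probabilities are meaningless. The point is only the wiring of inputs, branches and outputs.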

Developer's Resource:

Microsoft Azure Computer Vision API

Microsoft Azure Computer Vision API is an out-of-the-box API that returns information about visual content found in an image, extracts recognized words from an image into machine-readable text, and analyzes video in near real time.

Developer's Resource:

100E Use Cases

  1. A med-tech company uses computer vision to assess chronic wounds from smartphone camera images
    1. Technologies Used: TensorFlow/MobileNets
  2. A healthcare company uses computer vision to perform medical fraud detection
    1. Technologies Used: TensorFlow/MobileNets
  3. A semiconductor company uses computer vision to optimize the placement of circuit components on a semiconductor chip
    1. Technologies Used: TensorFlow/MobileNets, OpenCV

Open Datasets


ImageNet

Image database organized according to the WordNet hierarchy


MNIST

Database of handwritten digits with a training set of 60,000 examples and a test set of 10,000 examples.
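Handwritten-digit sets of this kind are commonly distributed as IDX binary files. The sketch below shows how such files might be parsed with only the standard library. It assumes the usual IDX layout (big-endian 32-bit header fields, magic number 2051 for image files and 2049 for label files) and input bytes that have already been decompressed. The function names and the tiny synthetic sample are made up for this example.

```python
import struct

IMAGE_MAGIC = 2051  # assumed IDX magic number for image files
LABEL_MAGIC = 2049  # assumed IDX magic number for label files


def parse_idx_images(data: bytes):
    """Parse decompressed IDX image bytes into a list of 2D pixel grids."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected image magic number: {magic}")
    if len(data) != 16 + count * rows * cols:
        raise ValueError("file size does not match header")

    images = []
    offset = 16
    for _ in range(count):
        # Each image is stored row by row, one unsigned byte per pixel.
        grid = [list(data[offset + r * cols: offset + (r + 1) * cols])
                for r in range(rows)]
        images.append(grid)
        offset += rows * cols
    return images


def parse_idx_labels(data: bytes):
    """Parse decompressed IDX label bytes into a list of integer labels."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected label magic number: {magic}")
    if len(data) != 8 + count:
        raise ValueError("file size does not match header")
    return list(data[8:8 + count])


# Synthetic stand-in for real files: two 2x3 "images" and two labels.
sample_images = struct.pack(">IIII", IMAGE_MAGIC, 2, 2, 3) + bytes(range(12))
sample_labels = struct.pack(">II", LABEL_MAGIC, 2) + bytes([7, 3])

images = parse_idx_images(sample_images)
labels = parse_idx_labels(sample_labels)
print(images, labels)
```

For the real files, the bytes would typically be read through `gzip.open(path, "rb").read()` first, and the header would report 60,000 or 10,000 items of 28 by 28 pixels.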


CIFAR-10 and CIFAR-100

Database of labeled subsets of the 80 Million Tiny Images dataset

Google Open Images

Dataset of ~9M images annotated with image-level labels, object bounding boxes and visual relationships

Caltech 101

Database of pictures of objects belonging to 101 categories. There are about 40 to 800 images per category.


COCO

Large-scale object detection, segmentation, and captioning dataset with over 330K images and 1.5M object instances
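When working with object-detection annotations like these, a predicted box is often compared with a labelled box using intersection over union (IoU). The sketch below assumes boxes in `[x, y, width, height]` form, the layout COCO-style annotations use for bounding boxes. The `iou_xywh` name and the sample boxes are illustrative only.

```python
def iou_xywh(box_a, box_b):
    """Return intersection over union for two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping region (zero if boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = [0, 0, 10, 10]   # hypothetical labelled box
prediction = [5, 5, 10, 10]     # hypothetical detector output
print(round(iou_xywh(ground_truth, prediction), 4))
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap. Detection benchmarks typically count a prediction as correct only above some chosen IoU threshold.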

Related Articles

  1. Counting Road Traffic Capacity with OpenCV
    1. Link to article:
  2. Face Recognition with Python (OpenCV)
    1. Link to article:
  3. Image Segmentation Using Color Spaces in OpenCV
    1. Link to article: