In computer vision, segmentation refers to the process of partitioning a digital image into multiple segments (group of meaningful pixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics. Each of the pixels in a region are similar with respect to some characteristic or computed property, such as color, intensity, or texture. Adjacent regions are significantly different with respect to the same characteristic(s). Some of the practical applications of image segmentation are:
- Medical Imaging
- Locate tumors and other pathologies
- Measure tissue volumes
- Computer-guided surgery
- Treatment planning
- Study of anatomical structure
- Locate objects in satellite images (roads, forests, etc.)
- Face recognition
- Fingerprint recognition
- Traffic control systems
- Brake light detection
- Machine vision
Several general-purpose algorithms and techniques have been developed for image segmentation. Since there is no general solution to the image segmentation problem, these techniques often have to be combined with domain knowledge in order to effectively solve an image segmentation problem for a problem domain.
Fixed circle (FC) (Eisen, 1999)
Circular mask with constant radius
Adaptive circle (AC) (Buhler et al., 2000)
Circular mask with independently estimated radius for each spot
Seeded region growing (SRG) (Yang et al., 2002)
Segmentation with seeded region growing segmentation algorithm
Mann–Whitney (MW) (Chen et al., 1997)
Computing segmentation threshold iteratively with Mann–Whitney test
k-means (KM) (Bozinov and Rahnenfu¨hrer, 2002)
k-means clustering of pixels
Hybrid k-means (HKM) (Rahnenfu¨hrer and Bozinov, 2004)
k-means clustering of pixels and removing outliers with mask matching
EM (Expectation Maximization)
Markov random field (MRF) (Demirkaya et al., 2005)
MRF modeling of pixels
Model-based segmentation (MBS) (Li et al., 2005)
Model-based clustering of pixels and extraction of connected components
Matarray (MA) (Wang et al., 2001)
Iterative modification of target mask based on spatial and intensity information
- Clustering methods
- Partition the input points into k initial sets, either randomly or based on some heuristic, then calculates the mean point, or centroid, of each set as K cluster centers
- Assign each pixel in the image to the cluster that minimizes the distance between the pixel and the cluster center
- Re-compute the cluster centers by averaging all of the pixels in the cluster
- Repeat steps 2 and 3 until convergence is attained (e.g. no pixels change clusters)
In this case, distance is the squared or absolute difference between a pixel and a cluster center. The difference is typically based on pixel color, intensity, texture, and location, or a weighted combination of these factors.
This algorithm is guaranteed to converge, but it may not return the optimal solution. The quality of the solution depends on the initial set of clusters and the value of K.
In statistics and machine learning, the k-means algorithm is clustering algorithm to partition n objects into K clusters, where K < n. It is similar to the expectation-maximization algorithm for Gaussians mixtures in that both attempt to find the centers of natural clusters in the data. In terms of performance the algorithm is not guaranteed to return a global optimum. The quality of the final solution depends largely on the initial set of clusters. Since the algorithm is extremely fast, a common method is to run the algorithm several times and return the best clustering found. A drawback of the k-means algorithm is that the number of clusters K is an input parameter. An inappropriate choice of K may yield poor results. The algorithm also assumes that the variance is an appropriate measure of cluster scatter.
- Histogram-based methods
Histogram-based methods are very efficient when compared to other image segmentation methods because they typically require only one pass through the pixels. In this technique, a histogram is computed from all of the pixels in the image, and the peaks and valleys in the histogram are used to locate the clusters in the image. Color or intensity can be used as the measure.
A refinement of this technique is to recursively apply the histogram-seeking method to clusters in the image in order to divide them into smaller clusters. This is repeated with smaller and smaller clusters until no more clusters are formed.
One disadvantage of the histogram-seeking method is that it may be difficult to identify significant peaks and valleys in the image. In this technique of image classification distance metric and integrated region matching are familiar.
Histogram-based approaches can also be quickly adapted to occur over multiple frames, while maintaining their single pass efficiency. The histogram can also be applied on a per pixel basis where the information result are used to determine the most frequent color for the pixel location. This approach segments based on active objects and a static environment, resulting in a different type of segmentation useful in Video tracking.
- Edge detection
Edge detection is a well-developed field on its own within image processing. Region boundaries and edges are closely related, since there is often a sharp adjustment in intensity at the region boundaries. Edge detection techniques have therefore been used as the base of another segmentation technique.
The challenge is, the edges identified by edge detection are often disconnected. To segment an object from an image however, one needs closed region boundaries.
- Region growing methods
The first region growing method was the seeded region growing method. This method takes a set of seeds as input along with the image. The seeds mark each of the objects to be segmented. The regions are iteratively grown by comparing all unallocated neighboring pixels to the regions. The difference between a pixel's intensity value and the region's mean, δ, is used as a measure of similarity. The pixel with the smallest difference measured this way is allocated to the respective region. This process continues until all pixels are allocated to a region.
Drawback is seeded region growing requires seeds as additional input. The segmentation results are dependent on the choice of seeds. Noise in the image can cause the seeds to be poorly placed.
Unseeded region growing is a modified algorithm that doesn't require explicit seeds. It starts off with a single region A1 – the pixel chosen here does not significantly influence final segmentation. At each iteration it considers the neighboring pixels in the same way as seeded region growing. It differs from seeded region growing in that if the minimum δ is less than a predefined threshold T then it is added to the respective region Aj. If not, then the pixel is considered significantly different from all current regions Ai and a new region An + 1 is created with this pixel.
- Graph partitioning methods
Graph partitioning methods can effectively be used for image segmentation. In these methods, the image is modeled as a weighted, undirected graph. Usually a pixel or a group of pixels are associated with nodes and edge weights define the (dis)similarity between the neighborhood pixels. The graph (image) is then partitioned according to a criterion designed to model "good" clusters. Each partition of the nodes (pixels) output from these algorithms are considered an object segment in the image. Some popular algorithms of this category are normalized cuts, random walker, minimum cut, isoperimetric partitioning and minimum spanning tree-based segmentation.
- Watershed transformation
The watershed transformation considers the gradient magnitude of an image as a topographic surface. Pixels having the highest gradient magnitude intensities (GMIs) correspond to watershed lines, which represent the region boundaries. Water placed on any pixel enclosed by a common watershed line flows downhill to a common local intensity minimum (LIM). Pixels draining to a common minimum form a catch basin, which represents a segment.
- Model based segmentation
The central assumption of such an approach is that structures of interest/organs have a repetitive form of geometry. Therefore, one can seek for a probabilistic model towards explaining the variation of the shape of the organ and then when segmenting an image impose constraints using this model as prior. Such a task involves (i) registration of the training examples to a common pose, (ii) probabilistic representation of the variation of the registered samples, and (iii) statistical inference between the model and the image. State of the art methods in the literature for knowledge-based segmentation involve active shape and appearance models, active contours and deformable templates and level-set based methods.
- Neural networks segmentation
Neural Network segmentation relies on processing small areas of an image using an artificial neural network or a set of neural networks. After such processing the decision-making mechanism marks the areas of an image accordingly to the category recognized by the neural network. A type of network designed especially for this is the Kohonen map.
Pulse-Coupled Neural Networks (PCNNs) are neural models proposed by modeling a cat's visual cortex and developed for high-performance biomimetic image processing. In 1989, Eckhorn introduced a neural model to emulate the mechanism of cat's visual cortex. The Eckhorn model provided a simple and effective tool for studying small mammal's visual cortex, and was soon recognized as having significant application potential in image processing. In 1994, the Eckhorn model was adapted to be an image processing algorithm by Johnson, who termed this algorithm as Pulse-Coupled Neural Network. Over the past decade, PCNNs have been utilized for a variety of image processing applications, including: image segmentation, feature generation, face extraction, motion detection, region growing, noise reduction, etc.
A PCNN is a two-dimensional neural network. Each neuron in the network corresponds to one pixel in an input image, receiving its corresponding pixel's color information (e.g. intensity) as an external stimulus. Each neuron also connects with its neighboring neurons, receiving local stimuli from them. The external and local stimuli are combined in an internal activation system, which accumulates the stimuli until it exceeds a dynamic threshold, resulting in a pulse output. Through iterative computation, PCNN neurons produce temporal series of pulse outputs. The temporal series of pulse outputs contain information of input images and can be utilized for various image processing applications, such as image segmentation and feature generation. Compared with conventional image processing means, PCNNs have several significant merits, including robustness against noise, independence of geometric variations in input patterns, capability of bridging minor intensity variations in input patterns, etc.
- Open source software
Several open source software packages are available for performing image segmentation
- ITK - Insight Segmentation and Registration Toolkit (Open Source)
- MITK has a program module for manual segmentation
- OpenCV is a computer vision library originally developed by Intel.
- Fiji - Fiji is just ImageJ, an image processing package which includes different segmentation plug-ins.
- AForge.NET - an open source C# framework.