It can be said that the greatest moment in the evolution of modern man was the invention of tools. Our results show that these approaches can be successfully applied to image databases with large defects.
Contributions of the thesis
Another drawback of a fully automated classification system is that bringing human knowledge into the loop can be difficult. In CBIR the user wants to find a certain type of image from a possibly very large database.
Outline of the thesis
The system then tries to find the target image using only the information obtained from the images themselves, without prior annotations.
Author’s contributions to publications
- Image acquisition
- Feature extraction
- Training a CBIR system
The most common example is infrared imaging, which can be used to "see" the temperatures of objects. Therefore we have to use lower-level descriptors that give us the statistical properties of the image and its different regions (Sonka et al., 1999).
A word about difficulty
Goals of surface inspection
The benefits of surface inspection and quality assurance in general can be roughly divided into three different cases. Of course, there are several other ways to combine inspection tasks, see for example (Newman and Jain, 1995). If a factory is supposed to produce red balls and produces blue balls instead, there is usually something wrong with the system.
For example, if we notice that the amount of defects suddenly doubles or triples, there is probably something wrong with the production process. If it is known from experience that major production failures are usually followed by certain types of defects, preventive measures can be taken whenever these defects are detected.
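The simple monitoring rule described above can be sketched in code. This is an illustrative example only; the window size and threshold factor are our assumptions, not parameters from the thesis.

```python
# Hypothetical sketch: flag a production problem when the latest defect
# count jumps well above its recent baseline. `window` and `factor` are
# illustrative assumptions.

def defect_rate_alarm(counts, window=5, factor=2.0):
    """Return True if the latest defect count is at least `factor`
    times the mean of the preceding `window` counts."""
    if len(counts) <= window:
        return False
    baseline = sum(counts[-window - 1:-1]) / window
    return counts[-1] >= factor * baseline
```

For example, a jump from a baseline of about five defects per period to fourteen would raise the alarm, while normal fluctuation would not.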
Problems of surface inspection
A proposed solution
An experienced person has a better understanding of the whole system than a classifier trained with a bunch of prototype vectors. We propose that content-based image retrieval (CBIR) can be applied to the problem of visual quality inspection. Second, we want to give the machine operator a good overview of the overall state of the system.
Consider, for example, an operator who has a number of defect images that he considers serious. He can combine this information with his knowledge of the paper machine to refine the process, eliminating such defects in the future.
Content-based image retrieval
Applications and uses
Some existing CBIR systems
One of the first true CBIR systems was QBIC (Flickner et al., 1995), which was developed by IBM in the early 1990s. The basic idea of Photobook (Pentland et al., 1994) is that there should be a set of semantic features that describe an image, similar to how letters form words and sentences. PicHunter (Cox et al., 2000) improves the CBIR process by applying a strict Bayesian framework to the problem.
Those interested in the implementation details of a CBIR system should examine VIPER (Squire et al., 1999), whose entire source code is available under the GNU GPL (Free Software Foundation, 1991).
It selects those images that are mapped to positive nodes of the map and are as far away from negative areas as possible. PicSOM can also combine results from several different TS-SOMs simply by selecting those images with the best overall performance. The selected images are then displayed to the user, who again selects those similar to what they are looking for.
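The selection step described above can be illustrated with a small sketch. This is not the actual PicSOM code; the scoring scheme, the Manhattan grid distance, and all names are our assumptions, intended only to show how positive and negative selections on a map translate into scores.

```python
# Illustrative sketch of PicSOM-style relevance scoring: images marked
# relevant add positive weight to their map node, rejected images add
# negative weight, and the weight is spread to nearby grid nodes.

def node_scores(grid_w, grid_h, positives, negatives, spread=1):
    """positives/negatives: lists of (x, y) node coordinates on the map."""
    scores = {(x, y): 0.0 for x in range(grid_w) for y in range(grid_h)}
    for marks, sign in ((positives, +1.0), (negatives, -1.0)):
        for (mx, my) in marks:
            for (x, y) in scores:
                d = abs(x - mx) + abs(y - my)  # distance along the grid
                if d <= spread:
                    scores[(x, y)] += sign / (1 + d)
    return scores

def best_nodes(scores, k=3):
    """Nodes with the highest scores; their images are shown next."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Combining several feature maps then amounts to summing the per-map scores of each image before ranking.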
Content-based retrieval results with PicSOM
- Data sets
- Used features
- Adapting to user goals
- Feature pruning
- Retrieval efficiency
The region-based (RS) descriptor uses a set of 35 Angular Radial Transform (ART) coefficients that are calculated within a disk centered at the center of the Y channel of the image. One of the benefits of CBIR we have discussed is the easy way to bring a human expert into the loop. By examining the metadata of the returned images, the operator can assess the severity of the query image.
Precision indicates what percentage of images returned by PicSOM are "correct", that is, in the class we are querying. When the precision is greater than the a priori probability of the class, we know that the CBIR system is working well.
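The two quantities compared above can be computed directly; the following minimal sketch uses our own function names.

```python
def precision(returned_labels, target_class):
    """Fraction of returned images that belong to the queried class."""
    if not returned_labels:
        return 0.0
    hits = sum(1 for lab in returned_labels if lab == target_class)
    return hits / len(returned_labels)

def a_priori(all_labels, target_class):
    """Fraction of the whole database belonging to the class."""
    return sum(1 for lab in all_labels if lab == target_class) / len(all_labels)
```

A retrieval run is then judged successful when `precision(...)` exceeds `a_priori(...)` for the queried class.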
Clustering is an ill-posed problem
Most analysis methods, such as clustering, nearest neighbor classification and support vector machines (Vapnik, 2000), are based on finding out which data vectors are close to each other and which are distant. Simple search: find out whether the data set contains a vector identical to x. This is mostly a problem in relational databases, where the elements of x can contain non-numeric data such as strings.
Range search: find all those data vectors di whose elements are within certain bounds relative to x. It is one of the simplest classification algorithms, but still produces very good results in practice.
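The two query types can be sketched over plain numeric vectors; the function names are ours, not the thesis's.

```python
# Minimal sketches of the query types described above.
# Vectors are plain tuples of numbers.

def simple_search(data, x):
    """Exact-match search: does the data set contain a vector identical to x?"""
    return any(d == x for d in data)

def range_search(data, lower, upper):
    """All vectors whose every element lies within [lower_i, upper_i]."""
    return [d for d in data
            if all(lo <= v <= hi for v, lo, hi in zip(d, lower, upper))]
```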
Simple nearest neighbor search
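The brute-force baseline is worth stating explicitly: it compares the query against every data vector, which all indexing methods discussed later try to avoid. A minimal sketch (our own naming):

```python
def nearest_neighbor(data, x):
    """Brute-force nearest neighbor search: O(n * d) distance
    evaluations over n vectors of dimension d."""
    def sqdist(a, b):
        # Squared Euclidean distance; the square root is not needed
        # for finding the minimum.
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(data, key=lambda d: sqdist(d, x))
```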
Data space partitioning
No matter how we place a hyperball in this space, we cannot exactly follow the edges of the solution. Now we can safely discard all the elements in group (2) and focus only on groups (1) and (3). Even the most efficient convex hull algorithms, such as QHull (Barber et al., 1996), become exponentially slower as the data dimension grows.
A simple attempt to visualize these limits in one, two, and three dimensions shows how the difficulty of the problem grows as a function of dimension. Even algorithms that do not divide the space into separate regions, such as the coordinate sorting method (Friedman et al., 1975), become extremely slow as the data dimension increases.
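One way to make the growth of the difficulty concrete, under our own formulation, is to compute how little of a bounding hypercube a hyperball actually covers as the dimension grows; this is one face of the curse of dimensionality discussed above.

```python
import math

def ball_fraction(d):
    """Volume of the unit-radius d-ball divided by the volume of its
    bounding hypercube of side 2. Shrinks rapidly with dimension d,
    so ball-shaped search regions cover ever less of the space."""
    ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return ball / (2 ** d)
```

In one dimension the ball fills the cube completely, in two dimensions about 79% of it, and by ten dimensions the fraction has dropped below one percent.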
A look into classical indexing methods
Although this is a very interesting and challenging area of research, we believe it is beyond the scope of this dissertation and refer interested readers to (Chávez et al., 2001). If the current region has too many data elements, it is divided into four subregions. When a node overlaps too much with others, it is replaced by a linear search structure called a supernode.
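The recursive four-way subdivision can be sketched as a toy quadtree. The representation, the capacity parameter, and the half-open region bounds are our assumptions, chosen only to illustrate the splitting rule.

```python
# Toy quadtree split: when a region holds more than `cap` points it is
# divided into four equal subregions, recursively.

def build_quadtree(points, x0, y0, size, cap=4):
    if len(points) <= cap:
        return {"points": points}          # leaf region
    half = size / 2
    children = []
    for dx in (0, half):
        for dy in (0, half):
            # Half-open bounds so each point falls in exactly one child.
            sub = [(x, y) for (x, y) in points
                   if x0 + dx <= x < x0 + dx + half
                   and y0 + dy <= y < y0 + dy + half]
            children.append(build_quadtree(sub, x0 + dx, y0 + dy, half, cap))
    return {"children": children}
```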
While the methods discussed above partition the data with iso-oriented hyperplanes, the binary space partitioning (BSP) tree (Fuchs et al., 1980) uses freely oriented hyperplanes. Although we have described several algorithms, a brief overview can only scratch the surface.
Problems with large data sets
SOM is an unsupervised data analysis method that tries to mimic the cognitive processes in the brain. Here, "further away" is defined as distance along the grid instead of in the input space. The resulting grid is not regular like in SOM, making it easier to adapt to complex data manifolds.
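A single SOM training step, with the neighborhood defined by grid distance as described above, can be sketched as follows. The dictionary representation, learning rate, and neighborhood function are our own illustrative choices.

```python
# One SOM update step: the best-matching unit (BMU) and its grid
# neighbors are pulled toward the input vector. "Nearness" is measured
# along the grid, not in the input space.

def som_step(weights, x, lr=0.5, radius=1):
    """weights: dict mapping grid coords (i, j) -> weight vector (list)."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    bmu = min(weights, key=lambda n: sqdist(weights[n], x))
    for node, w in weights.items():
        grid_d = abs(node[0] - bmu[0]) + abs(node[1] - bmu[1])
        if grid_d <= radius:
            h = lr / (1 + grid_d)  # neighborhood factor decays with grid distance
            weights[node] = [wi + h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu
```

In practice both `lr` and `radius` are decreased over the course of training.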
The growth occurs in the direction of largest error amplitude, as this is usually an indication of the convolution of the map to represent a higher dimensional data manifold. The resulting grids are two-dimensional and regular as in SOM, but they are not rectangular in shape.
Growing Self-Organizing Map (GSOM) (Alahakoon et al., 2000) starts with a small SOM (usually 2×2) and grows outward. High-Dimensional Growing Self-Organizing Map (HDGSOM) (Amarasiri et al., 2004) is an improvement of the GSOM algorithm to better handle high-dimensional data. Tree-structured self-organizing map (TS-SOM) (Koikkalainen and Oja, 1990; Koikkalainen, 1994) is a simple hierarchical extension of the SOM.
A self-organizing tree map (SOTM) (Kong and Guan, 1998) is similar to a growing network, but forms a hierarchical tree structure instead of a flat network. A self-organizing tree (S-Tree) (Campos and Carpenter, 2001) is a binary tree that grows in much the same way as an ETree.
Applications to content-based retrieval
The sample queries show that the system works, but there are no performance numbers in the paper. That is, their interconnections and locations in the data space define how the system explains the data. Its job is to position the nodes in the data space so that we get the best possible explanation for the data.
In the data space, the child nodes are initially placed in the same place as their parent node. Now we can replace grid spacing with tree spacing and can thus use all the same training formulas as in SOM.
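The top-down search that a tree structure makes possible can be sketched as follows; the nested-dictionary representation is our assumption, not the thesis's implementation.

```python
def tree_bmu(node, x):
    """Descend a tree of nodes {'w': vector, 'children': [...]},
    at each level moving to the child closest to x. The best-matching
    leaf is found in O(log n) steps instead of the O(n) of a flat SOM."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    while node.get("children"):
        node = min(node["children"], key=lambda c: sqdist(c["w"], x))
    return node
```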
Optimizing leaf node locations
By adding this and multiplying by the epoch size N, we find the amount of calculations required for a single epoch. During a single round, the training factor is only defined by the tree distance, so only a fixed subset of leaf nodes near the BMU is updated. On the other hand, people seem to intuitively choose a larger b if their dataset dimension is large, presumably because there is "more space".
Experience tells us that the growth rate is sub-linear and most likely logarithmic. This yields O(d·log d) complexity, although we can achieve linear complexity if the parameter is simply chosen independently of d.
Benefits and disadvantages
The root nodes that connect leaf nodes together give us an estimate of the data topology.
Whenever analysis methods are used, their user must understand how the algorithm works.
Data topology estimation
It should be noted that the leaf nodes of an ETree may contain only a few data vectors depending on the parameters, so individual estimates may be somewhat imprecise.
Adaptation to data
The availability of the package enables other people to more easily build on our work, which is one of the basic principles of science.