Images are an immensely important part of today's digital communication. Initially, image search according to user interest was performed through the text and tags associated with images; this is known as text based image retrieval. Today, image retrieval is carried out according to the visual contents of the image, which is known as content based image retrieval (CBIR). Many computer applications involve image searching, image matching and image retrieval, for example trademark or logo searching, signature matching and document matching. Because of this multiplicity of applications, there has always been room for further improvement, and the topic has attracted a tremendous amount of research over the last few decades.
CBIR searches by taking an image as the query and extracting from the repository those images whose content is similar to the query image. Image features include shape, texture and color, or a combination of these. Color and texture features are closer to human perception and are regarded as high level features, whereas shape is structural and is therefore considered a low level feature. Performance issues include retrieval time, a good balance of precision and recall, and the semantic gap. All of these depend heavily on the feature extraction process, so an effective feature extraction method is a prerequisite for successful results. It has been observed that, among all the features, shape is the most powerful because it identifies the shapes present in an image. Relevance feedback is another technique, used for interactive image retrieval and for enhancing the user experience. In short, every feature has its own role and advantages, and combining features allows their individual advantages to be exploited collectively. A combination of multiple features such as shape, color and relevance feedback can produce remarkable retrieval results [1-3]. Against this backdrop, a combinational approach using multiple image features has been adopted in this paper. Shape description is used as the primary step, whereas color and relevance feedback are added as support for the sake of user perception.
The remaining sections are organized as follows: existing work is presented in Section 2, the proposed work is introduced in Section 3, results are presented in Section 4, the conclusion is drawn in Section 5 and references are given in Section 6.
2. Related Work
Many researchers have made efforts to improve CBIR; some notable works are outlined here. An effective method for image retrieval by combining multiple features of color, texture and shape was presented in [4]. An algorithm for image retrieval that combined color and texture features was proposed in [5]. There, circular region energy was used to index images; the color feature was extracted from the low frequency band of the wavelet transform and the texture feature from the high frequency band. [6] addressed the problems faced in indexing multiple features into a database. In [7] a combination of shape and color was used, and features were extracted after region of interest (ROI) segmentation. [8] combined shape and color, obtaining color features from the HSI hue value and shape information from curvature scale space (CSS). Relevance feedback was combined with texture, shape and color in [9], where color information was obtained by cumulative histogram, texture by the color co-occurrence matrix (CCM) and shape information by edge detection. A combination of texture, color and shape was also used in [10] in an optimized manner. Some specialized examples of multiple feature image retrieval can be found in [11,12].
Since this study follows a combinational approach to feature selection, the related work presented in this section has been selected from techniques in which multiple features are combined. The proposed approach presents a novel method for defining features, worked out from the low level instead of directly applying existing feature definitions. The proposed technique uses a least square polynomial to estimate curves in the image and a self-defined coding mechanism for shape description, a self-defined max-min average for color description, and a modified relevance feedback. In this way, image content is detected such that similar images are retrieved more accurately and effectively than with the related existing techniques. The proposed method is discussed in greater detail in the next section.
3. Proposed Work
Generally, the task of content based image retrieval (CBIR) is divided into the steps of feature extraction, database population, query representation, similarity measurement and retrieval of similar images, as shown in Fig. 1. The proposed method follows this general flow and uses a combination of shape, color and relevance feedback for feature extraction. Feature extraction is the backbone of any content based image retrieval system: if the features that index the image are extracted properly, the whole system performs efficiently and accurately. The following sub-sections describe these steps at length.
Fig. 1. General Flow of CBIR
3.1 Image Segmentation
Image segmentation is the process of classifying and isolating objects, and it is a key operation in a number of applications. In this process, objects of interest are extracted and further tasks are performed on these objects rather than on the whole image. Here, image segmentation is performed through color image segmentation in MATLAB 7.0. An image and its segmented version are shown in Fig. 2.
Fig. 2. Image Segmentation
3.2 Key Point Determination
For the determination of key points, second order partial derivatives are used. These derivatives determine points where slopes are different from the surrounding points.
Definition: An image is treated as function f in terms of equation y = f (x). On the graph of y = f (x) let P(x, y) and Q(x + Δx, y + Δy) be distinct points near to each other. If θ is the angle that the secant line PQ makes with the x-axis then:
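With P and Q as defined above, the slope of the secant line PQ can be written in the standard form below. This is the usual textbook relation and is given here as a reading aid under that assumption.

```latex
\tan\theta = \frac{\Delta y}{\Delta x} = \frac{f(x + \Delta x) - f(x)}{\Delta x}
```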
As Δx approaches 0, the point Q, moving along the graph of y = f (x), approaches P; the chord PQ approaches the tangent line PT in its limiting position, and the measure θ of angle MAQ approaches Ψ = m∠OTP. Hence, taking limits as Δx→0, equation (3) becomes:
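Under the usual definition of the derivative, the limiting relation between the tangent angle Ψ and the function f reads as follows (standard form, assumed to match the intended equation):

```latex
\tan\Psi = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}
         = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}
         = f'(x)
```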
i.e., the derivative of function f at point P represents the slope of tangent to the curve y = f (x) at that point as shown in Fig. 3.
Fig. 3. Slope Change Measurement
The derivative f′(x) may itself be differentiable on [a, b]. Applying the definition of the derivative to f′(x), the resulting limit is called the second derivative of y = f (x) and is denoted by:
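In standard notation, applying the limit definition to the first derivative gives the second derivative as:

```latex
\frac{d^{2}y}{dx^{2}} = f''(x) = \lim_{\Delta x \to 0} \frac{f'(x + \Delta x) - f'(x)}{\Delta x}
```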
Fig. 4 shows an image with key points.
Fig. 4. Image with Key Points
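The idea of Section 3.2 can be illustrated on a sampled one-dimensional curve. The sketch below is only an illustration under stated assumptions: the function name `find_key_points`, the unit sample spacing and the `threshold` parameter are hypothetical and are not taken from the paper, which applies second order derivatives to segmented image objects.

```python
import numpy as np

def find_key_points(y, threshold=0.5):
    """Return sample indices where the discrete second derivative is large.

    Assumes samples of y = f(x) taken at unit spacing in x (an assumption).
    """
    y = np.asarray(y, dtype=float)
    # Central second difference: d2[i] approximates f''(x) at sample i + 1.
    d2 = y[2:] - 2.0 * y[1:-1] + y[:-2]
    # Points whose slope differs from their surroundings have a large |f''|.
    return (np.flatnonzero(np.abs(d2) > threshold) + 1).tolist()

# A tent-shaped curve has a single slope change at its peak.
print(find_key_points([0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0]))
```

On a straight line the second difference is zero everywhere, so no key points are reported.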
3.3 Curve Fitting
For curve fitting, the least square polynomial method is used because of its simplicity and accuracy of estimation. To make the process simpler, only three key points are selected from each set of key points. The objects from Fig. 4 are shown with their key points in Fig. 5.
Fig. 5. Objects with Key Points from Image shown in Fig. 4
The general equation of least square polynomial for a given set of points is given by:
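Since three constants c0, c1 and c2 are estimated, the polynomial is taken here to be of second degree; under that assumption its standard form is:

```latex
y = c_0 + c_1 x + c_2 x^{2}
```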
To estimate the constants c0,c1,c2, the following equations are formed:
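For n points (x_i, y_i) and a second-degree polynomial (an assumption consistent with the three constants), the standard least square normal equations are:

```latex
\begin{aligned}
\sum y_i       &= n\,c_0 + c_1 \sum x_i + c_2 \sum x_i^{2} \\
\sum x_i y_i   &= c_0 \sum x_i + c_1 \sum x_i^{2} + c_2 \sum x_i^{3} \\
\sum x_i^{2} y_i &= c_0 \sum x_i^{2} + c_1 \sum x_i^{3} + c_2 \sum x_i^{4}
\end{aligned}
```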
In matrix notation the above equations can be summarized as:
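Written in the standard matrix form for a second-degree least square fit (again assuming n points and three constants), the system is:

```latex
\begin{bmatrix}
n & \sum x_i & \sum x_i^{2} \\
\sum x_i & \sum x_i^{2} & \sum x_i^{3} \\
\sum x_i^{2} & \sum x_i^{3} & \sum x_i^{4}
\end{bmatrix}
\begin{bmatrix} c_0 \\ c_1 \\ c_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^{2} y_i \end{bmatrix}
```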
Solving the above equation gives the values of c0, c1 and c2; by substituting these values into equations (8), (9) and (10), curves are estimated for the given key points. Many curve shapes are possible, but for simplicity only the curves and lines shown in Table 1 are distinguished. The estimated points for the given key points are then plotted on the graph and the curves are coded with the values shown in Table 1.
Table 1. Coding of Estimated Shapes
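As a minimal sketch of the fit described above, the normal equations for a second-degree polynomial can be built and solved with NumPy. The function name `fit_quadratic` is hypothetical, and the second degree is an assumption drawn from the three constants c0, c1, c2; the x values of the key points must be distinct for the system to be solvable.

```python
import numpy as np

def fit_quadratic(xs, ys):
    """Least square fit of y = c0 + c1*x + c2*x**2; returns [c0, c1, c2]."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # Power sums S[k] = sum(x**k) for k = 0..4 (S[0] is the number of points).
    S = [np.sum(xs ** k) for k in range(5)]
    A = np.array([[S[0], S[1], S[2]],
                  [S[1], S[2], S[3]],
                  [S[2], S[3], S[4]]])
    b = np.array([np.sum(ys), np.sum(xs * ys), np.sum(xs ** 2 * ys)])
    # Solve the 3x3 normal equations for the three constants.
    return np.linalg.solve(A, b)

# Three key points lying on y = 1 + 2x + 3x^2.
print(fit_quadratic([0, 1, 2], [1, 6, 17]))
```

With exactly three key points the fitted quadratic passes through all of them; when the points are collinear, c2 comes out as approximately zero, which corresponds to a straight line.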
To clarify this method, the objects in Fig. 5 have been coded and are shown in Fig. 6. In this way all images are processed and populated into the database. The following facts are used to characterize the shapes:
When y is constant, the straight line is parallel to the x-axis.
When x is constant, the straight line is parallel to the y-axis.
When both x and y change along a straight line, the line is at some angle to the x-axis.
If a curve has constant values of y and values of x smaller than these at its starting and ending points, the curve faces up.
If a curve has constant values of y and values of x greater than these at its starting and ending points, the curve faces down.
If a curve has constant values of x and values of y smaller than these at its starting and ending points, the curve faces right.
If a curve has constant values of x and values of y greater than these at its starting and ending points, the curve faces left.
The next section describes how color features are estimated to ensure accurate image matching and retrieval.
Fig. 6. Objects according to Defined Shapes
3.4 Color Features Estimation
Color features are generally regarded as high level features because they are closer to user perception and the human vision system, while shape features are considered low level features. Here color features are treated as secondary features and are estimated after the shape features have been processed for every resident image. Retrieved images are selected on the basis of the shape descriptor discussed above and are then sorted by color value, so that the results closest to the query image are presented first. For color feature estimation, the image is scanned pixel by pixel from the top left corner, left to right and top to bottom, and a 3×3 neighborhood centered on the pixel under consideration is selected. For each neighborhood, the mean of its maximum and minimum values is obtained by the following expressions:
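Read literally, the mean of the maximum and minimum of a neighborhood can be written as below. The symbols max_n and min_n for the largest and smallest pixel values in the 3×3 neighborhood of pixel n are notation assumed here for illustration.

```latex
\mu_n = \frac{\max_n + \min_n}{2}
```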
where μn denotes the mean of the maximum and minimum values of the neighborhood. After the whole image has been scanned and this value has been obtained for the neighborhood of every pixel, a single mean value is obtained according to the following expression:
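Assuming the single value is the plain average of μn over all N pixels of the image (the symbols μ and N are introduced here for illustration), the expression takes the form:

```latex
\mu = \frac{1}{N} \sum_{n=1}^{N} \mu_n
```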
These color features take color distribution into account in the whole image.
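The max-min average can be sketched for a single image channel as follows. This is an illustrative sketch, not the authors' implementation: the function name `max_min_average` is hypothetical, border pixels are handled by replicating edge values (the paper does not state its border rule), and applying the function per channel to a color image is likewise an assumption. `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def max_min_average(channel):
    """Mean over all pixels of (max + min) / 2 in each 3x3 neighborhood."""
    channel = np.asarray(channel, dtype=float)
    # Replicate border values so every pixel has a full 3x3 neighborhood
    # (assumed border handling).
    padded = np.pad(channel, 1, mode="edge")
    # windows[i, j] is the 3x3 neighborhood centered on pixel (i, j).
    windows = sliding_window_view(padded, (3, 3))
    mu_n = (windows.max(axis=(2, 3)) + windows.min(axis=(2, 3))) / 2.0
    return float(mu_n.mean())

# Every neighborhood of this 3x3 image contains both the 9 and a 0.
print(max_min_average([[0, 0, 0], [0, 9, 0], [0, 0, 0]]))
```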
3.5 Relevance Feedback
In real world implementations, giving questionnaires to users or asking for their opinions in a web application is often bothersome for them. Users usually ignore such questions, and even when they are inclined to answer them, they do not feel comfortable with such applications. With this in mind, a new scheme for recording user responses is introduced in this paper: whenever a user clicks an image among the retrieved images, a record is maintained automatically for the query and the selected image. Based on these records, the system can be retuned according to user experience. Moreover, when a user clicks an image, it is considered a positive example, and all images from its category are presented to the user.
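The click-recording scheme can be sketched as a small in-memory log. Everything named here is hypothetical (the class `ClickFeedback`, the `category_of` mapping from image id to category, and the string ids); the paper does not specify how records are stored or how the system is retuned from them.

```python
from collections import defaultdict

class ClickFeedback:
    """Records clicks implicitly and returns the clicked image's category."""

    def __init__(self, category_of):
        # category_of: assumed mapping of image id -> category label.
        self.category_of = category_of
        # Log of clicked image ids, keyed by query id.
        self.clicks = defaultdict(list)

    def record_click(self, query_id, image_id):
        # The record is kept automatically; no question is put to the user.
        self.clicks[query_id].append(image_id)
        # The clicked image is treated as a positive example, so every
        # image of its category is returned for presentation.
        category = self.category_of[image_id]
        return sorted(i for i, c in self.category_of.items() if c == category)

fb = ClickFeedback({"a1": "food", "a2": "food", "b1": "car"})
print(fb.record_click("query1", "a1"))
```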
3.6 Image Indexing
Image indexing refers to the process of storing images in the database in a way that makes searching, matching and retrieval fast and easy. In content based image retrieval, images are stored in the form of numeric features. A number of indexing schemes have been introduced in the literature to make this process fast and efficient. In the proposed work, the features are stored in the form of vectors. A sample feature storage data file is shown in Table 2.
Table 2. Data File for Tracking the Image Information
As shown in Table 2, images are stored with their respective objects, the number of key points they contain and the codes for the objects. In another data file, the images and the number of objects they contain are tracked, as shown in Table 3. To cope with geometric transformation, a 3-bit shift code is also created and saved in a data file, as shown in Table 4.
Table 3. Data File to count the Objects
Table 4. Data File to contain 3-bit Shift code
3.7 Similarity Measure
When a query image is presented to the system, image segmentation is performed to extract objects from the image. Key points are generated on these objects with the help of the second derivative. From these key points, curves are estimated by the least square polynomial method, and the estimated curves are assigned their respective codes as explained above. These codes are converted into 3-bit left shift codes. Color information is also obtained from the query image. Hamming distance is then used as the similarity measure between the query image and the resident image. Hamming distance is considered most effective for the present case because it gives the number of differing bits between two strings. Let the query image be denoted by e and the resident image by d. Codes are stored as bit strings in the variables e and d, and the number of bits in a code depends on the number of curves or lines in an object. When the string lengths do not match, the objects are assumed to be entirely different and a measure of the difference is calculated. Hamming distance can be computed by:
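For two bit strings e and d of equal length k, the Hamming distance has the standard form below; the symbols k, e_i and d_i (the i-th bits) are notation assumed here.

```latex
d_H(e, d) = \sum_{i=1}^{k} \lvert e_i - d_i \rvert, \qquad e_i, d_i \in \{0, 1\}
```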
This difference is also calculated against the left-shifted codes in the data file so that a rotated form of an object is not missed. Once the differences between objects have been calculated, they are summed for each resident image. For example, if image1 has three objects and the query image has two objects, the differences are calculated pair by pair as: (img1,obj1-imgq,obj1), (img1,obj2-imgq,obj1), (img1,obj3-imgq,obj1), (img1,obj1-imgq,obj2), (img1,obj2-imgq,obj2) and (img1,obj3-imgq,obj2). The number of differences calculated for each resident image is therefore the product of the numbers of objects in the two images, six in this example. Experiments have shown that an image should be dropped if the sum of these differences is more than 30% of this number, and that it is termed a relevant image if the sum is less than 40%. This can be summarized by the following expressions:
3.8 Invariance to Geometric Transformations
The proposed method copes with the geometric transformations of rotation, scaling and translation. Rotation is handled by the left shift codes, which ensure that rotated versions of images are represented in the database; each query image is also compared with these rotated versions to find the matching image. Scaling is handled because a shape has the same slope points regardless of its size. For example, a small circle and a large circle have the same slope points; only the size of the curves differs, which has no effect because the codes are the same. Translation is handled by image segmentation, which detects an object wherever it lies in the image.
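The rotation handling and the pairwise object comparison of Section 3.7 can be sketched as follows. The sketch rests on assumptions that are not confirmed by the paper: each curve or line is taken to occupy 3 bits, so a rotated object corresponds to a circular shift of its code by a multiple of 3 bits, and codes of unequal length are given a difference equal to the longer length. The function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def rotation_invariant_distance(query, resident, step=3):
    """Smallest Hamming distance over circular left shifts of the resident code."""
    if len(query) != len(resident):
        # Assumed penalty: objects of different length are entirely different.
        return max(len(query), len(resident))
    best = len(query)
    for shift in range(0, len(resident), step):
        rotated = resident[shift:] + resident[:shift]
        best = min(best, hamming(query, rotated))
    return best

def total_difference(resident_objects, query_objects):
    """Sum of differences over every (resident object, query object) pair."""
    return sum(rotation_invariant_distance(q, r)
               for q in query_objects for r in resident_objects)

# The resident code is the query code rotated by one 3-bit element.
print(rotation_invariant_distance("001010100", "010100001"))
```

For a resident image with three objects and a query with two, `total_difference` sums six pairwise values, matching the pairing described in Section 3.7.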
4. Experimental Results and Analysis
In this section the proposed method is tested for its retrieval performance. Different offline experiments are performed to check the strength of the method. The system specification, quantification measures, experiments and a comparison with some state-of-the-art methods are discussed in the following sub-sections.
4.1 System Specification
System details are given in Table 5. The dataset used is the Corel database [13,14]. It contains 10,000 low resolution images of size 128×85, divided into 100 categories with 100 images in each category. The images include rotated images in order to check the robustness of the proposed method against geometric transformations. Categories include both single object images and composite images with multiple objects.
Table 5. System Details
4.2 Quantification Measures
The widely used and accepted precision and recall rates are used to evaluate the performance of the proposed method. Let the total number of relevant retrieved images be denoted by O, the total number of images to be retrieved by P and the total number of images in a category by Q. Precision and recall can be expressed by the following equations:
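With O, P and Q as defined above, the two rates follow directly from their definitions as ratios:

```latex
\text{Precision} = \frac{O}{P}, \qquad \text{Recall} = \frac{O}{Q}
```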
Precision is the ratio of relevant retrieved images to the number of images the system is set to retrieve; for example, the system may be set to retrieve 50 images per search, of which O are relevant. Recall is the ratio of relevant retrieved images to the total number of images in the category to which the query image belongs; for example, each category contains 100 images, of which O are retrieved as relevant. P and Q are pre-defined, whereas O depends on what the system retrieves for a given search.
4.3 Retrieval Results
A query image from the “food” category of the database is provided to the proposed system, and the top 15 retrieval results are shown in Fig. 7.
Fig. 7. Retrieval Results: Query Image of “Food” (top) and Top 15 Retrieved Results (bottom)
Fig. 7 shows how accurately the proposed system retrieves the relevant images. Looking at the query image, it is evident that boundary extraction after image segmentation produces two objects: the food placed on the plate and the outer circle of the plate. Application of the proposed method generates two codes for these objects as well as a color description value. Similarity is then calculated in the manner discussed above, and the images closest in color to the query image are sorted to the top. Many other round objects exist in the database, but accurately similar objects are obtained because the inner object is also considered. Therefore, the images containing these objects, or objects close to them, are selected. In the next experiment, an image from outside the database is provided as the query image.
A query image of a “car” from outside the database is provided to the proposed system, and the top 15 retrieval results are shown in Fig. 8.
Fig. 8. Retrieval Results: Query Image of “Car” (top) and Top 15 Retrieved Results (bottom)
The image is segmented into six different objects and the boundaries of these objects are obtained. It can be observed that three objects contain lines, two objects contain curves and one object contains a combination of both. These objects are coded according to the proposed method discussed above and compared with the values already populated in the database; Fig. 8 shows how effectively similar images are picked as retrieved images.
One image from each category of Butterfly, Food, Cars, People, Shapes, Sunset, Ocean and Desert is provided as query image and values of precision and recall rates are recorded.
P = images to be retrieved = 50
Q = total images in a category = 100
O = relevant retrieved images
Table 6. Precision and Recall
Table 6 shows the precision and recall rates for the proposed method, which reflect the effectiveness and accuracy of image matching. Experiments have confirmed that images with fewer objects are matched and retrieved more accurately, because images with many objects have more chances of being matched with other images that contain some of the same objects. It can also be seen from Table 6 that shapes with fewer objects have a 100% precision rate. In the next experiment, a comparison is made between the proposed method and some commonly used methods.
Comparing content based image retrieval methods requires the same computing environment and database, because comparing two methods using different environments and databases gives meaningless results. Therefore, in this experiment some retrieval methods are implemented in the same computing environment and their average precision values are recorded in order to compare the proposed method with them. These methods include color histogram (color feature), color co-occurrence matrix (CCM), HSI color histogram, Color + Texture (HSI color histogram, CCM), Color + Texture + Shape and Color + Texture + Shape + Relevance Feedback. Table 7 and Fig. 9 show the comparison of results numerically and visually.
Table 7. Average Precision Comparison of Different Methods
Fig. 9. Comparison of Average Precision
It is clear from Table 7 and Fig. 9 that the proposed method achieves the highest average precision by using the combined features of shape, color and relevance feedback. This is due to the newly introduced features. Shape is used as the primary feature and the whole retrieval performance relies on it. Slope points are detected effectively by using the second derivative, and the least square polynomial estimates the curves effectively. Moreover, a generic method of coding the estimated curves or lines has been introduced, which is very useful in getting accurate results. For the user's perceptual satisfaction, color features are extracted to provide the nearest results in color as well. Relevance feedback is used in a different way: user responses are recorded without bothering the user. When a user clicks an image, all images belonging to that category are presented to the user automatically, which helps in reaching recall rates of up to 100%.
A stand-alone image feature tends to produce inaccurate and irrelevant retrieval results; system performance increases and better results are achieved when two or more features are combined. The proposed technique combines multiple self-defined features: shape, color and relevance feedback. To demonstrate the strength of the proposed method, other combinations of features were also tested on the same database and in the same environment, as shown in Table 7. These were color, texture, color + texture, color + texture + shape, and color + texture + shape + relevance feedback, using the popular methods of color histogram and HSI color histogram for color description, edge histogram descriptor for shape, CCM for texture description, and relevance feedback. The proposed method also combines shape, color and relevance feedback, but its uniqueness rests in its newly defined descriptions of these features: the second derivative for slope point estimation, the least square polynomial for curve estimation, the coding mechanism, the max-min average for color description and a modified relevance feedback. As the experimental results show, the proposed method achieves the highest precision and retrieval accuracy among the techniques compared.
- M. Yasmin and S. Mohsin, "Image Retrieval by Shape and Color Contents and Relevance Feedback," in Proc. of 10th International Conference on Frontiers of Information Technology, pp. 282-287, December 17-19, 2012.
- Mussarat Yasmin, Sajjad Mohsin, Isma Irum and Muhammad Sharif, "Content Based Image Retrieval by Shape, Color and Relevance Feedback," Life Sciences Journal, vol. 10, no. 4, pp. 593-598, April 2013.
- Mussarat Yasmin, Muhammad Sharif, Isma Irum and Sajjad Mohsin, "Powerful Descriptor for Image Retrieval Based on Angle Edge and Histograms," Journal of Applied Research and Technology, vol. 11, no. 6, pp. 727-732, October 2013.
- Seyoon Jeong, Kyuheon Kim, Byungtae Chun, Jaeyeon Lee and Youglae Bae, "An Effective Method for Combining Multiple Features of Image Retrieval," in Proc. of TENCON, IEEE Region 10 Conference, pp. 982-985, September 15-17, 1999.
- Tian Yumin and Mei Lixia, "Image Retrieval based on Multiple Features using Wavelet," in Proc. of Fifth International Conference on Computational Intelligence and Multimedia Applications, pp. 137-142, September 27-30, 2003.
- Beng Chin Ooi, Heng Tao Shen and Chenyi Xia, "Towards Efficient Image Retrieval based on Multiple Features," in Proc. of the Joint Conference of the Fourth International Conference on Information, Communication and Signal Processing, pp. 180-185, December 15-18, 2003.
- Wang Xiangyang, Hu Fengli and Yang Hongying, "A Novel Region-of-Interest based Image Retrieval using Multiple Features," in Proc. of 12th International Multi-Media Modelling Conference, 2006.
- Jeong-Yo Ha, Gye-Young Kim and Hyung-Il Choi, "The Content based Image Retrieval Method using Multiple Features," in Proc. of Fourth International Conference on Networked Computing and Advanced Information Management, pp. 652-657, September 2-4, 2008.
- B. Jyothi, Y. Madhave Latha and V.S.K. Reddy, "Relevance Feedback Content based Image Retrieval using Multiple Features," in Proc. of IEEE International Conference on Computational Intelligence and Computing Research, pp. 1-5, December 28-29, 2010.
- R. Priya and Vasantha Kalyani David, "Optimized Content based Image Retrieval System based on Multiple Feature Fusion Algorithm," International Journal of Computer Applications, vol. 31, no. 8, October 2011.
- Jun Yu, Dongquan Liu, Dacheng Tao and Hock Soon Seah, "On Combining Multiple Features for Cartoon Character Retrieval and Clip Synthesis," IEEE Transactions on System, Man and Cybernetics-Part B: Cybernetics, vol. 42, no. 5, pp. 1413-1427, October 2012, https://doi.org/10.1109/TSMCB.2012.2192108
- Qianni Zhang and Ebroul Izquierdo, "Histology Image Retrieval in Optimized Multifeature Spaces," IEEE Journal of Biomedical and Health Informatics, vol. 17, no. 1, pp. 240-249, January 2013, https://doi.org/10.1109/TITB.2012.2227270
- Jia Li and James Z. Wang, "Automatic linguistic indexing of pictures by a statistical modeling approach," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1075-1088, September 2003.
- James Z. Wang, Jia Li and Gio Wiederhold, "SIMPLIcity: Semantics-sensitive Integrated Matching for Picture LIbraries," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947-963, September 2001, https://doi.org/10.1109/34.955109