Proceedings Volume 10443

Second International Workshop on Pattern Recognition

Xudong Jiang, Masayuki Arai, Guojian Chen

Volume Details

Date Published: 22 June 2017
Contents: 9 Sessions, 62 Papers, 0 Presentations
Conference: Second International Workshop on Pattern Recognition 2017
Volume Number: 10443

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 10443
  • Target Recognition and Tracking
  • Face Recognition
  • Image Segmentation
  • Image Transformation and Analysis
  • Medical Image Analysis and Processing
  • Image Processing and Applications
  • Filter Design and Signal Processing
  • Computer Information Theory and Technology
Front Matter: Volume 10443
Front Matter: Volume 10443
This PDF file contains the front matter associated with SPIE Proceedings Volume 10443, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Target Recognition and Tracking
Artificial intelligence tools for pattern recognition
Elena Acevedo, Antonio Acevedo, Federico Felipe, et al.
In this work, we present a pattern recognition system that combines the problem-solving power of genetic algorithms with the efficiency of morphological associative memories. We use a set of 48 tire prints divided into 8 tire brands. The images have dimensions of 200 x 200 pixels. We apply the Hough transform to obtain lines as the main features, yielding 449 lines. The genetic algorithm reduces the feature set to ten suitable lines, which yield 100% recognition. Morphological associative memories are used as the evaluation function, the selection algorithms are tournament and roulette wheel, and for reproduction we apply one-point, two-point, and uniform crossover.
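As a hedged illustration of the feature-extraction step described above (the Hough transform applied to 200 x 200 tire prints), the Python sketch below extracts (rho, theta) line parameters with OpenCV; the Canny thresholds and Hough accumulator threshold are assumptions rather than values from the paper, and the genetic-algorithm selection and morphological associative memories are not shown.

```python
# Illustrative sketch only (not the authors' code): Hough-line features from a tire print.
import cv2
import numpy as np

def hough_line_features(image_path, max_lines=449):
    """Return detected lines as an (N, 2) array of (rho, theta) parameters."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)   # 200 x 200 tire print
    edges = cv2.Canny(img, 50, 150)                       # edge map (thresholds assumed)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, 60)     # accumulator threshold assumed
    if lines is None:
        return np.empty((0, 2))
    return lines[:max_lines, 0, :]                        # a GA would later select 10 of these
```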
Eye movement identification based on accumulated time feature
Baobao Guo, Qiang Wu, Jiande Sun, et al.
Eye movement is a relatively new biometric feature with many advantages over features such as fingerprint, face, and iris. It is not merely a static characteristic but a combination of brain activity and muscle behavior, which makes it effective at preventing spoofing attacks. In addition, eye movements can be combined with the face, iris, and other features recorded from the face region in multimodal systems. In this paper, we conduct an exploratory study on eye movement identification using the eye movement datasets provided by Komogortsev et al. in 2011 and several classification methods. Saccade and fixation durations are extracted from the eye movement data as features. Performance is then analyzed for different classifiers, including BP, RBF, and Elman neural networks and the SVM, to provide a reference for future research in this field.
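A minimal sketch of the duration features mentioned above is given below, using a simple velocity-threshold (I-VT) rule to split gaze samples into fixations and saccades; the threshold value, units, and the SVM usage are assumptions for illustration, not details taken from the paper.

```python
# Hedged sketch: I-VT segmentation of gaze samples into fixation/saccade durations.
import numpy as np
from sklearn.svm import SVC  # assumed classifier; BP/RBF/Elman networks are not shown

def duration_features(x, y, t, vel_thresh=100.0):
    """x, y: gaze angles (deg); t: timestamps (s). Returns [fixation_time, saccade_time]."""
    dt = np.diff(t)
    vel = np.hypot(np.diff(x), np.diff(y)) / dt           # angular speed between samples
    return np.array([dt[vel < vel_thresh].sum(),          # total fixation time
                     dt[vel >= vel_thresh].sum()])        # total saccade time

# Example identification step (X: one feature row per recording, y: subject labels):
#   clf = SVC(kernel="rbf").fit(X_train, y_train); predictions = clf.predict(X_test)
```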
Facades structure detection by geometric moment
Diqiong Jiang, Hui Chen, Rui Song, et al.
This paper proposes a novel method for extracting facade structure from real-world pictures using local geometric moments. Compared with existing methods, the proposed method is easy to implement, has low computational cost, and is robust to noise such as uneven illumination, shadow, and shading from other objects. Our method is also faster and has lower space complexity, making it feasible for mobile devices and situations where real-time processing is required. Specifically, a facade structure model is first proposed to support our noise reduction method, which is based on a self-adaptive local threshold with a Gaussian-weighted average for image binarization and on the features of the facade structure. Next, we divide the picture of the building into individual areas, each of which represents a door or a window. We then calculate the geometric moments and centroid of each individual area in order to identify collinear ones based on their feature vectors, each of which is thereafter replaced with a line. Finally, we analyze all the geometric moments and centroids to recover the facade structure of the building. We compare our results with other methods and in particular report results on pictures taken in poor environmental conditions. Our system is designed for two applications: the reconstruction of facades from high-resolution ground-based imagery, and positioning systems based on recognizing urban buildings.
A natural approach to convey numerical digits using hand activity recognition based on hand shape features
H. Chidananda, T. Hanumantha Reddy
This paper presents a natural representation of numerical digits using hand activity analysis, based on the number of outstretched fingers for each digit in a sequence extracted from a video. The analysis determines a set of six features from a hand image; the most important features used in each frame are the topmost fingertip, the palm line, the palm center, and the valley points between the fingers above the palm line. Using this approach, a user can naturally convey any number of digits, each ranging from 0 to 9, with the right hand, the left hand, or both hands in a video. The hands used to convey digits can be recognized accurately from the valley points, and from this it can be inferred whether the user is right- or left-handed. In this work, the hands and face are first detected in the YCbCr color space, and the face region is removed using an ellipse-based method. The hands are then analyzed to recognize the activity representing a series of digits in the video. The method uses a pixel-continuity algorithm in a 2D coordinate geometry framework and does not rely on calculus, contours, convex hulls, or training datasets.
Multiclass multiple kernel learning for HRRP-based radar target recognition
Yu Guo, Huaitie Xiao, Hongqi Fan, et al.
Motivated by the problem of radar automatic target recognition (RATR), a novel machine learning method, multiclass multiple kernel learning based on support vector data description with negative examples (MMKL-NSVDD), is developed to classify the FFT-magnitude features of complex high-resolution range profiles (HRRPs). The proposed method not only inherits the tight nonlinear boundary of the SVDD-neg model, which makes no assumptions about the data distribution or prior information, but also incorporates multiple kernels into the model, avoiding the fussy choice of kernel parameters and fusing information from multiple kernels. This leads to a remarkable improvement in recognition rate, as demonstrated by experimental results on HRRPs of four aircraft. MMKL-NSVDD is therefore well suited to HRRP-based radar target recognition.
Degraded Chinese rubbing images thresholding based on local first-order statistics
Fang Wang, Ling-Ying Hou, Han Huang
Segmenting Chinese characters from degraded document images is a necessary step in optical character recognition (OCR); however, it is challenging due to the various kinds of noise in such images. In this paper, we present three adaptive thresholding methods based on local first-order statistics for separating text from non-text in Chinese rubbing images. The segmentation results are evaluated both by visual inspection and numerically. In experiments, the methods obtain better binarization results than classical techniques on real Chinese rubbing images and the PHIBD 2012 dataset.
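The exact statistics used in the paper are not reproduced here; as a stand-in, the sketch below shows one common local first-order-statistics rule (a Niblack-style threshold built from the local mean and standard deviation), with the window size and k as assumed values.

```python
# Niblack-style local thresholding as a stand-in for the paper's local-statistics methods.
import numpy as np
from scipy.ndimage import uniform_filter

def local_stats_binarize(gray, window=25, k=-0.2):
    g = gray.astype(np.float64)
    mean = uniform_filter(g, size=window)                  # local first-order statistic
    sq_mean = uniform_filter(g * g, size=window)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))    # local spread
    return (g > mean + k * std).astype(np.uint8) * 255     # text / non-text mask
```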
Towards discrete wavelet transform-based human activity recognition
Manish Khare, Moongu Jeon
Accurate recognition of human activities is a challenging problem in visual surveillance applications. In this paper, we present a simple and efficient algorithm for human activity recognition based on the wavelet transform. We adopt discrete wavelet transform (DWT) coefficients as features of human objects to exploit the advantages of its multiresolution approach. The proposed method is tested at multiple DWT levels. Experiments are carried out on standard action datasets, including KTH and i3DPost, and the proposed method is compared with state-of-the-art methods in terms of several quantitative performance measures. The proposed method achieves better recognition accuracy than the state-of-the-art methods.
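To make the feature concrete, the sketch below flattens multi-level 2D DWT coefficients of a frame (or silhouette) into a vector; it assumes the PyWavelets package and the Haar wavelet, and the paper's actual choice of wavelet, levels, and classifier may differ.

```python
# Hedged sketch: multi-level 2D DWT coefficients as an activity feature vector (PyWavelets).
import numpy as np
import pywt

def dwt_feature(frame_gray, wavelet="haar", level=2):
    coeffs = pywt.wavedec2(frame_gray.astype(float), wavelet, level=level)
    parts = [coeffs[0].ravel()]                      # approximation sub-band
    for cH, cV, cD in coeffs[1:]:                    # detail sub-bands at each level
        parts.extend([cH.ravel(), cV.ravel(), cD.ravel()])
    return np.concatenate(parts)
```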
Pattern recognition of concrete surface cracks and defects using integrated image processing algorithms
Jessie R. Balbin, Carlos C. Hortinela IV, Ramon G. Garcia, et al.
Pattern recognition of concrete surface crack defects is very important in determining the stability of structures such as buildings, roads, and bridges. Surface cracks are a key subject in inspection, diagnosis, maintenance, and life prediction for structural safety. Traditionally, defects and cracks on concrete surfaces are determined manually by inspection, and internal defects in the concrete would require destructive testing for detection. The researchers created an automated surface crack detection system for concrete using image processing techniques including the Hough transform, LoG weighting, dilation, grayscale conversion, Canny edge detection, and the Haar wavelet transform. An automatic surface crack detection robot is designed to capture the concrete surface using a sectoring method. Surface crack classification is performed with a Haar-trained cascade object detector that uses both positive and negative samples, demonstrating that surface crack defects can be identified effectively.
Recognizing Chinese characters in digital ink from non-native language writers using hierarchical models
When Chinese is learned as a second language, its characters are taught step by step, from strokes to components and radicals and their complex relations. Chinese characters in digital ink written by non-native writers are seriously deformed, so global recognition approaches perform poorly. A progressive bottom-up approach based on hierarchical models is therefore presented. The hierarchical information includes strokes and hierarchical components, and each Chinese character is modeled as a hierarchical tree. Strokes in a digital-ink character are classified with hidden Markov models and concatenated into a stroke symbol sequence, and then the component structure of the ink character is extracted. Based on the extraction result and the stroke symbol sequence, candidate characters are traversed and scored, and the recognition candidates are finally listed in descending order of score. The method is validated on 19,815 samples of handwritten Chinese characters written by foreign students.
Fault prevention by early stage symptoms detection for automatic vehicle transmission using pattern recognition and curve fitting
Jessie R. Balbin, Febus Reidj G. Cruz, Jon Ervin A. Abu, et al.
Automobiles have become an essential part of everyday life, and many factors affecting a vehicle can cause inconvenience or, in some cases, harm to lives or property. This work therefore focuses on detecting early symptoms in the engine, body, and other parts of an automatic-transmission vehicle that produce vibration and sound, using MATLAB, in order to help prevent car problems. Sound, vibration, and temperature sensors detect defects in the car, and a wireless transmitter and receiver gather the data, making the system easy to install on the vehicle. A technique from Toyota Balintawak Philippines, in which every car is treated as panels (a, b, c, d, and e), with 'a' spanning from the hood to the front wheel and 'e' from the rear shield to the back of the car, was applied to place the sensors properly so that precise data could be gathered. The gathered data are compared with a reference graph taken from the vehicle's normal status or performance; data exceeding the normal graph by 50% are considered to indicate that a problem has occurred. The system is designed to prevent accidents by determining the current status or performance of the vehicle and keeping people away from harm.
Improved convolutional networks in forest species identification task
Kar Fai Siew, Xin Jie Tang, Yong Haur Tay
Forest species identification is a special case of the texture classification problem and can be solved with hand-crafted features. Convolutional networks (ConvNets) are able to learn features adaptively and have achieved impressive results in complicated recognition tasks. This paper presents an improvement to a previous ConvNet-based approach [1] for forest species identification. Due to the small amount of training data, we propose adding a dropout layer to the ConvNet architecture and using data augmentation to increase the size of the training set. A new classification process that combines the ConvNet outputs of individual image patches is also proposed. Our improved ConvNet-based method achieves promising results.
Research and implementation of finger-vein recognition algorithm
Zengyao Pang, Jie Yang, Yilei Chen, et al.
In finger vein image preprocessing, finger angle correction and ROI extraction are important parts of the system. In this paper, we propose an angle correction algorithm based on the centroid of the vein image and extract the ROI using a bidirectional gray projection method. Inspired by the fact that vein areas appear as valleys, a novel method is proposed to extract the center and width of the veins based on multi-directional gradients; it is easy to compute, fast, and stable. On this basis, an encoding method is designed to determine the gray value distribution of the texture image, which effectively reduces errors at the edges of the texture extraction. Finally, the system achieves higher robustness and recognition accuracy by using fuzzy threshold determination and a global gray value matching algorithm. Experimental results on pairs of matched images show that the proposed method has an EER of 3.21% and extracts features at a speed of 27 ms per image. It can be concluded that the proposed algorithm has clear advantages in texture extraction efficiency, matching accuracy, and overall algorithmic efficiency.
Comparison expert and novice scan behavior for using e-learning
Felisia Novita Sari, Paulus Insap Santosa, Sunu Wibirama
E-learning is an important medium that educational institutions must have, and successful information design for e-learning depends on the characteristics of its users. This study explores differences between the eye movement data of novice and expert users, which are compared and identified based on gaze features. Each participant performed three main e-learning tasks. The results show that there are differences between the gaze features of experts and novices.
Driver face tracking using semantics-based feature of eyes on single FPGA
Ying-Hao Yu, Ji-An Chen, Yi-Siang Ting, et al.
Tracking the driver's face is essential for driving safety control. Such systems are usually designed with complicated algorithms that recognize the driver's face on powerful computers. The design problem concerns not only the detection rate but also parts damage under harsh conditions of vibration, heat, and humidity. A feasible strategy to counteract such damage is to integrate the entire system into a single chip to minimize installation size, weight, power consumption, and exposure to air. Meanwhile, a special methodology is indispensable to overcome the dilemma of low computing capability versus real-time performance on a low-end chip. In this paper, a novel driver face tracking system is proposed that employs semantics-based vague image representation (SVIR) for minimal hardware resource usage on an FPGA while guaranteeing real-time performance. Our experimental results indicate that the proposed face tracking system is viable and promising for future smart car designs.
Face Recognition
Finessing filter scarcity problem in face recognition via multi-fold filter convolution
Cheng-Yaw Low, Andrew Beng-Jin Teoh
Deep convolutional neural networks for face recognition, from DeepFace to the recent FaceNet, demand a sufficiently large set of filters for feature extraction, in addition to being deep. Shallow filter-bank approaches, e.g., the principal component analysis network (PCANet), binarized statistical image features (BSIF), and other analogous variants, suffer from the filter scarcity problem: not all available PCA and ICA filters are discriminative enough to abstract noise-free features. This paper extends our previous work on multi-fold filter convolution (m-FFC), where the pre-learned PCA and ICA filter sets are exponentially diversified by m folds to instantiate PCA, ICA, and PCA-ICA offspring. The experimental results show that the 2-FFC operation resolves the filter scarcity problem. The 2-FFC descriptors are also shown to be superior to those of PCANet, BSIF, and other face descriptors in terms of rank-1 identification rate (%).
Method for secure electronic voting system: face recognition based approach
M. Affan Alim, Misbah M. Baig, Shahzain Mehboob, et al.
In this paper, we propose a framework for a low-cost, secure electronic voting system based on face recognition. Local binary patterns (LBP) are used to characterize facial features as texture, and the chi-square distribution is then used for image classification. Two parallel systems, based on a smartphone application and a web application, are developed for the face learning and verification modules. The proposed system has two-tier security, using a person ID followed by face verification, and a class-specific threshold controls the security level of the face verification. Our system is evaluated on three standard databases and one real home-made database and achieves satisfactory recognition accuracies. Consequently, the proposed system provides a secure, hassle-free voting system that is less intrusive than other biometrics.
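The LBP and chi-square pieces named above can be sketched as follows, using scikit-image's uniform LBP and a standard chi-square histogram distance; the LBP parameters and the class-specific threshold are assumptions, not the paper's settings.

```python
# Sketch of LBP texture description plus chi-square matching for face verification.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray, P=8, R=1):
    lbp = local_binary_pattern(gray, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def chi_square(h1, h2, eps=1e-10):
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

# Verification: accept the claimed identity if chi_square(probe, enrolled) < class threshold.
```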
Video-based face recognition via convolutional neural networks
Tianlong Bao, Chunhui Ding, Saleem Karmoshi, et al.
Face recognition has been widely studied recently, but video-based face recognition remains a challenging task because of the low quality and large intra-class variation of face images captured from video. In this paper, we focus on two scenarios of video-based face recognition: 1) Still-to-Video (S2V) face recognition, i.e., querying a still face image against a gallery of video sequences, and 2) Video-to-Still (V2S) face recognition, the converse of the S2V scenario. A novel method is proposed to map still and video face images into a Euclidean space with a carefully designed convolutional neural network, and Euclidean metrics are then used to measure the distance between still and video images. The identities of still and video images grouped as pairs are used as supervision. In the training stage, a joint loss function that measures the Euclidean distance between the predicted features of training pairs and expanded vectors of still images is optimized to minimize the intra-class variation, while the inter-class variation is preserved by the large margin of the still images. The transferred features are finally learned by the designed convolutional neural network. Experiments performed on the COX face dataset show that our method achieves reliable performance compared with other state-of-the-art methods.
Application of OpenCV in Asus Tinker Board for face recognition
Wei-Yu Chen, Frank Wu, Chung-Chiang Hu
The rise of the Internet of Things has promoted the development of technology development boards; as processor speeds and memory capacities increase, more and more applications can be completed directly on the board, with the results then organized and sent over the network to the cloud for processing, so that the development board at the front end is no longer simply a data-collection device. This study uses the Asus Tinker Board with OpenCV installed for real-time face recognition and face capture. The captured face is sent to the Microsoft Cognitive Services cloud for artificial intelligence comparison to determine the mood the face represents and the name of the corresponding person, and finally text-to-speech is used to read out that name, completing the identification. The study was developed on the Asus Tinker Board, which uses an ARM-based CPU with high efficiency and low power consumption, together with improvements in the memory and hardware performance of the development board.
Multimodal recognition based on face and ear using local feature
Pose variation, which may cause a loss of useful information, has always been a bottleneck in face and ear recognition. To address this problem, we propose a multimodal recognition approach based on face and ear using local features, which is robust to large facial pose variations in unconstrained scenes. A deep learning method is used for facial pose estimation, and a well-trained Faster R-CNN is used to detect and segment the face and ear regions. We then propose a weighted region-based recognition method to handle the local features. The proposed method achieves state-of-the-art recognition performance, especially when the images are affected by pose variations and random occlusion in unconstrained scenes.
Upright detection of in-plane rotated face images with complicated background for organizing photos
Digital cameras and smartphones with orientation sensors allow auto-rotation of portrait images. Auto-rotation is performed using the image file's metadata in the exchangeable image file format (EXIF): the sensor output is used to set the EXIF orientation flag, which reflects the positioning of the camera with respect to the ground. Unfortunately, software support for this feature is not widespread or consistently applied. Our research goal is to create the EXIF orientation flag by detecting the upright direction of face images that have no orientation flag, and to apply this in photo-organizing software. In this paper, we propose a novel upright detection scheme for face images that relies on generating rotated images in four directions and on part-based face detection with Haar-like features. The input images are frontal faces rotated in-plane into one of four possible directions. Among the four rotated images, if face detection accepts exactly one rotated image and rejects the other three, the upright direction is obtained from the accepted direction. The EXIF rotation angle is 0 degrees, 90 degrees clockwise, 90 degrees counter-clockwise, or 180 degrees. Experimental results on 450 face image samples show that the proposed method is very effective in detecting the upright direction of face images with background variations.
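The four-orientation test can be sketched with OpenCV's stock frontal-face Haar cascade as a stand-in for the paper's part-based detector; the cascade file path assumes the opencv-python package, and the detector parameters are illustrative.

```python
# Hedged sketch of the four-rotation upright test described above.
import cv2

ROTATIONS = [None, cv2.ROTATE_90_CLOCKWISE, cv2.ROTATE_180, cv2.ROTATE_90_COUNTERCLOCKWISE]

def detect_upright(gray):
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    hits = []
    for rot in ROTATIONS:
        img = gray if rot is None else cv2.rotate(gray, rot)
        faces = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=5)
        hits.append(len(faces) > 0)
    # Accept only when exactly one of the four orientations yields a face detection.
    return hits.index(True) if hits.count(True) == 1 else None
```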
Face pose tracking using the four-point algorithm
Ho Yin Fung, Kin Hong Wong, Ying Kin Yu, et al.
In this paper, we develop an algorithm to track the pose of a human face robustly and efficiently. Face pose estimation is very useful in many applications, such as building virtual reality systems and creating an alternative input method for the disabled. First, we modified the Dlib face detection toolbox to detect a face in front of a camera. The detected facial features are passed to a pose estimation method, known as the four-point algorithm, for pose computation. The theory applied and the technical problems encountered during system development are discussed in the paper. It is demonstrated that the system is able to track the pose of a face in real time using a consumer-grade laptop computer.
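As a rough illustration of the problem setup only (not the paper's own four-point algorithm), four detected facial landmarks can be fed to OpenCV's solvePnP with the AP3P solver, which requires exactly four correspondences; the 3D model coordinates and camera intrinsics below are placeholder assumptions.

```python
# Hedged sketch: head pose from four 2D facial landmarks via cv2.solvePnP (AP3P solver).
import cv2
import numpy as np

MODEL_POINTS = np.array([[0.0, 0.0, 0.0],        # nose tip (mm, rough guess)
                         [-30.0, -30.0, -30.0],  # left eye outer corner
                         [30.0, -30.0, -30.0],   # right eye outer corner
                         [0.0, 40.0, -20.0]])    # chin

def estimate_pose(image_points, focal, center):
    K = np.array([[focal, 0, center[0]],
                  [0, focal, center[1]],
                  [0, 0, 1]], dtype=float)       # pinhole intrinsics, no distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, np.asarray(image_points, dtype=float),
                                  K, None, flags=cv2.SOLVEPNP_AP3P)
    return rvec, tvec                            # head rotation (Rodrigues) and translation
```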
Image Segmentation
Multiple images segmentation based on saliency map
XiaoLan Ning, Cheng Xu, SiQi Li, et al.
Co-segmentation is an effective method for discovering and segmenting common objects from multiple images; making full use of the relationships between images is more accurate than segmenting a single image alone. The first step processes each single image, employing hierarchical segmentation to obtain a contour map, saliency detection to obtain the saliency map, and object detection to find the possible common parts. A digraph is then constructed over the multiple local regions and processed. When the digraph is constructed, the correspondence between adjacent images strongly influences the co-segmentation results, so this paper develops a method to order the images to be co-segmented. We test the method on the iCoseg and MSRC datasets and compare it with four previously proposed methods. The results show that it performs co-segmentation efficiently with higher precision than many existing co-segmentation methods.
Carotid artery B-mode ultrasound image segmentation based on morphology, geometry and gradient direction
I. Made Gede Sunarya, Eko Mulyanto Yuniarno, Mauridhi Hery Purnomo, et al.
The carotid artery (CA) is one of the vital vessels in the human body. CA features that can be used include position, size, and volume; the position feature can be used for the preliminary initialization of tracking. CA features can be examined with ultrasound, but ultrasound imaging is operator dependent, so images obtained by two or more different operators may differ, which can affect the determination of the CA. To reduce this subjectivity, the position of the CA can be determined automatically. In this study, the proposed method segments the CA in B-mode ultrasound images based on morphology, geometry, and gradient direction. The study consists of three steps: data collection, preprocessing, and artery segmentation. The data were acquired directly by the researchers and taken from the Brno University signal processing lab database, with each data set containing 100 carotid artery B-mode ultrasound images. The artery is modeled as an ellipse with center c, major axis a, and minor axis b. The proposed method achieves high scores on each data set: 97% (data set 1), 73% (data set 2), and 87% (data set 3). The segmentation results will then be used for tracking the CA.
Automatic airline baggage counting using 3D image segmentation
The number of bags needs to be checked automatically during baggage self-check-in. A fast airline baggage counting method is proposed in this paper using image segmentation based on a height map projected from the scanned 3D point cloud of the baggage. There is a height drop at the actual edges of a bag, so the edges can be detected by an edge detection operator. Closed edge chains are then formed from the edge lines, which are linked by morphological processing. Finally, the number of connected regions segmented by the closed chains is taken as the number of bags. A multi-bag experiment performed under different placement modes demonstrates the validity of the method.
Automated segmentation and isolation of touching cell nuclei in cytopathology smear images of pleural effusion using distance transform watershed method
Khin Yadanar Win, Somsak Choomchuay, Kazuhiko Hamamoto
The automated segmentation of cell nuclei is an essential stage in the quantitative image analysis of cell nuclei extracted from smear cytology images of pleural fluid. Cell nuclei can indicate cancer, as their characteristics are associated with cell proliferation and malignancy in terms of size, shape, and stain color. Nevertheless, automatic nuclei segmentation has remained challenging due to artifacts caused by slide preparation and nuclei heterogeneity such as poor contrast, inconsistent stain color, cell variation, and overlapping cells. In this paper, we propose a watershed-based method capable of segmenting the nuclei of a variety of cells from cytology pleural fluid smear images. First, the original image is preprocessed by converting it to grayscale and enhancing it by adjusting and equalizing the intensity using histogram equalization. Next, the cell nuclei are segmented into a binary image using Otsu thresholding, and undesirable artifacts are eliminated using morphological operations. Finally, the distance-transform-based watershed method is applied to isolate touching and overlapping cell nuclei. The proposed method is tested on 25 Papanicolaou (Pap) stained pleural fluid images and achieves an accuracy of 92%. The method is relatively simple, and the results are very promising.
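The pipeline above follows a fairly standard OpenCV recipe, so a compact sketch is given below; the kernel sizes, iteration counts, and the 0.5 distance-transform cutoff are assumptions rather than the paper's tuned values.

```python
# Sketch of Otsu thresholding + morphology + distance-transform watershed for cell nuclei.
import cv2
import numpy as np

def segment_nuclei(bgr):
    gray = cv2.equalizeHist(cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY))          # enhance contrast
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    kernel = np.ones((3, 3), np.uint8)
    clean = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel, iterations=2)  # drop small artifacts
    dist = cv2.distanceTransform(clean, cv2.DIST_L2, 5)
    _, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, 0)              # nucleus seeds
    sure_fg = np.uint8(sure_fg)
    sure_bg = cv2.dilate(clean, kernel, iterations=3)
    unknown = cv2.subtract(sure_bg, sure_fg)
    _, markers = cv2.connectedComponents(sure_fg)
    markers = markers + 1
    markers[unknown == 255] = 0
    return cv2.watershed(bgr, markers)      # label image; boundaries between touching nuclei = -1
```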
Image Transformation and Analysis
Long-range correlation and wavelet transform analysis of solar magnetic activity
Statistical data processing is one of the most important activities in many fields of scientific study and is often the only way to examine the underlying processes of a given phenomenon. The two classical approaches to solar time series analysis operate in the space domain and the spectral domain. In the present paper, the relative phase relationship of sunspot unit area on the two hemispheres is investigated by long-range correlation and wavelet transform analysis. It is found that (1) the north-south asynchrony of sunspot unit area cannot be regarded as a stochastic phenomenon because its behavior exhibits a long-term tendency; (2) the leading hemisphere of sunspot unit area is the southern hemisphere before 1962 and the northern hemisphere from then until 2008; and (3) sunspot unit area should be used to represent long-term solar magnetic activity. Our results could guide further research on the physical mechanisms of the north-south asynchrony of magnetic activity on the Sun. Moreover, long-range correlation analysis and the wavelet transform of solar time series provide crucial information for understanding, describing, and predicting long-term solar variability.
Research on image registration based on D-Nets
Cengceng Wu, Zhaoguang Liu, Hongtan Cheng
Image registration is a key, widely used technology in digital imaging applications. In this paper, we study image registration techniques and propose an innovation built on the D-Nets image registration algorithm. We first process each image to obtain a synthetic image combining the original image and an enhanced image, and then extract SIFT features from the original image. Next, to reduce image noise, we apply a Gaussian filter to the synthesized image. We then perform image registration experiments on the synthetic images using the D-Nets algorithm. Compared with the existing method, the proposed approach greatly improves accuracy and recall.
Blind technique using blocking artifacts and entropy of histograms for image tampering detection
Manu V. T., B. M. Mehtre
Tremendous technological advancements in recent times have enabled people to create, edit, and circulate images more easily than ever before. As a result, ensuring the integrity and authenticity of images has become challenging. Malicious editing of images to deceive the viewer is referred to as image tampering. A widely used tampering technique is image splicing or compositing, in which regions from different images are copied and pasted together. In this paper, we propose a tamper detection method utilizing the blocking and blur artifacts that are the footprints of splicing. Images are classified as tampered or not based on the standard deviations of the entropy histograms and of block discrete cosine transforms. If an image is classified as tampered, the exact boundaries of the tampered area can be detected. Experimental results on publicly available image tampering datasets show that the proposed method outperforms existing methods in terms of accuracy.
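A hedged sketch of the two feature families named above is given below: the spread of 8x8 block-DCT coefficient statistics and the entropy of an intensity histogram. The block size and the downstream classifier and threshold are not taken from the paper.

```python
# Illustrative features only: block-DCT standard deviations and histogram entropy.
import cv2
import numpy as np

def block_dct_std(gray, block=8):
    h = (gray.shape[0] // block) * block
    w = (gray.shape[1] // block) * block
    g = np.float32(gray[:h, :w])
    stds = [cv2.dct(g[y:y + block, x:x + block]).std()
            for y in range(0, h, block) for x in range(0, w, block)]
    return float(np.std(stds))              # spread of blocking behaviour over the image

def histogram_entropy(gray, bins=256):
    p, _ = np.histogram(gray, bins=bins, range=(0, 256), density=True)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))   # Shannon entropy of the intensity histogram
```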
Smart mapping for quick detection of dissimilar binary images
In previous work, a probabilistic image matching model for binary images was developed that predicts the number of mappings required to detect dissimilarity between any pair of binary images based on the amount of similarity between them. The model showed that dissimilarity can be detected quickly by randomly comparing corresponding points between two binary images. In this paper, we improve this speed for images whose dissimilarity is concentrated near their centers. We apply smart mapping schemes to different image sets and analyze the results to show the effectiveness of this mapping approach for such images. We compare three different smart mapping schemes with three different mapping densities to find the best mapping and density combination.
Content-based image retrieval using scale invariant feature transform and gray level co-occurrence matrix
The rapid growth of different types of images has posed a great challenge to the scientific community. As the number of images increases every day, organizing them for efficient and easy access is becoming a challenging task. The field of image retrieval attempts to solve this problem through various techniques. This paper proposes a novel image retrieval technique combining the Scale Invariant Feature Transform (SIFT) and the co-occurrence matrix. To construct the feature vector, SIFT descriptors of grayscale images are computed and normalized using z-score normalization, followed by construction of the gray-level co-occurrence matrix (GLCM) of the normalized SIFT keypoints. The constructed feature vector is matched against those of the database images to retrieve visually similar images. The proposed method is tested on the Corel-1K dataset, and performance is measured in terms of precision and recall. Experimental results demonstrate that the proposed method outperforms several other state-of-the-art methods.
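The feature-construction chain described above (SIFT descriptors, z-score normalization, then a GLCM over the quantized result) can be sketched as below; it assumes OpenCV 4.4+ for SIFT_create and a recent scikit-image for graycomatrix, and the quantization to 16 levels is an assumption.

```python
# Hedged sketch of SIFT + z-score + GLCM feature construction (image matching not shown).
import cv2
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def sift_glcm_feature(gray):
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(gray, None)                  # N x 128 SIFT descriptors
    z = (desc - desc.mean(axis=0)) / (desc.std(axis=0) + 1e-8)   # z-score normalization
    q = np.uint8(np.clip((z - z.min()) / (np.ptp(z) + 1e-8) * 15, 0, 15))  # 16 gray levels
    glcm = graycomatrix(q, distances=[1], angles=[0], levels=16,
                        symmetric=True, normed=True)
    props = ("contrast", "correlation", "energy", "homogeneity")
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])
```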
Comparing the performance of different ultrasonic image enhancement techniques for speckle noise reduction in ultrasound images: a preference study
Md. Shohel Rana, Kaushik Sarker, Touhid Bhuiyan, et al.
Diagnostic ultrasound (US) is an important tool in today's sophisticated medical diagnostics. Nearly every medical discipline benefits from this relatively inexpensive method, which provides a view of the inner organs of the human body without exposing the patient to any harmful radiation. Medical diagnostic images are usually corrupted by noise during acquisition, and most of this noise is speckle noise. To address this problem, instead of the widely used adaptive filters, non-local means based filters have been used to de-noise the images. Ultrasound images of several body regions, namely the abdomen, ortho, liver, kidney, breast, and prostate, were used in a comparative analysis. These images were taken with a Siemens SONOLINE G60 S system, and the outputs were compared using metrics such as SNR, RMSE, PSNR, IMGQ, and SSIM. The significance and comparison results are presented in tabular form.
Action description using point clouds
Wenping Liu, Yongfeng Jiang, Haili Wang, et al.
An action description method named Motion History Point Cloud (MHPC) is proposed in this paper. MHPC compresses an action into a three-dimensional point cloud that requires depth information. In MHPC, the spatial coordinate channels record the motion foreground and the color channels record the temporal variation. Because it contains depth information, MHPC can depict an action in more detail than the Motion History Image (MHI). MHPC can serve as a pre-processed input to various classification methods, such as bag of words and deep learning. An action recognition scheme is provided as an application example of MHPC: the Harris3D detector and Fast Point Feature Histograms (FPFH) are used to extract and describe features from the MHPC, and then bag of words and a multi-class support vector machine (SVM) are used for action recognition. The experiments show that rich features can be extracted from MHPC to support subsequent action recognition even after downsampling. The feasibility and effectiveness of MHPC are also verified by comparing the above scheme with two similar methods.
A new non-uniformity correction method based on unidirectional variational model
Jing Hu, Fan Liu, Liuting Yan
Scanning infrared imaging systems often suffer from stripe non-uniformity. Owing to the geometric characteristics of stripe non-uniformity in scanned images, the pixel gradient across the scanning direction is much larger than that along the scanning direction, and the latter is more similar to the real scene. The reason is that pixels along the scanning direction share uniform response parameters, whereas those across the scanning direction have non-uniform parameters. Therefore, a homogenization method based on a unidirectional variational model is proposed in this paper. The unidirectional variational model minimizes the gradient across the scanning direction, while the homogenization method preserves edges and detail along the scanning direction. Experimental results demonstrate the good performance of the proposed method on images with stripe non-uniformity.
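The paper's variational solver is not reproduced here; as a greatly simplified stand-in for the same idea (stripe offsets vary only across the scanning direction, assumed here to be rows), the sketch below estimates and subtracts per-row offsets.

```python
# Greatly simplified destriping sketch, not the unidirectional variational model itself.
import numpy as np
from scipy.ndimage import median_filter

def destripe_rows(img):
    img = img.astype(np.float64)
    row_means = img.mean(axis=1)                # per-row level along the scanning direction
    smooth = median_filter(row_means, size=9)   # scene component, assumed slowly varying
    offsets = row_means - smooth                # residual treated as stripe bias
    return img - offsets[:, None]
```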
Medical Image Analysis and Processing
SVM-based automatic diagnosis method for keratoconus
Yuhong Gao, Qiang Wu, Jing Li, et al.
Keratoconus is a progressive corneal disease that can lead to severe myopia and astigmatism, or even to corneal transplantation if it worsens. Early detection of keratoconus is extremely important for monitoring and controlling the condition. In this paper, we propose an automatic diagnosis algorithm for keratoconus that discriminates normal eyes from keratoconic ones. We select the parameters obtained by the Oculyzer as corneal features, which characterize the cornea both directly and indirectly. In our experiment, 289 normal cases and 128 keratoconus cases are divided into training and test sets. Far better than other kernels, the linear-kernel SVM achieves a sensitivity of 94.94% and a specificity of 97.87% when the model is trained with all parameters. In single-parameter experiments with the linear kernel, elevation (92.03% sensitivity, 98.61% specificity) and thickness (97.28% sensitivity, 97.82% specificity) show good classification ability. Combining corneal elevation and thickness, the proposed method reaches 97.43% sensitivity and 99.19% specificity. The experiments demonstrate that the proposed automatic diagnosis method is feasible and reliable.
Automatic bone outer contour extraction from B-modes ultrasound images based on local phase symmetry and quadratic polynomial fitting
Tita Karlita, Eko Mulyanto Yuniarno, I. Ketut Eddy Purnama, et al.
Analyzing ultrasound (US) images to obtain the shapes and structures of particular anatomical regions is an interesting field of study, since US imaging is a non-invasive method for capturing the internal structures of the human body. However, bone segmentation in US images is still challenging because the images are strongly affected by speckle noise and have poor quality. This paper proposes a combination of local phase symmetry and quadratic polynomial fitting to extract the bone outer contour (BOC) from two-dimensional (2D) B-mode US images as an initial step toward three-dimensional (3D) bone surface reconstruction. Using local phase symmetry, the bone is first extracted from the US images. The BOC is then extracted by scanning for one pixel on the bone boundary in each column of the image using a first-phase feature search. Quadratic polynomial fitting is used to refine and estimate pixel locations that fail to be detected during extraction, and a hole-filling step uses the polynomial coefficients to fill the gaps with new pixels. The proposed method is able to estimate the new pixel positions and ensures smoothness and continuity of the contour path. Evaluations on cow and goat bones compare the resulting BOCs with contours produced by manual segmentation and by Canny edge detection. The evaluation shows that the proposed method produces excellent results, with an average MSE of 0.65 before and after hole filling.
Facial fluid synthesis for assessment of acne vulgaris using luminescent visualization system through optical imaging and integration of fluorescent imaging system
Jessie R. Balbin, Jennifer C. Dela Cruz, Clarisse O. Camba, et al.
Acne vulgaris, commonly called acne, is a skin problem that occurs when oil and dead skin cells clog a person's pores, typically because hormonal changes make the skin oilier. The problem is that people do not have a real assessment of their skin's sensitivity in terms of the fluid development on their faces that tends to lead to acne vulgaris, and therefore suffer more complications. This research aims to assess acne vulgaris using a luminescent visualization system through optical imaging and the integration of image processing algorithms. Specifically, it aims to design a prototype for facial fluid analysis using a luminescent visualization system through optical imaging and an integrated fluorescent imaging system, and to classify the different facial fluids present in each person. During processing, some structures and layers of the face are excluded, leaving only a mapped facial structure with acne regions. Facial fluid regions are distinguished from the acne regions as they are characterized differently.
Experiments on automatic classification of tissue malignancy in the field of digital pathology
J. Pereira, R. Barata, Pedro Furtado
Automated analysis of histological images helps diagnose and further classify breast cancer. Fully automated approaches can be used to pinpoint images for further analysis by the medical doctor. But tissue images are especially challenging for either manual or automated approaches, due to mixed patterns and textures, where malignant regions are sometimes difficult to detect unless they are in very advanced stages. Some of the major challenges relate to irregular and very diffuse patterns, as well as the difficulty of defining winning features and classifier models. Although the diffuse nature also makes it hard to segment the image correctly into regions, it is still crucial to compute low-level features over individual regions instead of the whole image and to select those with the best outcomes. In this paper we report on our experiments building a region classifier with a simple subspace division and a feature selection model that improves results over image-wide and/or limited feature sets. Experimental results show modest accuracy for a set of classifiers applied over the whole image, while the combination of image division, per-region low-level feature extraction and feature selection, together with a neural network classifier, achieved the best accuracy for the dataset and settings used in the experiments. Future work involves deep learning techniques, adding structure semantics, and embedding the approach as a tumor-finding helper in a practical medical imaging application.
Image Processing and Applications
Color vision deficiency compensation for Visual Processing Disorder using Hardy-Rand-Rittler test and color transformation
Jessie R. Balbin, Jasmine Nadja J. Pinugu, Joshua Ian C. Bautista, et al.
Visual processing skills are used to gather visual information from the environment; however, in some cases a Visual Processing Disorder (VPD) occurs. So-called visual figure-ground discrimination is a type of VPD in which color is one of the contributing factors. Color plays a vital role in everyday living, but individuals with limited and inaccurate color perception suffer from Color Vision Deficiency (CVD) and are often unaware of their condition. To address this, this study focuses on the design of KULAY, a head-mounted display (HMD) device that can assess whether a user has CVD through the standard Hardy-Rand-Rittler (HRR) test, which uses pattern recognition to evaluate the user. In addition, color vision deficiency simulation and color correction through color transformation are also considered, enabling people with normal color vision to see how a color vision deficient person perceives colors, and vice versa. For accuracy, the results of the simulated HRR assessment were validated against an actual assessment performed by a doctor. For the precision of the color transformation, the Structural Similarity Index Measure (SSIM) was used to compare the simulated CVD images and the color-corrected images with other reference sources. The outputs of the simulated HRR assessment and the color transformation show very promising results, indicating the effectiveness and efficiency of the study. Owing to its form factor and portability, the device is beneficial to the fields of medicine and technology.
Profiling and sorting Mangifera Indica morphology for quality attributes and grade standards using integrated image processing algorithms
Jessie R. Balbin, Janette C. Fausto, John Michael M. Janabajab, et al.
Mango production is vital in the Philippines and essential to the food industry, as mangoes are used in markets and restaurants daily. The quality of mangoes affects a mango farmer's income, so harvesting at the wrong time results in the loss of quality mangoes and income. Scientific farming, supported by new devices, is much needed nowadays because mango wastage increases annually due to poor quality. This research paper focuses on profiling and sorting Mangifera indica using image processing techniques and pattern recognition. The image of a mango is captured on a weekly basis from its early stage, and the researchers monitor the growth and color transition of the mango for profiling purposes. The actual dimensions of the mango are determined through image conversion and the determination of pixel and RGB values in MATLAB. A program is developed to determine the size range of a standard ripe mango. Hue, lightness, saturation (HSL) correction is used in the filtering process to ensure the accuracy of the RGB values of a mango subject. Using pattern recognition, the program can determine whether a mango is standard and ready to be exported.
Progressive 3D shape abstraction via hierarchical CSG tree
Xingyou Chen, Jin Tang, Chenglong Li
A constructive solid geometry (CSG) tree model is proposed to progressively abstract the 3D geometric shape of a general object from a 2D image. Unlike conventional approaches, our method applies to general objects without the need for massive CAD model collections and represents object shapes in a coarse-to-fine manner that allows users to view intermediate shape representations at any time. It occupies a transitional position between 2D image features and CAD models: it benefits from state-of-the-art object detection approaches, provides a better CAD model initialization for finer fitting, and estimates the 3D shape and pose parameters of the object at different levels, according to a visual perception objective, in a coarse-to-fine manner. The two main contributions are the application of the CSG construction procedure to visual perception and the ability to extend the object estimation result into a model more flexible and expressive than 2D/3D primitive shapes. Experimental results demonstrate the feasibility and effectiveness of the proposed approach.
Lane marking detection based on waveform analysis and CNN
Yang Yang Ye, Hou Jin Chen, Xiao Li Hao
Lane marking detection is a very important part of advanced driver assistance systems (ADAS) for avoiding traffic accidents. To obtain accurate lane markings, this work proposes a novel and efficient algorithm that analyzes the waveform generated from the road image after inverse perspective mapping (IPM). The algorithm includes two main stages: the first stage applies image preprocessing, including a CNN, to suppress the background and enhance the lane markings; the second stage computes the waveform of the road image and analyzes it to obtain the lanes. The contribution of this work is the introduction of local and global waveform features for detecting lane markings. The results indicate that the proposed method is robust in detecting and fitting lane markings.
Interactive QR code beautification with full background image embedding
Lijian Lin, Song Wu, Sijiang Liu, et al.
The QR (Quick Response) code is a kind of two-dimensional barcode that was first developed in the automotive industry. Nowadays, QR codes are widely used in commercial applications such as product promotion, mobile payment, and product information management. Traditional QR codes that conform to the international standard are reliable and fast to decode but lack the aesthetic appearance needed to present visual information to customers. In this work, we present a novel interactive method for generating aesthetic QR codes. Given the information to be encoded and an image to be embedded as the full QR code background, our method accepts interactive user strokes as hints to remove undesired parts of the QR code modules, relying on the QR code error correction mechanism and background color thresholds. Compared with previous approaches, our method follows the intention of the QR code designer and thus achieves more visually pleasing results while keeping high machine readability.
Gaze inspired subtitle position evaluation for MOOCs videos
Hongli Chen, Mengzhen Yan, Sijiang Liu, et al.
Online educational resources, such as MOOCs, are becoming increasingly popular, especially in higher education. One of the most important media types for MOOCs is the course video. Besides the traditional bottom-positioned subtitles accompanying the videos, researchers have in recent years tried to develop more advanced algorithms to generate speaker-following subtitles. However, the effectiveness of such subtitles is still unclear. In this paper, we investigate the relationship between subtitle position and the learning effect after watching videos on tablet devices. Inspired by image-based human eye tracking techniques, this work combines objective gaze estimation statistics with a subjective user study to reach a convincing conclusion: speaker-following subtitles are more suitable for online educational videos.
Single image super-resolution based on image patch classification
Ping Xia, Hua Yan, Jing Li, et al.
This paper proposes a single-image super-resolution algorithm based on image patch classification and sparse representation, in which gradient information is used to classify image patches into three classes to reflect the differences between the different types of patches. Compared with other classification algorithms, the gradient-based algorithm is simpler and more effective. A corresponding sub-dictionary is learned for each class, and a high-resolution image patch is reconstructed from the dictionary and the sparse representation coefficients of the corresponding class. The experimental results demonstrate that the proposed algorithm performs better than the other algorithms considered.
Training strategy for convolutional neural networks in pedestrian gender classification
Choon-Boon Ng, Yong-Haur Tay, Bok-Min Goi
In this work, we studied a strategy for training a convolutional neural network for pedestrian gender classification with a limited amount of labeled training data. Unsupervised learning by k-means clustering on pedestrian images was used to learn the filters that initialize the first layer of the network. As a form of pre-training, supervised learning on the related task of pedestrian classification was performed. Finally, the network was fine-tuned for gender classification. We found that this strategy improved the network's generalization ability in gender classification, achieving better test results than random weight initialization and proving slightly more beneficial than merely initializing the first-layer filters by unsupervised learning. This shows that unsupervised learning followed by pre-training with pedestrian images is an effective strategy for learning useful features for pedestrian gender classification.
Development of intelligent surveillance system (ISS) in region of interest (ROI) using Kalman filter and camshift on Raspberry Pi 2
Junghun Park, Kicheon Hong
Owing to improvements in the picture quality of closed-circuit television (CCTV), demand for CCTV has increased rapidly and its market size has grown. The current CCTV system structure transfers compressed images, without analysis, from the cameras to a control center. The compressed images are suitable as evidence for a criminal arrest but cannot prevent crime in real time, which has been considered a limitation. Therefore, this paper proposes a system that can prevent crimes efficiently by applying a situation-awareness system at the back end of the CCTV cameras used for image acquisition. In the implemented system, a region of interest (ROI) is set virtually within the image data when a physical barrier, such as a fence, cannot be installed on site, and unauthorized intruders are tracked continuously through data analysis and recognized in the ROI by the developed algorithm. Additionally, a searchlight or alarm is activated to prevent crime in real time, and urgent information is transferred to the control center. The system was implemented on a Raspberry Pi 2 board to run in real time. The experimental results showed a recognition success rate of 85% or higher and a tracking accuracy of 90% or higher. By utilizing the system, crime prevention can be achieved as part of a social safety network.
Video error concealment using block matching and frequency selective extrapolation algorithms
Rajani P. K., Arti Khaparde
Error concealment (EC) is a decoder-side technique for hiding transmission errors by analyzing the spatial or temporal information in the available video frames. Recovering distorted video is very important because video is used in applications such as video telephony, video conferencing, TV, DVD, internet video streaming, and video games. Retransmission-based and resilience-based methods are also used for error removal, but they add delay and redundant data, so error concealment is the preferred option for error hiding. In this paper, the block matching error concealment algorithm is compared with the frequency selective extrapolation algorithm. Both are evaluated on video frames with manually introduced errors. The parameters used for objective quality measurement are PSNR (peak signal-to-noise ratio) and SSIM (structural similarity index). The original video frames and the error frames are processed with both error concealment algorithms. According to the simulation results, frequency selective extrapolation shows better quality measures than the block matching algorithm, with a 48% improvement in PSNR and a 94% increase in SSIM.
Part-based deep representation for product tagging and search
Despite previous studies, tagging and indexing product images remain challenging due to the large intra-class variation of products. In traditional methods, quantized hand-crafted features such as SIFT are extracted to represent product images, but these are not discriminative enough to handle the intra-class variation. For a discriminative image representation, this paper first presents a novel deep convolutional neural network (DCNN) architecture pre-trained on a large-scale general image dataset. Compared with traditional features, our DCNN representation has more discriminative power with fewer dimensions. Moreover, we incorporate a part-based model into the framework to overcome the negative effects of poor alignment and cluttered backgrounds, further enhancing the descriptive ability of the deep representation. Finally, we collect and contribute a well-labeled shoe image database, TBShoes, on which we apply the part-based deep representation for product image tagging and search, respectively. The experimental results highlight the advantages of the proposed part-based deep representation.
The filling-in function of the Bayesian AutoEncoder Network
Kaneharu Nishino, Mary Inaba
We developed the Bayesian AutoEncoder (BAE) to construct a multi-layer restricted Bayesian network by extracting features from a training dataset. Networks constructed using BAE have hidden variables that represent features of the data and can perform inference for each feature. In this paper, we show that a network constructed by BAE can not only recognize features but can also fill in missing data. We performed experiments and confirmed this filling-in ability.
Filter Design and Signal Processing
Almost minimax design of FIR filter using an IRLS algorithm without matrix inversion
Ruijie Zhao, Zhiping Lin, Kar-Ann Toh, et al.
An iterative reweighted least squares (IRLS) algorithm is presented in this paper for the minimax design of FIR filters. In the algorithm, the weighted least squares (WLS) subproblems that arise at each iteration are solved using the conjugate gradient (CG) method instead of time-consuming matrix inversion. An almost minimax filter design is consequently obtained, and it is found to be very efficient compared with most existing algorithms. Moreover, the approach is flexible enough to extend to a broad range of filter designs, including constrained filters. Two design examples are given, and the comparison with other existing algorithms shows the excellent performance of the proposed algorithm.
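A minimal sketch of this kind of IRLS loop is shown below for a Type-I linear-phase filter, with each WLS subproblem solved by conjugate gradients. The Lawson-style weight update is an assumption; the paper's exact reweighting rule and convergence criteria may differ.

```python
# Sketch of an IRLS loop for almost-minimax linear-phase FIR design in which each
# weighted least-squares (WLS) subproblem is solved by conjugate gradients (CG)
# rather than matrix inversion. The Lawson-style reweighting below is an assumption.
import numpy as np
from scipy.sparse.linalg import cg

def irls_minimax_fir(M, omega, desired, iters=50):
    """Find a_k, k=0..M, of a Type-I amplitude A(w) = sum_k a_k cos(k w)."""
    C = np.cos(np.outer(omega, np.arange(M + 1)))   # frequency-domain design matrix
    w = np.ones_like(omega)                         # initial (uniform) LS weights
    a = np.zeros(M + 1)
    for _ in range(iters):
        # WLS normal equations  C^T W C a = C^T W d, solved iteratively by CG
        A = C.T @ (w[:, None] * C)
        b = C.T @ (w * desired)
        a, _ = cg(A, b, x0=a, atol=1e-12)
        err = np.abs(C @ a - desired)
        w = w * err                                 # emphasise frequencies with large error
        w = w / w.sum()                             # normalise to avoid under/overflow
    return a

# usage: lowpass example on a dense grid (transition band left out of the grid)
omega = np.concatenate([np.linspace(0, 0.4 * np.pi, 200), np.linspace(0.5 * np.pi, np.pi, 200)])
desired = (omega <= 0.45 * np.pi).astype(float)
coeffs = irls_minimax_fir(M=30, omega=omega, desired=desired)
```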
Real time eye tracking using Kalman extended spatio-temporal context learning
Real-time eye tracking has numerous applications in human-computer interaction, such as mouse cursor control, and is useful for persons with muscular or motion impairments. However, tracking the movement of the eye is complicated by occlusion due to blinking, head movement, screen glare, rapid eye movements, etc. In this work, we present the algorithmic and construction details of a real-time eye tracking system. Our proposed system extends spatio-temporal context learning with Kalman filtering. Spatio-temporal context learning offers state-of-the-art accuracy in general object tracking, but its performance suffers under object occlusion. Adding the Kalman filter allows the proposed method to model the dynamics of eye motion and provide robust eye tracking in cases of occlusion. We demonstrate the effectiveness of this tracking technique by controlling the computer cursor in real time with eye movements.
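A minimal constant-velocity Kalman filter of the kind that can bridge occlusions (e.g. blinks) is sketched below; the state model and noise covariances are illustrative assumptions, not tuned values from the paper.

```python
# Constant-velocity Kalman filter for smoothing/predicting the tracked eye position.
# During blinks (no measurement) the filter coasts on its prediction.
import numpy as np

dt = 1.0 / 30.0                                    # assumed frame rate
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],        # state: [x, y, vx, vy]
              [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)  # only (x, y) is measured
Q = 1e-2 * np.eye(4)                               # process noise (assumed)
R = 4.0 * np.eye(2)                                # measurement noise (assumed)

x = np.zeros(4)          # state estimate
P = 100.0 * np.eye(4)    # state covariance

def kalman_step(measurement):
    """measurement: (x, y) from the spatio-temporal context tracker, or None when occluded."""
    global x, P
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    if measurement is not None:
        # update only when the tracker supplies a measurement
        z = np.asarray(measurement, float)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
    return x[:2]          # filtered eye position used to drive the cursor
```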
An improved MTI filter for ground clutter reduction in UAV classification
Fangyuan Wan, Qinglai Liu, Chen Wang, et al.
In recent years, Unmanned Aerial Vehicles (UAVs) have increasingly been used in many civil applications. However, they also pose a significant threat in restricted zones. Radar can be used to detect and discriminate UAVs. Due to the low flying altitude of UAVs, the radar signals also include unwanted echoes reflected by buildings, the ground, trees, grasses, etc. Consequently, it has not been possible to obtain clean UAV characteristics for further classification. In this paper, an MTI filter is applied to cancel the ground clutter and, based on this, an improved MTI filter is proposed. Compared with the traditional MTI filter, the improved one significantly enhances the ground clutter rejection capability while maintaining most of the target power. As a result, cleaner UAV classification characteristics can be obtained. The effectiveness of the proposed method has been verified on an experimental CW radar dataset collected from a helicopter UAV.
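The improved filter itself is not specified in the abstract; the sketch below shows only the classical two-pulse and three-pulse MTI cancellers that such a design builds on, as a baseline.

```python
# Classical MTI clutter cancellers applied along slow time (pulse-to-pulse).
# Stationary (zero-Doppler) clutter cancels, moving-target returns survive.
import numpy as np

def two_pulse_canceller(pulses):
    """pulses: complex array of shape (num_pulses, num_range_bins)."""
    return pulses[1:] - pulses[:-1]

def three_pulse_canceller(pulses):
    """Cascade of two two-pulse cancellers: a deeper null around zero Doppler."""
    return pulses[2:] - 2.0 * pulses[1:-1] + pulses[:-2]

# usage: apply before extracting micro-Doppler features for UAV classification
# clutter_free = three_pulse_canceller(cw_radar_slow_time_data)
```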
Mining maximal approximate numerical frequent patterns from uncertain data and application for emitter entity resolution
Numerous fuzzy pattern mining methods have been proposed to address the uncertainty and incompleteness of quantitative data. Traditional fuzzy pattern mining methods generally have to transform the original quantitative values into either crisp items or fuzzy regions first, which is hard to do without comprehensive domain knowledge. In addition, existing numerical pattern mining methods generally suffer from high computational cost. Motivated by these problems, we put forward an efficient maximal approximate numerical frequent pattern mining (MANFPM) method that requires no specification of fuzzy items or regions. Experimental results have validated its scalability and effectiveness for application to emitter entity resolution.
Assessing effect of meditation on cognitive workload using EEG signals
Narendra Jadhav, Ramchandra Manthalkar, Yashwant Joshi
Recent research suggests that meditation affects the structure and function of the brain, and that cognitive load can be handled more effectively by meditators. EEG signals are used to quantify cognitive load; investigating the effect of meditation on cognitive workload using EEG signals recorded before and after a meditation program is an open problem. The subjects for this study are 11 young, healthy engineering students from our institute. The focused-attention meditation practice is used for this study. EEG signals are recorded with an EMOTIV device at the beginning of the program and after four weeks of regular meditation; the subjects practiced meditation for 20 minutes daily over the 4 weeks. Arithmetic additions at 7 levels of difficulty, from single digits (low level) to three digits with carry (high level), are presented as the cognitive load. Cognitive load indices such as arousal index, performance enhancement, neural activity, load index, engagement, and alertness are evaluated before and after the meditation period, and the indices are improved in the post-meditation data. The Power Spectral Density (PSD) feature is compared between pre- and post-meditation recordings across all subjects. The results suggest that the subjects handled the cognitive load without stress (ease of cognitive functioning increased for the same load) after 4 weeks of meditation.
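A minimal sketch of the PSD feature extraction via Welch's method is given below, assuming EEG sampled at 128 Hz (a common EMOTIV rate); the exact frequency bands behind the cognitive-load indices are assumptions here.

```python
# Band power from the PSD of one EEG channel, estimated with Welch's method.
import numpy as np
from scipy.signal import welch

FS = 128  # assumed EMOTIV sampling rate in Hz

def band_power(eeg_channel, band):
    """Integrated PSD of one EEG channel within (f_low, f_high) in Hz."""
    freqs, psd = welch(eeg_channel, fs=FS, nperseg=2 * FS)
    lo, hi = band
    mask = (freqs >= lo) & (freqs <= hi)
    return np.trapz(psd[mask], freqs[mask])

# example: compare theta/beta style indices pre- vs post-meditation
# theta = band_power(channel, (4, 8)); beta = band_power(channel, (13, 30))
```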
Cloud storage based mobile assessment facility for patients with post-traumatic stress disorder using integrated signal processing algorithm
Jessie R. Balbin, Jasmine Nadja J. Pinugu, Abigail Joy S. Basco, et al.
The research aims to build a tool for assessing patients for post-traumatic stress disorder (PTSD). The parameters used are heart rate, skin conductivity, and facial gestures. Facial gestures are recorded using OpenFace, an open-source face recognition program that uses facial action units to track facial movements. Heart rate and skin conductivity are measured through sensors operated using a Raspberry Pi. Results are stored in a database for easy and quick access, and the databases are uploaded to a cloud platform so that doctors have direct access to the data. This research aims to analyze these parameters and give an accurate assessment of the patient.
Analytic radar micro-Doppler signatures classification
Beom-Seok Oh, Zhaoning Gu, Guan Wang, et al.
Due to their capability of capturing the kinematic properties of a target object, radar micro-Doppler signatures (m-DS) play an important role in radar target classification. This is particularly evident from the remarkable number of research papers published every year on m-DS for various applications. However, most of these works rely on the support vector machine (SVM) for target classification. It is well known that training an SVM is computationally expensive due to the search required to locate the support vectors. In this paper, the classifier learning problem is addressed by total error rate (TER) minimization, for which an analytic solution is available. This largely reduces the search time in the learning phase. The analytically obtained TER solution is globally optimal with respect to the classification total error count rate. Moreover, our empirical results show that TER outperforms SVM in terms of classification accuracy and computational efficiency on a five-category radar classification problem.
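To illustrate why an analytic solution avoids the SVM's iterative search, the sketch below trains a regularized least-squares classifier in closed form. This is only a stand-in under our own assumptions; it does not reproduce the paper's TER objective or its weighting of false-positive and false-negative counts.

```python
# Closed-form classifier training: one linear solve instead of an iterative search
# for support vectors. A simplified stand-in for an analytic TER-style solution.
import numpy as np

def closed_form_classifier(X, y, reg=1e-3):
    """X: (n_samples, n_features) m-DS feature matrix; y: labels in {-1, +1}.
    Returns weights w solving (Xb^T Xb + reg*I) w = Xb^T y, with Xb = [X, 1]."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append bias term
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.sign(Xb @ w)
```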
Computer Information Theory and Technology
Mining strong jumping emerging patterns with a novel list data structure
Xiangtao Chen, Ziping Guan
Strong Jumping Emerging Patterns (SJEPs) are data mining patterns with strong discriminating ability for classification. However, recent SJEP mining algorithms are usually based on tree data structures, and such tree-based algorithms find it difficult to achieve excellent performance. In this paper, we propose a novel SJEP mining method named PPSJEP. The algorithm is based on a novel data structure called the NSJEP-list, which is derived from the N-list, and uses NSJEP-lists in place of the tree structure. First, we obtain the NSJEP-lists of individual items from the tree. Then we intersect NSJEP-lists to obtain the NSJEP-lists of longer itemsets, which contain the position information and the count in each class, and we mine the SJEPs from this information. Experiments are performed on six UCI datasets. Comparing with an existing algorithm in terms of running time and classification accuracy, the results show that our algorithm mines SJEPs in less time while achieving the same classification accuracy, especially at lower minimum support thresholds.
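A much-simplified sketch of the core list-intersection step is given below: two sorted lists are merged to build the list of a longer itemset. The node fields are assumptions based on the abstract, and the real NSJEP-list intersection works on pre/post-order codes of a prefix tree rather than plain positions.

```python
# Simplified merge of two sorted lists, each entry carrying a position and
# per-class counts, producing the list of the longer (combined) itemset.
def intersect_lists(list_a, list_b):
    """Each list: sorted tuples (position, count_class1, count_class2)."""
    result, i, j = [], 0, 0
    while i < len(list_a) and j < len(list_b):
        pa, pb = list_a[i][0], list_b[j][0]
        if pa == pb:
            # shared position: the combined itemset occurs here; keep the smaller counts
            result.append((pa,
                           min(list_a[i][1], list_b[j][1]),
                           min(list_a[i][2], list_b[j][2])))
            i += 1
            j += 1
        elif pa < pb:
            i += 1
        else:
            j += 1
    return result
```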
Study of the similarity function in Indexing-First-One hashing
Y.-L. Lai, Z. Jin, B.-M. Goi, et al.
The recently proposed Indexing-First-One (IFO) hashing is a technique particularly adapted for iris template protection, i.e., for IrisCode. However, IFO employs the Jaccard Similarity (JS) measure originating from Min-hashing, which has not yet been adequately discussed. In this paper, we explore the nature of JS in the binary domain and propose a mathematical formulation that generalizes the usage of JS, which is subsequently verified using the CASIA v3-Interval iris database. Our study reveals that the JS applied in IFO hashing is a generalized version of the measure used in Min-hashing to compare two input objects, with the Min-hashing case corresponding to a JS coefficient equal to one. With this understanding, IFO hashing can inherit the useful properties of Min-hashing, i.e., similarity preservation, which is favorable for similarity search or recognition in binary space.
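For reference, the two notions whose relationship the paper formalizes are sketched below: Jaccard similarity of binary vectors and its MinHash estimate. The hash construction here is a generic MinHash, not the IFO scheme itself.

```python
# Jaccard similarity of binary vectors and a generic MinHash signature whose
# per-entry collision probability approximates that similarity.
import numpy as np

def jaccard_binary(a, b):
    """|a AND b| / |a OR b| for two binary vectors."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

def minhash_signature(binary_vec, num_perm=64, seed=0):
    """Index of the first set bit under num_perm random permutations of the positions."""
    rng = np.random.default_rng(seed)
    ones = np.flatnonzero(np.asarray(binary_vec, bool))
    n = len(binary_vec)
    if ones.size == 0:
        return np.full(num_perm, n, dtype=np.int64)   # empty set: sentinel signature
    sig = np.empty(num_perm, dtype=np.int64)
    for k in range(num_perm):
        perm = rng.permutation(n)
        sig[k] = perm[ones].min()   # minimum permuted index over the set positions
    return sig

# P[signature entries agree] ~ Jaccard similarity, so matching signatures estimates JS.
```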
Heterogeneous computing for a real-time pig monitoring system
Younchang Choi, Jinseong Kim, Jaehak Kim, et al.
Video sensor data has been widely used in automatic surveillance applications. In this study, we present a method that automatically detects pigs in a pig room by using depth information obtained from a Kinect sensor. For a real-time implementation, we propose a means of reducing the execution time by applying parallel processing techniques. In general, most parallel processing techniques have been used to parallelize a specific task. In this study, we consider parallelization of an entire system that consists of several tasks. By applying a scheduling strategy to identify a computing device for each task and implementing it with OpenCL, we can reduce the total execution time efficiently. Experimental results reveal that the proposed method can automatically detect pigs using a CPU-GPU hybrid system in real time, regardless of the relative performance between the CPU and GPU.
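To illustrate the scheduling strategy described above, the sketch below assigns each pipeline task to the device on which a one-off profiling run measured it to be fastest. The task names and timings are hypothetical, and the actual system implements its kernels in OpenCL.

```python
# Greedy static schedule: profile each task on each device once, then run every
# task on its fastest device. Timings and task names below are illustrative only.
profiled_ms = {
    # task name:        {"cpu": ms, "gpu": ms}   (assumed measurements)
    "depth_preprocess": {"cpu": 6.0, "gpu": 2.5},
    "background_sub":   {"cpu": 9.0, "gpu": 3.0},
    "pig_segmentation": {"cpu": 15.0, "gpu": 4.5},
    "tracking_update":  {"cpu": 1.5, "gpu": 3.5},  # small task: CPU wins (transfer overhead)
}

def assign_devices(profile):
    """Map each task to the device with the smallest measured execution time."""
    return {task: min(times, key=times.get) for task, times in profile.items()}

schedule = assign_devices(profiled_ms)
frame_time = sum(min(t.values()) for t in profiled_ms.values())
print(schedule, f"estimated pipeline time per frame: {frame_time:.1f} ms")
```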
Deep learning application: rubbish classification with aid of an android device
Sijiang Liu, Bo Jiang, Jie Zhan
Deep learning is currently a very hot topic in pattern recognition and artificial intelligence research. Aiming at the practical problem that people often do not know which category a piece of rubbish belongs to, and building on the powerful image classification ability of deep learning, we have designed a prototype system to help users classify rubbish. First, the CaffeNet model was adopted and trained as our classification network on the ImageNet dataset, and the trained network was deployed on a web server. Second, an Android app was developed that lets users capture images of unclassified rubbish, upload them to the web server for back-end analysis, and retrieve the feedback, so that users can conveniently obtain classification guidance on an Android device. Tests on our prototype system show that an image of a single type of rubbish in its original shape can be classified reliably, while an image containing several kinds of rubbish, or rubbish whose shape has changed, may fail to yield a useful classification. Nevertheless, the system shows promising auxiliary value for rubbish classification if the network training strategy is optimized further.
A curriculum-based approach for feature selection
Deepthi Kalavala, Chakravarthy Bhagvati
Curriculum learning is a learning technique in which a classifier learns from easy samples first and then from increasingly difficult samples. Along similar lines, a curriculum-based feature selection framework is proposed for identifying the most useful features in a dataset. Given a dataset, easy and difficult samples are identified first; in general, the number of easy samples is assumed to be larger than the number of difficult samples. Feature selection is then done in two stages. In the first stage, a fast feature selection method that produces feature scores is applied; the feature scores are then updated incrementally with the set of difficult samples. Existing feature selection methods are not incremental in nature, so the entire dataset must be used for feature selection. The use of curriculum learning is expected to decrease the time needed for feature selection while keeping classification accuracy comparable to existing methods. Curriculum learning also allows incremental refinement of the feature selection as new training samples become available. Our experiments on a number of standard datasets demonstrate that feature selection is indeed faster without sacrificing classification accuracy.
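A minimal sketch of the two-stage scoring idea follows; using mutual information as the fast scorer and a simple weighted update for the difficult samples are assumptions, not the paper's exact procedure.

```python
# Two-stage, curriculum-style feature scoring: score on the larger easy subset first,
# then refine the scores incrementally with the difficult subset.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def curriculum_feature_scores(X_easy, y_easy, X_hard, y_hard, hard_weight=0.5):
    """Return per-feature scores; higher means more useful."""
    scores_easy = mutual_info_classif(X_easy, y_easy)   # stage 1: easy samples
    scores_hard = mutual_info_classif(X_hard, y_hard)   # stage 2: difficult samples
    return scores_easy + hard_weight * scores_hard      # incremental refinement

def top_k_features(scores, k):
    """Indices of the k highest-scoring features."""
    return np.argsort(scores)[::-1][:k]
```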