Proceedings Volume 6973

Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2008

View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 16 March 2008
Contents: 7 Sessions, 26 Papers, 0 Presentations
Conference: SPIE Defense and Security Symposium 2008
Volume Number: 6973

Table of Contents

  • Front Matter: Volume 6973
  • Information Assurance and Security
  • Data Mining
  • Intrusion Detection
  • Miscellaneous Topics
  • Miscellaneous Applications
  • Poster Session
Front Matter: Volume 6973
Front Matter: Volume 6973
This PDF file contains the front matter associated with SPIE Proceedings Volume 6973, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and the Conference Committee listing.
Information Assurance and Security
Integrated mandatory access control for digital data
George Hsieh, Gregory Patrick, Keith Foster, et al.
This paper presents an integrated mandatory access control (MAC) framework that incorporates MAC mechanisms at both operating system and application layers for digital data. The framework uses Security-Enhanced Linux (SELinux) as the foundation for MAC at the operating system layer. It uses XACML (eXtensible Access Control Markup Language) as the base mechanism for specifying and embedding information-layer MAC policies. This framework is designed to be general-purpose, flexible, and capable of providing fine-grained access control. This paper also describes a high-level architecture of a prototype being developed for the framework. One targeted application domain for this framework is information sharing and dissemination in a multi-level security environment.
Addressing security issues related to virtual institute distributed activities
Martin R. Stytz, Sheila B. Banks
One issue confounding the development and experimentation of distributed modeling and simulation environments is the inability of the project team to identify and collaborate with resources, both human and technical, from outside the United States. This limitation is especially significant within the human behavior representation area, where topics such as cultural effects research and joint command team behavior modeling require the participation of various cultural and national representatives. To address this limitation, as well as other human behavior representation research issues, the NATO Research and Technology Organization initiated a project to develop a NATO virtual institute that enables more effective and more collaborative research into human behavior representation. However, in building and operating a virtual institute one of the chief concerns must be the cyber security of the institute. Because the institute "exists" in cyberspace, all of its activities are susceptible to cyberattacks, subterfuge, denial of service and all of the vulnerabilities that networked computers must face. In our opinion, for the concept of virtual institutes to be successful and useful, their operations and services must be protected from the threats in the cyber environment. A key to developing the required protection is the development and promulgation of standards for cyber security. In this paper, we discuss the types of cyber standards that are required and how new internet technologies can be exploited to benefit the promulgation, development, maintenance, and robustness of the standards. This paper is organized as follows. Section One introduces the concept of virtual institutes, the expected benefits, and the motivation for our research and for research in this area. Section Two presents background material and a discussion of topics related to virtual institutes (VIs), human behavior and cultural modeling, and network-centric warfare. Section Three contains a discussion of the security challenges that face the virtual institute and the characteristics of the standards that must be employed. Section Four contains our proposal for documentation of the cybersecurity standards. Section Five contains the conclusion and suggestions for further work.
An innovative middle tier design for protecting federal privacy act data
This paper identifies an innovative middle tier technique and design that provides a solid layer of network security for a single source of human resources (HR) data that falls under the Federal Privacy Act. The paper also discusses functionality for both retrieving data and updating data in a secure way. It will be shown that access to this information is limited by a security mechanism that authorizes all connections based on both application (client) and user information.
Mathematical model for security effectiveness figure of merit and its optimization
A new mathematical model for predicting the security figure of merit of an assured information system is proposed. The security effectiveness figure of merit is defined as a multivariate composite function of the strength of the security mechanism, usability, performance, and cost. The problem of determining the optimal set of security controls for a given system is then formulated as a mathematical optimization problem, and potential solution methods are addressed. The concept is illustrated with a simple example, and the conclusions bring out the benefits of the model.
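The abstract does not give the composite function itself, so the following is only a minimal sketch of the general idea: score each candidate set of controls on strength, usability, performance, and cost, then pick the best-scoring set. The control names, their scores, the weights, and the way the four terms are combined are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import prod

# Hypothetical controls: name -> (strength, usability, performance, cost), each in [0, 1].
CONTROLS = {
    "firewall": (0.6, 0.9, 0.9, 0.2),
    "mfa": (0.8, 0.6, 0.9, 0.3),
    "disk_encryption": (0.7, 0.8, 0.7, 0.4),
}

# Hypothetical weights for strength, usability, performance, and cost.
WEIGHTS = (0.5, 0.2, 0.2, 0.1)


def figure_of_merit(names, controls=CONTROLS, weights=WEIGHTS):
    """Assumed composite score for a set of controls (not the paper's formula)."""
    if not names:
        return 0.0
    rows = [controls[n] for n in names]
    strength = 1 - prod(1 - r[0] for r in rows)  # assumption: mechanisms layer
    usability = prod(r[1] for r in rows)         # assumption: friction compounds
    performance = prod(r[2] for r in rows)       # assumption: overhead compounds
    cost = sum(r[3] for r in rows)               # assumption: costs add up
    w_s, w_u, w_p, w_c = weights
    return w_s * strength + w_u * usability + w_p * performance - w_c * cost


def best_controls(controls=CONTROLS, weights=WEIGHTS):
    """Exhaustively score every subset and return (best_subset, best_score)."""
    names = sorted(controls)
    subsets = (
        combo
        for size in range(len(names) + 1)
        for combo in combinations(names, size)
    )
    best = max(subsets, key=lambda s: figure_of_merit(s, controls, weights))
    return best, figure_of_merit(best, controls, weights)
```

Exhaustive search is only practical for a handful of controls; a realistic catalogue would call for one of the optimization methods the paper discusses.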
Distributed intrusion detection system based on grid security model
Grid computing has developed rapidly alongside network technology, and it can solve large-scale, complex computing problems by sharing large-scale computing resources. In a grid environment, we can realize a distributed, load-balanced intrusion detection system. This paper first discusses the security mechanism in grid computing and the function of PKI/CA in the grid security system, then describes how the characteristics of grid computing apply to a distributed intrusion detection system (IDS) based on an Artificial Immune System. Finally, it presents a distributed intrusion detection system based on the grid security system that can reduce the processing delay and assure the detection rate.
Data Mining
Is mining of knowledge possible?
Jim Brander, Alex Lupu
A method for extracting the complete knowledge structure from technical free text is shown, focusing on particular aspects of the process. Extensions to a basic knowledge formalism necessary to allow building of the complete activatable structure from information-rich text are described. The relevance of the extensions to aspects of information mining is covered, including the resources necessary for mining of knowledge structure in minute detail. The paper gives some examples of the cognitive activity required to automatically read and understand text.
Application of data mining to medical risk management
Shusaku Tsumoto, Kimiko Matsuoka, Shigeki Yokoyama
This paper proposes an application of data mining to medical risk management, in which data mining techniques are applied to the detection, analysis and evaluation of risks potentially existing in clinical environments. We applied this technique to two medical domains: risk aversion of nurse incidents and infection control. The results show that data mining methods were effective for the detection and aversion of risk factors.
A data mining approach to intelligence operations
Nasrullah Memon, David L. Hicks, Nicholas Harkiolakis
In this paper we examine the latest thinking, approaches and methodologies in use for finding the nuggets of information and the subliminal (and perhaps intentionally hidden) patterns and associations that private and government security agencies need in order to identify criminal activity and suspects. The paper places an emphasis on Social Network Analysis and Investigative Data Mining, and on the use of these technologies in the counterterrorism domain. Tools and techniques from both areas are described, along with the important tasks in which they can assist the investigation and analysis of terrorist organizations. The process of collecting data about these organizations is also considered, along with the inherent difficulties involved.
The epidemic threshold theorem with social and contact heterogeneity
Doracelly Hincapié Palacio, Juan Ospina Giraldo, Rubén Darío Gómez Arias
The threshold theorem of an epidemic SIR model was compared for two cases: when infectious and susceptible individuals have homogeneous mixing and heterogeneous social status, and when individuals of random networks have contact heterogeneity. In particular, the effect of vaccination in such models is considered when individuals or nodes are exposed to impoverishment, vaccination and loss of immunity. An equilibrium analysis and an analysis of the local stability of small perturbations about the equilibrium values were implemented using computer algebra. Numerical simulations were executed in order to describe the dynamics of disease transmission and changes in the basic reproductive rate. The implications of these results are examined with regard to threats to global public health security.
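For orientation, the classical threshold result can be sketched in a few lines. The sketch below assumes the textbook SIR model with homogeneous mixing and a perfectly effective vaccine, and it does not reproduce the heterogeneous or network models studied in the paper. The parameter names `beta` (transmission rate) and `gamma` (recovery rate), the function names, and the numeric values are illustrative assumptions.

```python
def basic_reproduction_number(beta, gamma):
    """R0 for the classical SIR model with homogeneous mixing."""
    return beta / gamma


def critical_vaccination_coverage(r0):
    """Fraction to vaccinate so that R0 * (1 - p) drops to 1 (perfect vaccine assumed)."""
    return max(0.0, 1.0 - 1.0 / r0)


def peak_infected(beta, gamma, vaccinated, i0=0.001, dt=0.1, steps=5000):
    """Forward-Euler integration of SIR fractions; returns the largest infected fraction."""
    s = (1.0 - vaccinated) * (1.0 - i0)  # vaccinated individuals start immune
    i = i0
    peak = i
    for _ in range(steps):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        peak = max(peak, i)
    return peak


r0 = basic_reproduction_number(beta=0.5, gamma=0.25)
print("R0:", r0)
print("critical coverage:", critical_vaccination_coverage(r0))
print("peak, no vaccination:", peak_infected(0.5, 0.25, vaccinated=0.0))
print("peak, 60% vaccinated:", peak_infected(0.5, 0.25, vaccinated=0.6))
```

With coverage above the critical value the infected fraction never rises above its initial value, which is the threshold behaviour in its simplest form.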
Observational study of content of Hg in fog water relative to air pollution in suburbs of Nanjing
Li-li Tang, Sheng-jie Niu, Shuxian Fan, et al.
Using in situ fog water samples gathered in the suburbs of Nanjing, China, in December 2006, this study analyzes the relation between Hg content and air pollution. It is found that foggy weather is unfavorable for the diffusion of pollutants, resulting in an increase in the concentrations of PM10, CO and total hydrocarbon, followed by a drop, and that the density of pollutants changes roughly in phase with fog genesis and lysis; after fog dispersal the concentrations of SO2, PM10 and NOx are 2.5 to 10 times as high as before the fog. Hg concentration ranges from 2.965 to 7.205 μg/L, averaging 5.471 μg/L, with the high values appearing during fog maintenance. Correlation analysis of Hg against the pollutants yields a coefficient of 0.939 between Hg and CO, which points to their common origin.
Intrusion Detection
Intrusion signature creation via clustering anomalies
Current practices for combating cyber attacks typically use Intrusion Detection Systems (IDSs) to detect and block multistage attacks. Because of the speed and impacts of new types of cyber attacks, current IDSs are limited in providing accurate detection while reliably adapting to new attacks. In signature-based IDS systems, this limitation is made apparent by the latency from day zero of an attack to the creation of an appropriate signature. This work hypothesizes that this latency can be shortened by creating signatures via anomaly-based algorithms. A hybrid supervised and unsupervised clustering algorithm is proposed for new signature creation. These new signatures created in real-time would take effect immediately, ideally detecting new attacks. This work first investigates a modified density-based clustering algorithm as an IDS, with its strengths and weaknesses identified. A signature creation algorithm leveraging the summarizing abilities of clustering is investigated. Lessons learned from the supervised signature creation are then leveraged for the development of unsupervised real-time signature classification. Automating signature creation and classification via clustering is demonstrated as satisfactory but with limitations.
Securing MANETs with BITSI: danger theory and mission continuity
Marco Carvalho, Richard Ford, William Allen, et al.
MANET (Mobile Ad hoc Network) environments are becoming increasingly important as potential users recognize the benefits of being able to create a functional network using little or no fixed infrastructure. Unfortunately, the very properties that provide such flexibility also cause significant complications in terms of security. The collaborative nature of the system combined with its continual state of flux requires solutions that are highly dynamic, and that can adapt to massive changes in system resources, traffic patterns and network topology. In this paper, we outline a new approach to MANET security called BITSI (the Biologically-Inspired Tactical Security Infrastructure). BITSI is based upon the concepts of Artificial Immune Systems and Danger Theory. After introducing the motivations for BITSI we provide a brief description of its underlying theories and proposed architecture. Two experiments conducted within our MANET simulator are described, and we demonstrate that BITSI can detect and respond to certain classes of Denial of Service attacks. Finally, we describe our future plans for BITSI, and how its approach can be combined with other, more traditional, security solutions.
Virtual terrain: a security-based representation of a computer network
Much research has been devoted in recent years to detecting, correlating, and predicting cyber attacks. As this body of research progresses, there is an increasing need for contextual information about a computer network to provide an accurate situational assessment. Typical approaches adopt contextual information as needed; yet such ad hoc effort may lead to unnecessary or even conflicting features. The concept of virtual terrain is, therefore, developed and investigated in this work. Virtual terrain is a common representation of crucial information about network vulnerabilities, accessibilities, and criticalities. A virtual terrain model encompasses operating systems, firewall rules, running services, missions, user accounts, and network connectivity. It is defined as connected graphs with arc attributes defining dynamic relationships among vertices modeling network entities, such as services, users, and machines. The virtual terrain representation is designed to allow feasible development and maintenance of the model, as well as efficacy in terms of the use of the model. This paper will describe the considerations in developing the virtual terrain schema, exemplary virtual terrain models, and algorithms utilizing the virtual terrain model for situation and threat assessment.
VTAC: virtual terrain assisted impact assessment for cyber attacks
An overwhelming volume of intrusion alerts has made timely response to network security breaches a difficult task. Correlating alerts to produce a higher-level view of the intrusion state of a network thus becomes an essential element in network defense. This work proposes to analyze correlated or grouped alerts and determine their 'impact' on the services and users of the network. A network is modeled as 'virtual terrain' where cyber attacks maneuver. Overlaying correlated attack tracks on virtual terrain exhibits the vulnerabilities exploited by each track and the relationships between them and different network entities. The proposed impact assessment algorithm utilizes the graph-based virtual terrain model and combines assessments of damages caused by the attacks. The combined impact scores make it possible to identify severely damaged network services and affected users. Several scenarios are examined to demonstrate the uses of the proposed Virtual Terrain Assisted Impact Assessment for Cyber Attacks (VTAC).
Usefulness of DARPA dataset for intrusion detection system evaluation
Ciza Thomas, Vishwas Sharma, N. Balakrishnan
The MIT Lincoln Laboratory IDS evaluation methodology is a practical solution for evaluating the performance of Intrusion Detection Systems, and it has contributed tremendously to research progress in that field. The DARPA IDS evaluation dataset has been criticized and considered by many to be a very outdated dataset, unable to accommodate the latest trends in attacks. The question then naturally arises as to whether detection systems have improved beyond detecting this older level of attacks. If not, is it worth thinking of this dataset as obsolete? This paper tries to provide supporting facts for the use of the DARPA IDS evaluation dataset. Two commonly used signature-based IDSs, Snort and Cisco IDS, and two anomaly detectors, PHAD and ALAD, are used for this evaluation, and the results support the usefulness of the DARPA dataset for IDS evaluation.
Miscellaneous Topics
Performance comparison of the automatic data reduction system (ADRS)
Dan Patterson, David Turner, Arturo Concepcion, et al.
In this paper, real data sets from the UCI Repository are mined and quantized to reduce the dimensionality of the feature space for best classification performance. The approach utilized to mine the data is based on the Bayesian Data Reduction Algorithm (BDRA), which has recently been developed into a Windows-based system by California State University (see http://wiki.csci.csusb.edu/bdra/Main_Page) called the Automatic Data Reduction System (ADRS). The primary contribution of this work is to demonstrate and compare different approaches to the feature search (e.g., forward versus backward searching), and to show how performance is affected for each data set. Additionally, the performance of the ADRS with the UCI data is compared to that of an Artificial Neural Network (ANN). In this case, results are shown for the ANN both with and without the utilization of Principal Components Analysis (PCA) to reduce the dimension of the feature data. Overall, it is shown that the BDRA's performance with the UCI data is superior to that of the ANN.
Two-beam coupling correlation synthetic aperture radar image recognition with power-law scattering centers pre-enhancement
Synthetic radar image recognition is an area of interest for military applications including automatic target recognition, air traffic control, and remote sensing. Here a dynamic range compression two-beam coupling joint transform correlator for detecting synthetic aperture radar (SAR) targets is utilized. The joint input image consists of a pre-power-law, enhanced scattering center of the input image and a linearly synthesized power-law enhanced scattering center template. Enhancing the scattering center of both the synthetic template and the input image furnishes the conditions for achieving dynamic range compression correlation in two-beam coupling. Dynamic range compression: (a) enhances the signal-to-noise ratio, (b) enhances the high frequencies relative to low frequencies, and (c) converts the noise to high-frequency components. This significantly improves the ratio of the correlation peak intensity to the mean of the surrounding noise. Dynamic range compression correlation has already been demonstrated to outperform many optimal correlation filters in detecting signals in severe noise environments. The performance is evaluated via established metrics, such as peak-to-correlation energy (PCE), Horner efficiency and correlation peak intensity. The results showed significant improvement as the power increased.
Adaptive Markov feature estimation and categorization using the projection-slice theorem
Classification of features extracted by use of the projection-slice theorem and the representation of data streams through a generalized random field model is investigated. The approach taken here is to generate probability density functions from the data that can be utilized for the generation of transition and emission probabilities, allowing an adaptive progression for the Markov Random Field (MRF) model. Because different image variants of the same image are generally collinear, the images are orthogonalized using the eigenvectors that correspond to the largest eigenvalues of the covariance matrix representing the image variations. This helps to reduce the dimensionality of the data and ensures maximally independent data in the feature selection process. The projection-slice synthetic discriminant functions (PSDF) are utilized to combine the features selected by use of the MRF, which removes a significant amount of data from the generation of the PSDF. The ensuing results are compared to those for the original data set combined in the PSDF, showing no significant loss in the peak-to-correlation energy performance metric even though a significant amount of data is removed.
Miscellaneous Applications
A new approach to chemical agent detection, classification, and estimation
Tao Qian, Genshe Chen, Erik Blasch, et al.
Chemical and biological agent detection has gained a great deal of interest in various applications. We present a new approach to vapor classification and concentration estimation in a spacecraft environment. The approach consists of two steps. First, a classifier based on a Support Vector Machine (SVM) is used to identify the presence of toxic vapors. Second, once the vapors are classified, a concentration estimation algorithm based on cubic spline fitting and a linear additive model for mixtures is used to estimate the concentration of vapor. Once trained, the estimation algorithm can accurately estimate vapor concentrations for both single and mixture vapors under different humidity conditions. Extensive performance evaluations were performed by using e-nose data collected at NASA KCS. We achieved more than 99% accuracy for single vapors and 98% for binary mixture vapors. The classification success rate was 87% using the linear discriminant method. Comparative studies were conducted between the SVM classifier and other classifiers such as Probability Neural Network (PNN) and Learning Vector Quantization (LVQ). In all cases, the SVM classifier showed superior performance over other classifiers. In the concentration estimation part, we achieved less than 3% error in single vapor cases and less than 10% error in mixture cases.
Using received signal strength variation for surveillance in residential areas
Sajid Hussain, Richard Peters, Daniel L. Silver
There are various uses of wireless sensor technology, ranging from medical, to environmental, to military. One possible usage is home security. A wireless sensor network could be used to detect the presence of an intruder. We have investigated the use of Received Signal Strength Indicator (RSSI) values to determine the mobility of an intruder and have found that accurate intruder detection is possible for at least short distances (up to 20 feet). The results of interference monitoring show that a wireless sensor network could be a feasible alternative for security and surveillance of homes.
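The abstract does not say how RSSI variation is turned into a detection decision. One plausible and purely illustrative rule is to flag motion when the spread of recent RSSI readings on a link exceeds a threshold. The function name, the 2 dB threshold, and the sample readings below are hypothetical.

```python
from statistics import pstdev


def motion_detected(rssi_window, threshold_db=2.0):
    """Flag motion when RSSI readings (in dBm) on one link vary more than the threshold.

    Assumption: a person crossing the link disturbs the signal, so the
    standard deviation over a short window rises above the quiet-room level.
    """
    return pstdev(rssi_window) > threshold_db


quiet_room = [-60, -60, -61, -60, -60]      # hypothetical readings, nobody moving
person_walking = [-60, -72, -55, -70, -58]  # hypothetical readings, link disturbed

print(motion_detected(quiet_room))
print(motion_detected(person_walking))
```

In practice the threshold would have to be calibrated per link, since the quiet-room variance depends on distance, hardware, and the environment.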
Power-law radon-transformed superimposed inverse filter synthetic discriminant correlator for facial recognition
A power-law correlation based on an inverse filter Fourier-Radon-transform synthetic discriminant function (SDF) for facial recognition is proposed. In order to avoid spectral overlap and nonlinear crosstalk, superposition of rotationally variant sets of inverse filter Fourier-transformed Radon-processed templates is used to generate the SDF. For the inverse filter, the Fourier transforms of M projections (Radon transform) from one training image are combined with the (N-1)M Fourier transforms of M projections taken from each of the other N-1 training images. This synthetic SDF filter has a very high discrimination capability; however, it is not noise robust. To overcome this problem, a power-law dynamic range compression is added to the correlation process. The proposed filter has three advantages: (1) high discrimination capability as an inverse filter, (2) noise robustness due to dynamic range compression, and (3) crosstalk-free nonlinear processing. The filter performance was evaluated by established metrics, such as peak-to-correlation energy (PCE), Horner efficiency, and correlation-peak intensity. The results showed significant improvement as the power-law filter compression increased.
Improvement in minority attack detection with skewness in network traffic
Ciza Thomas, N. Balakrishnan
The acceptability and usability of Intrusion Detection Systems are seriously affected by data skewness in network traffic. A large number of false alarms weighs heavily on the acceptability of Intrusion Detection Systems, and the reason for the increase in false alerts is that normal traffic abounds. Even with highly accurate Intrusion Detection Systems, the effective detection rate of the minority attack types will be unacceptably low, and those attack types are often the most serious ones. Thus high accuracy is not necessarily an indicator of high model quality, and therein lies the accuracy paradox of predictive analytics. The cost of missing an attack is higher than the cost of false alarms. The data-dependent sensor fusion architecture presented in this paper learns from the data and then appropriately weights the decisions of the various Intrusion Detection Systems. The fusion combines these weighted decisions into a single decision, which is better than those of the existing Intrusion Detection Systems. This method reduces the false positive rate and improves the overall detection rate, and in particular the detection rate of minority class types.
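The accuracy paradox mentioned in the abstract is easy to reproduce with a toy example. The sketch below uses made-up traffic counts (990 normal records, 10 attacks) and a trivial detector that labels everything as normal; it illustrates the paradox only and is not the paper's fusion architecture.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def detection_rate(y_true, y_pred, positive="attack"):
    """Fraction of true attacks that the detector actually flagged."""
    predictions_on_attacks = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in predictions_on_attacks) / len(predictions_on_attacks)


# Hypothetical skewed traffic: attacks are 1% of the records.
y_true = ["normal"] * 990 + ["attack"] * 10
# A useless detector that never raises an alert.
y_pred = ["normal"] * 1000

print("accuracy:", accuracy(y_true, y_pred))
print("attack detection rate:", detection_rate(y_true, y_pred))
```

The detector scores 99% accuracy while catching none of the attacks, which is why per-class detection rates matter more than overall accuracy on skewed traffic.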
Poster Session
An improved clone selection immune algorithm
Peili Qiao, Tong Wang, Jie Su
Antibodies perform the detection function in the immune system. Simulating the generation, evolution, and working process of antibodies in the immune system is the key to building an immune-based intrusion detection system (IDS). This paper proposes a clone selection immune algorithm based on T-cell immunity. In this algorithm we adopt novel genotype and phenotype representations integrated with a matching rule, which can flexibly express the 'or' relation between the rules used for classification. In addition, the algorithm makes detector generation more effective by introducing a negative selection operator.
An immunity-based model for dynamic distributed intrusion detection
Peili Qiao, Tong Wang, Jie Su
Traditional intrusion detection systems mostly adopt a centralized analysis engine, so their misreport rate is higher and they lack self-adaptability, which makes it difficult for them to meet the increasingly extensive security demands of the distributed network environment. An immunity-based model combining immune theory, data mining and data fusion techniques for dynamic distributed intrusion detection is proposed in this paper. The system presents a method for establishing and evolving the set of early genes, and defines the sets of Self, Nonself and Immunity cells. Moreover, a detailed description of the architecture and working mechanism of the model is given, and the characteristics of the model are analyzed.
Research on parallel algorithm for sequential pattern mining
Lijuan Zhou, Bai Qin, Yu Wang, et al.
Sequential pattern mining is the mining of frequent sequences, ordered by time or by other criteria, from a sequence database. Its initial motivation was to discover the laws of customer purchasing over a period of time by finding the frequent sequences. In recent years, sequential pattern mining has become an important direction of data mining, and its application field is no longer confined to business databases but has extended to new data sources such as the Web and to advanced science fields such as DNA analysis. The data used in sequential pattern mining have two characteristics: massive volume and distributed storage. Most existing sequential pattern mining algorithms have not considered these characteristics together. Based on these characteristics and on parallel computing theory, this paper puts forward a new distributed parallel algorithm, SPP (Sequential Pattern Parallel). The algorithm abides by the principle of pattern reduction and utilizes the divide-and-conquer strategy for parallelization. The first parallel task is to construct frequent item sets by applying the frequent concept and search space partition theory, and the second task is to build frequent sequences using the depth-first search method at each processor. The algorithm only needs to access the database twice and does not generate candidate sequences, which reduces the access time and improves the mining efficiency. Based on a random data generation procedure and the different information structures designed, this paper simulated the SPP algorithm in a concrete parallel environment and implemented the AprioriAll algorithm. The experiments demonstrate that, compared with AprioriAll, the SPP algorithm achieved an excellent speedup factor and efficiency.
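To make the notion of a "frequent sequence" concrete, the sketch below counts the support of a pattern, meaning the fraction of database sequences that contain it as an order-preserving (not necessarily contiguous) subsequence. It illustrates the definition only; it is neither SPP nor AprioriAll, and the tiny database and function names are made up for the example.

```python
def contains_subsequence(sequence, pattern):
    """True if the items of pattern appear in sequence in the same order."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the match,
    # so later items must be found after earlier ones.
    return all(item in remaining for item in pattern)


def support(database, pattern):
    """Fraction of sequences in the database that contain the pattern."""
    hits = sum(contains_subsequence(seq, pattern) for seq in database)
    return hits / len(database)


# Hypothetical purchase histories, one list of items per customer.
database = [
    ["a", "b", "c", "d"],
    ["a", "c", "d"],
    ["b", "d"],
    ["a", "b", "d"],
]

print(support(database, ["a", "d"]))
print(support(database, ["c", "b"]))
```

A pattern is called frequent when its support meets a user-chosen minimum; mining algorithms differ mainly in how they avoid testing every possible pattern this way.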