3:30pm Wednesday 24 April 2019, McGannon 144
, Naval War College
White Hat Hacking as National Security: The Emerging Global System
Underneath the emerging global system of top-down cyber command structures and government espionage agencies lies a community of security researchers: white hat hackers. White hats conduct work that is distinctly different from that of their grey and black hat kin. This talk introduces the audience to the global white hat phenomenon and what states are doing with their domestic hacking talent. Emphasis will be placed on the US as a model, and additional insights will be provided by looking at Russia, China, and India.
3:00pm Thursday 11 April 2019, Ritter 236
, Università Di Bologna
Software Defined Infrastructure Management: an Intent-based approach
Compared to traditional end-to-end service deployment, which requires time-consuming configuration and management tasks on vendor-specific appliances, the adoption of software-based solutions in networking infrastructures has opened new opportunities to enhance flexibility and reduce management burden. However, to successfully achieve those benefits, the definition of an intuitive, vendor-agnostic and interoperable Northbound Interface (NBI) is key. Such a standard definition is not yet available, and its implementation is left completely open. We have investigated the definition, prototype implementation and performance analysis of an intent-based NBI for service management and orchestration. This talk explores the adoption of the so-called intent-based approach applied to the interfaces between: i) the cloud orchestration system and the software-defined components, and ii) the network administrator and/or business manager and the management system. The approach proved to be feasible and shows scalability potential. However, results suggest that further investigation, both in terms of design strategy and performance evaluation, is required.
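As a rough illustration of the intent-based idea described above (not the speaker's actual NBI; the intent schema, field names, and rule format below are all hypothetical), a declarative intent stating *what* is wanted can be compiled into vendor-agnostic low-level rules:

```python
# Hypothetical sketch of an intent-based Northbound Interface: the caller
# declares the desired outcome, and the NBI compiles it into abstract,
# vendor-agnostic flow rules. No real controller API is assumed.

def compile_intent(intent):
    """Translate a declarative intent dict into a list of abstract flow rules."""
    rules = []
    if intent["type"] == "connectivity":
        src, dst = intent["endpoints"]
        # Bidirectional forwarding; a QoS hint becomes a rule priority.
        priority = 100 if intent.get("qos") == "low-latency" else 10
        rules.append({"match": {"src": src, "dst": dst},
                      "action": "forward", "priority": priority})
        rules.append({"match": {"src": dst, "dst": src},
                      "action": "forward", "priority": priority})
    return rules

intent = {"type": "connectivity",
          "endpoints": ["10.0.0.1", "10.0.0.2"],
          "qos": "low-latency"}
rules = compile_intent(intent)
```

The point of the abstraction is that the caller never touches device-specific configuration; a southbound driver would map each abstract rule onto the actual appliance.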
3:10pm Monday 8 April 2019, Ritter 115
, SLU Alumnus
Models as a Service
This talk will discuss how to containerize and expose Data Science models for integration into a micro-services environment.
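A minimal sketch of the "model as a service" pattern, using only the Python standard library (production setups would typically wrap a real model in a framework such as Flask or FastAPI inside a container; the scoring function and endpoint below are illustrative stand-ins):

```python
# Sketch: expose a "model" behind an HTTP POST endpoint so other
# microservices can call it over the network.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Stand-in model: a fixed linear scorer over three features."""
    weights = [0.5, -0.2, 1.0]
    return sum(w * x for w, x in zip(weights, features))

class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body like {"features": [1, 2, 3]} and return a score.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        features = json.loads(body)["features"]
        response = json.dumps({"score": predict(features)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)

# To serve: HTTPServer(("0.0.0.0", 8080), ModelHandler).serve_forever()
```

In a container deployment, the Dockerfile would simply install dependencies and run this script, and the orchestrator would route traffic to port 8080.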
3:00pm Thursday 4 April 2019, Ritter 236
, University of Missouri-Columbia
Reliable Service Chain Orchestration For Scalable Data-intensive Computing At Infrastructure Edges
In the event of natural or man-made disasters, geospatial video analytics is valuable to provide situational awareness that can be extremely helpful for first responders. However, geospatial video analytics demands massive imagery/video data ‘collection’ from Internet-of-Things (IoT) devices and their seamless ‘computation/consumption’ within a geo-distributed (edge/core) cloud infrastructure in order to cater to user Quality of Experience (QoE) expectations. Thus, edge computing needs to be designed for reliable performance while interfacing with the core cloud to run computer vision algorithms. This is because infrastructure edges near locations generating imagery/video content are rarely equipped with high-performance computation capabilities. This thesis addresses challenges of interfacing edge and core cloud computing within the geo-distributed infrastructure as a novel ‘function-centric computing’ paradigm that brings new insights to the computer vision, edge routing and network virtualization areas. Specifically, we detail the state-of-the-art techniques and illustrate our new/improved solution approaches based on function-centric computing for the two problems of: (i) high-throughput data collection from IoT devices at the wireless edge, and (ii) seamless data computation/consumption within the geo-distributed (edge/core) cloud infrastructure. To address (i), we present a novel deep learning-augmented geographic edge routing that relies on physical area knowledge obtained from satellite imagery. To address (ii), we describe a novel reliable service chain orchestration framework that builds upon microservices and utilizes a novel ‘metapath composite variable’ approach supported by a constrained-shortest path finder.
Finally, we show both analytically and empirically how our geographic routing, constrained shortest path finder, and reliable service chain orchestration approaches, which compose our function-centric computing framework, are superior to many traditional and state-of-the-art techniques. As a result, we can significantly speed up (by up to 4 times) data-intensive computing at infrastructure edges, fostering effective disaster relief coordination to save lives.
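The constrained-shortest path finder mentioned in the abstract can be illustrated generically. The sketch below (a label-setting search with an additive delay bound, not the author's metapath composite variable method; the graph is a toy example) finds the least-cost path whose total delay stays under a budget:

```python
# Least-cost path subject to an additive delay constraint, via a
# Dijkstra-like search over (cost, delay) labels with dominance pruning.
import heapq

def constrained_shortest_path(graph, src, dst, max_delay):
    """graph: {u: [(v, cost, delay), ...]}. Returns (cost, path) or None."""
    pq = [(0, 0, src, [src])]   # (cost, delay, node, path)
    best = {}                    # node -> recorded (cost, delay) labels
    while pq:
        cost, delay, u, path = heapq.heappop(pq)
        if u == dst:
            return cost, path
        # Skip labels dominated by an already-settled label at u.
        if any(c <= cost and d <= delay for c, d in best.get(u, [])):
            continue
        best.setdefault(u, []).append((cost, delay))
        for v, c, d in graph.get(u, []):
            if delay + d <= max_delay:   # prune paths over the delay budget
                heapq.heappush(pq, (cost + c, delay + d, v, path + [v]))
    return None

graph = {"A": [("B", 1, 5), ("C", 2, 1)],
         "B": [("D", 1, 5)],
         "C": [("D", 5, 1)]}
```

With a loose delay budget the cheap path A-B-D wins; tightening the budget forces the costlier but faster A-C-D; an infeasible budget returns no path.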
3:10pm Monday 25 March 2019, Ritter 115
, Washington University
Practical machine learning
Now more than ever, machine learning has been made approachable through numerous libraries and APIs. I'll show why, if you can code, you can learn enough machine learning to be dangerous. And I'll share lessons learned from implementing machine learning in a variety of fields, from patent law to cancer genomics.
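The "if you can code, you can do machine learning" point can be made with a complete classifier in a dozen lines of plain Python. This k-nearest-neighbors sketch (toy data; in practice you would reach for a library such as scikit-learn) is a working instance of the simplest approachable ML algorithm:

```python
# k-nearest-neighbors classification using only the standard library:
# predict the majority label among the k training points closest to a query.
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label). Returns the majority label
    among the k training points nearest to the query (Euclidean distance)."""
    nearest = sorted(train, key=lambda xy: math.dist(xy[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "blue"), ((0, 1), "blue"), ((1, 0), "blue"),
         ((5, 5), "red"), ((5, 6), "red"), ((6, 5), "red")]
```

Swapping this for `sklearn.neighbors.KNeighborsClassifier` adds speed and features, but the core idea fits in your head first.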
3:00pm Thursday 7 March 2019, Ritter 216
, SLU Visiting PhD Student
Implementation of service function chaining control plane through OpenFlow
This paper describes a proof-of-concept implementation of the Service Function Chaining Control Plane, exploiting the IETF Network Service Header approach. The proposed implementation combines the OpenFlow protocol to control and configure the network nodes and the NSH method to adapt the service requirements to the transport technology. The manuscript shows that the result of this combination is a very general architecture that may be used to implement any sort of Service Function Chain with great flexibility.
3:10pm Monday 4 March 2019, Ritter 115
, George Washington University
TraffickCam: Deep Learning and Image Search to Combat Human Trafficking
Victims of sex trafficking are often photographed in hotel rooms for online advertisements of sex services. Identifying the hotels in these photographs is a top priority for trafficking investigators and prosecutors: they show where a victim has been trafficked previously and where their trafficker may move them or others in the future. We propose recognizing the hotels in these photographs as an image search problem, where the most likely hotel is inferred from the most similar images of hotel rooms. This is, however, a challenging image search problem, due both to the properties of the victim photographs, which include unusual viewpoints and large occlusions, and the properties of hotel rooms, which may be visually dissimilar within the same hotel, but visually similar across different hotels, particularly those from the same chain. My research has focused on deep learning approaches to large-scale image search that are robust to such challenging properties, as well as visualization approaches that explain why deep learning models trained on such image similarity problems find particular images to be similar. TraffickCam, our mobile application for collecting images of hotel rooms from the traveling public, is currently used by over 150,000 individuals who upload photos specifically to help combat trafficking. The data from this app, combined with millions of publicly available images from travel websites, supports a first-in-the-world system for image search to identify hotels in trafficking imagery, which we have deployed at the National Center for Missing and Exploited Children.
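The retrieval step the abstract describes (infer the hotel from the most similar room images) reduces, once a deep network has mapped each image to an embedding vector, to nearest-neighbor search. A toy sketch of that final step, with made-up three-dimensional embeddings standing in for real network features:

```python
# Nearest-neighbor retrieval over image embeddings by cosine similarity.
# Real systems use high-dimensional deep features and approximate indexing;
# the vectors and hotel IDs below are purely illustrative.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def most_similar_hotel(query, gallery):
    """gallery: list of (embedding, hotel_id). Returns the hotel whose
    gallery image embedding is most similar to the query embedding."""
    return max(gallery, key=lambda eh: cosine(query, eh[0]))[1]

gallery = [((1.0, 0.0, 0.2), "hotel_A"),
           ((0.1, 1.0, 0.0), "hotel_B")]
```

The hard part, per the talk, is learning embeddings where same-hotel rooms land close together despite occlusions and cross-chain similarity; the search itself is conceptually this simple.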
3:00pm Thursday 28 February 2019, Ritter 216
, SLU Visiting Scholar
ZipNet-GAN: Inferring Fine-grained Mobile Traffic Patterns via a Generative Adversarial Neural Network
This talk will present a summary of a recent paper by Zhang, Ouyang, and Patras.
3:00pm Thursday 21 February 2019, Ritter 216
, SLU Visiting Scholar
An Introduction to Network Simulator 3
NS3 is one of the largest open-source computer network simulators. Dr. Pecorella has been "git pushing" contributions to the project for many years. He also leads an NS3 project within Google Summer of Code.
3:10pm Monday 18 February 2019, Ritter 115
, LSU Center for Computation & Technology
Understanding the Tree of Life: computational approaches to unraveling the relationships between all living things
Despite a deluge of genomic sequence data pouring into data repositories in recent years, reconstructing the phylogenetic relationships that unite all lineages (the tree of life) remains a grand challenge in biology. Our ability to collect genetic sequence data has exceeded our ability to analyze that data in an informative way using traditional tools. Yet, despite collecting large amounts of data, there is still a paucity of homologous character data across disparately related lineages, rendering direct phylogenetic inference untenable in many cases. Our recent work leverages computational methods such as graph theory, statistical modeling, and high-performance computing to find solutions to these challenges. By using graph-based approaches to synthesize published phylogenies together with taxonomic classifications we have been able to generate the first "knowledge complete" draft tree of life with over 2.3 million species. Interestingly, this tree highlighted many deficiencies in our current understanding and has led to additional questions about the quality of data and methods used to generate these trees. Follow-up analyses suggest that the information content in GenBank (containing most of the published sequence data in the world) is quite low, suggesting many researchers are repeating studies on the same questions with different types of data. Adding to the complexity of this challenge, many of these studies reach different, but equally strong conclusions as different types of data are added. Our explorations into the underlying cause of these challenges hint that these inconsistencies often seem to be driven by poor model adequacy and strongly influential outliers in the data sets. These findings show that one of the biggest hurdles to achieving one of the grandest goals in biology, quantifying all of life, is driven more by a lack of appropriate models and analysis methods than by a lack of data.
3:30pm Thursday 14 February 2019, Ritter 115
, DiscernDx Inc.
Bioinformatics for Mass Spectrometry-based Metabolomics/Lipidomics and a Novel Bayesian Approach for Systems-level Inference
The quantitation of metabolites (the intermediates of metabolic processes that enable life) from biological samples provides deep characterization of cell, tissue, and organism phenotypes. While untargeted mass spectrometry is a robust analytical platform for detecting and providing relative quantification of metabolites and lipids (a subclass of metabolites), the data generated from mass spectrometers is complex and multi-dimensional. In this talk, we will first discuss bioinformatic approaches for processing raw Liquid Chromatography–Mass Spectrometry (LC-MS) data to yield useful datasets for conducting molecular biology research. We will then discuss a novel Bayesian methodology that we have developed for making systems-level inference using processed LC-MS data. This methodology utilizes informative priors that are generated via the analysis of molecular structure to enable the estimation of metabolite “interactomes” (or probabilistic models) which are organism, sample media, and condition specific as well as comprehensive. The generated interactomes can serve as reference models for studying perturbations in metabolic processes. We will briefly discuss the software we developed for implementing the methodology and computational optimization of the underlying linear algebra routines. In addition to evaluating the performance of the developed methodology via simulation, we will discuss an application of the methodology to developing a plasma metabolite interactome for stable heart disease. The metabolite and lipid data was generated from plasma samples from human subjects who participated in a study of stable heart disease and myocardial infarction (heart attack) at two hospitals in Louisville, Kentucky.
3:10pm Monday 11 February 2019, Ritter 115
Writing Cleaner, More Sustainable Code with Test-Driven Development
3:10pm Wednesday 6 February 2019, Ritter 115
, U. Missouri-Columbia
Improving Prediction of Protein Three-dimensional Structure using Deep Learning Techniques
Protein structure prediction is one of the most important scientific problems in the bioinformatics and computational biology fields. The availability of protein three-dimensional (3D) structure is crucial for studying biochemical, biological and cellular functions of proteins. Deep learning techniques have emerged as one of the most effective machine learning methods in recent years and have brought revolutionary advances in computer vision, speech recognition and bioinformatics. In this talk, I will first introduce the fundamental ideas and algorithms behind deep learning and explain how deep learning methods can be applied to protein structure prediction. Then I will present my latest research applying deep learning techniques to tackle three major sub-problems in protein structure prediction: protein secondary structure prediction, protein fold recognition and protein quality assessment. Finally, I will describe how to integrate all these methods to improve the prediction of protein three-dimensional structures. The methods were officially ranked among the top three out of 98 predictors in the category of structure prediction and estimating the accuracy of protein structural models in the 13th world-wide Critical Assessment of Techniques for Protein Structure Prediction (CASP13) competition, demonstrating the importance and significance of deep learning techniques in protein structure prediction. The rigorous evaluation of these methods during CASP13 (2018) will also be discussed in this talk.
3:10pm Monday 4 February 2019, Ritter 115
, Washington University
Management and Security of Multi-cloud Applications
Single cloud platforms like Amazon's EC2 and Microsoft Azure are common and popular today. Obtaining resources from multiple cloud systems gives clients competitive pricing, flexibility of resource provisioning, better points of presence and reduced risk of a total blackout. When these clients happen to be carriers contemplating hosting their offerings over multiple clouds, there still are many research challenges that inhibit large-scale deployments. This talk revolves around some of the key issues that were on the 'to do' list at the beginning of the network virtualization journey and still need considerable attention from the research community before seeing any kind of resolution in the near future. In this talk I will discuss some methods that can successfully handle optimized placement of virtual networking resources, improve availability of virtual network services and secure flow of data in the context of IoT and multi-cloud based health networks.
3:10pm Monday 28 January 2019, Ritter 115
, Saint Louis University
Toward Effective Visualization of Network Identifier Bindings in a Software-Defined Network
Software-defined networking allows network programmers to enforce dynamic, role-based access control policies on high-level information such as usernames at the network level. In order to enforce a policy based on a username, the system also needs knowledge of a network device’s hostname, IP address, and MAC address. We refer to the relationships between two network identifiers as bindings. This presentation will discuss employing different visualization techniques to help an analyst efficiently understand the dynamic states of those bindings in an enterprise network. No background in networks or visualization is assumed.
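The "binding" notion described above (relationships between pairs of network identifiers such as username, hostname, IP address, and MAC address) can be sketched as a small data structure. Everything here is illustrative, not the presenter's implementation:

```python
# A toy table of identifier bindings: each observed pair of identifiers
# (e.g. username <-> IP, IP <-> MAC) is recorded with the time it was seen,
# so an analyst (or a visualization) can query how bindings evolve.
import time

class BindingTable:
    def __init__(self):
        self.bindings = {}  # (id_a, id_b) -> timestamp of last observation

    def observe(self, id_a, id_b, when=None):
        """Record that id_a and id_b were seen bound together."""
        self.bindings[(id_a, id_b)] = when if when is not None else time.time()

    def bound_to(self, identifier):
        """All identifiers currently bound (in either direction) to this one."""
        return ({b for a, b in self.bindings if a == identifier}
                | {a for a, b in self.bindings if b == identifier})

table = BindingTable()
table.observe("alice", "10.0.0.7")                 # username <-> IP
table.observe("10.0.0.7", "aa:bb:cc:dd:ee:ff")     # IP <-> MAC
```

A visualization of this state is essentially a graph whose nodes are identifiers and whose edges are bindings, changing over time as DHCP leases and logins come and go.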
3:10pm Monday 26 November 2018, Ritter 115
, Saint Louis University
The Shape of Data
We will explore the fundamentals of TDA (topological data analysis) which utilizes techniques from topology to find structural information about data. TDA lies in the intersection of computer science, mathematics, statistics and data science and has been applied to solve problems in a broad range of domains. In particular, we will examine techniques to determine the underlying shape and its properties from finite data sets. No background in topology is assumed.
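One TDA primitive can be shown concretely with no topology background: 0-dimensional persistent homology, which tracks connected components of a point cloud as a distance scale grows. The sketch below (a standard union-find construction over the Vietoris-Rips filtration, stdlib only; real analyses use libraries and higher dimensions) recovers the number of clusters from how long components "persist":

```python
# 0-dimensional persistence of a point cloud via union-find: every point is
# born at scale 0; as the scale grows, components merge (the younger "dies").
# Long-lived components correspond to genuine clusters in the data.
import math
from itertools import combinations

def persistence_h0(points):
    """Return (birth, death) pairs for connected components; one component
    lives forever (death = infinity)."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            deaths.append(d)  # a component dies at this merge scale
    return [(0.0, d) for d in deaths] + [(0.0, math.inf)]

# Two well-separated clusters: four short-lived pairs, one long-lived pair,
# and one immortal component.
cloud = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
```

Reading the output as a persistence diagram, the large gap between the short-lived deaths (~0.1) and the long-lived one (~7) is the topological signal that the cloud has two clusters.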
3:10pm Monday 19 November 2018, Ritter 115
, Saint Louis University
Computer Science for Bioinformatics: Use case of Apache Spark and Machine Learning in Bioinformatics
This will be a gentle introduction to bioinformatics research topics from the angle of a computer scientist. After a brief introduction to bioinformatics, I will present how Apache Spark and machine learning are used together in my large-scale data science projects to create synergy.
3:10pm Monday 5 November 2018, Ritter 115
, Saint Louis University
Computing Optimal Homotopies
The question of how to measure similarity between curves in various settings has received much attention recently, motivated by applications in GIS data analysis, medical imaging, and computer graphics. While geometric measures such as the Hausdorff and Fréchet distance have efficient algorithms, measures that take into account the underlying topology of the space are much newer. In this talk, we will consider using homotopy, or continuous deformation of one curve to another, in order to quantify how similar two input curves are. In particular, we will survey recent work on computing optimal homotopies in a variety of settings, discussing both algorithms and the complexity of the problem in general.
3:10pm Monday 29 October 2018, Ritter 115
, Princeton University
For several years researchers have used the term "network orchestration" as a metaphor. In this project, we make the metaphor reality; we describe a novel approach to network orchestration that leverages sounds to augment or replace various network management operations. We test our Music-Defined Networking approach with both a real and a virtual network testbed, on several mechanisms and applications: from datacenter server fan failure detection to authentication, from load balancing to explicit congestion notification and detection of heavy hitter flows. Our approach can be used with and without a Software-Defined Network controller. Despite its limitations, we believe that sound-based network management has potential to be further explored as an effective and inexpensive out-of-band orchestration technique.
3:10pm Monday 8 October 2018, Ritter 115
, Saint Louis University
Securing the Future Internet: DDoS Prevention, Anonymity, and Access Control in NDN
3:10pm Monday 1 October 2018, Ritter 115
, Politecnico di Bari
Learning From the Past to Build a Better Internet Architecture: Is Information Centric Networking the Answer?
The current Internet architecture was born in 1971 as an academic network of fixed and trustworthy hosts, to allow communication within the scientific community. Its usage has changed radically in the last few years: it is now a global infrastructure for massive distribution of information generated by billions of (mobile) users. To cope with constant Internet evolution and to reduce the complexity introduced by cumbersome patches and middleware layers, the scientific community has invested millions in redesigning the Internet architecture. The Information Centric Networking (ICN) paradigm has emerged as one of the most promising approaches. In this talk, we will first investigate some of the current Internet's flaws and discuss how they originated. Then we will dissect a few characteristics of some of the newly proposed Internet architectures, discussing their potential and limitations.
3:10pm Monday 10 September 2018, Ritter 115
, Saint Louis University
Recent Advances in Neural Language Models
This will be a gentle introduction to the foundational problem of probabilistic language modeling. I'll quickly review some of the traditional (non-neural) approaches and then discuss recent breakthroughs that use deep neural networks to obtain state-of-the-art results for English. I'll close with some of the challenges these developments present for languages with more complicated morphology.
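The simplest of the traditional (non-neural) approaches the talk reviews is an n-gram model. A complete bigram model with add-one (Laplace) smoothing, on a toy corpus (the breakthroughs in question replace exactly this counting step with a neural network):

```python
# Bigram language model with add-one smoothing: estimate P(word | prev)
# from counts, reserving probability mass for unseen pairs.
from collections import Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()
bigrams = Counter(zip(corpus, corpus[1:]))   # counts of adjacent word pairs
unigrams = Counter(corpus)                   # counts of single words
vocab = len(set(corpus))

def p_bigram(prev, word):
    """Smoothed conditional probability P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)
```

Even on this toy corpus the model prefers attested continuations ("the cat" over "the sat"), and the smoothed distribution over the vocabulary still sums to one, which is the defining property a neural language model must also satisfy.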