Computer Science

Permanent URI for this collection

https://hdl.handle.net/10315/38508

Open Access
Enriching Word Representation Learning for Affect Detection and Affect-Aware Recommendations
(2021-03-08) Dehaki, Nastaran Babanejad; An, Aijun; Papangelis, Emmanouil
Affect detection in text aims to identify affective states such as mood, sentiment and emotions from textual data. The main affective tasks, including sentiment analysis, emotion classification and sarcasm detection, have been popular in recent years due to a broad range of relevant applications in various domains. Traditionally, recommendations deal with applications having only two types of entities, users and items, and do not put them into a context when providing recommendations. More recently, contextual recommendation methods provide more accurate recommendations by considering more contextual information. However, little attention has been paid to affective context and its relation to recommendations. In this dissertation, we first investigate the impact of using affective information on the quality of recommendations, and then seek to improve affect detection in text by enhancing word representation learning. We enrich word representations in two ways: first, by effective pre-processing when training word embeddings, and second, by incorporating both affective and contextual features deeply into text representations. We demonstrate the benefits of enriched word representations in both affect detection and affect-aware recommendation tasks. This dissertation consists of five contributions. First, we investigate whether, and to what extent, emotion features can improve recommendations. Towards that end, we derive a number of emotion features that can be attributed to both items and users in the domains of news and music. Then, we devise state-of-the-art emotion-aware recommendation models by systematically leveraging these features. Second, we study the problem of pre-processing in word representation learning for affective tasks. Most early models of affective tasks employed pre-trained word embeddings. While pre-processing in affective systems is well-studied, text pre-processing for training word embeddings is not.
To address this limitation, we conduct a comprehensive analysis of the role of text pre-processing techniques in word representation learning for affective analysis by applying each pre-processing technique first at the embedding training phase, which is commonly ignored in pre-trained word vector models, and then at the downstream task phase. Third, we investigate the usefulness of customized pre-processing for word representation learning for affective tasks. We argue that using numerous text pre-processing techniques at once as a general combination for all affective tasks decreases the performance of affect detection. Therefore, we conduct extensive experiments showing that an appropriate combination of text pre-processing methods for each affective task can significantly enhance the classifier's performance. The fourth contribution is to study the role of affective and contextual embeddings with deep neural network models for affect detection. Early word embedding methods, such as Word2vec, are non-contextual, meaning that a word has the same embedding vector independent of its context and sense. Contextual embedding techniques, such as BERT, solve this problem but do not incorporate affect information in their word representations. We propose two novel deep neural network models that extend BERT to incorporate both affective and contextual features in text representations. Lastly, we show the usefulness of our proposed affective and contextual embedding models by applying them to affect-aware recommendations.
Open Access
Utilizing the Transformer Architecture for Question Answering
(2021-03-08) Laskar, Md Tahmid Rahman; Hoque Prince, Enamul; Huang, Xiangji "Jimmy"
The Question Answering (QA) task aims at building systems that can automatically answer a question or query about the given document(s). In this thesis, we utilize the transformer, a state-of-the-art neural architecture, to study two QA problems: answer sentence selection and answer summary generation. For answer sentence selection, we present two new approaches that rank a list of candidate answers for a given question by utilizing different contextualized embeddings with the encoder of the transformer. For answer summary generation, we study the query-focused abstractive text summarization task, which generates a natural-language summary of the source document(s) for a given query. For this task, we utilize the transformer to address the lack of large training datasets in single-document scenarios and the absence of labeled training datasets in multi-document scenarios. Based on extensive experiments, we observe that our proposed approaches obtain impressive results across several benchmark QA datasets.
Open Access
Exploring the Scalability, Throughput and Security Characteristics of the Tangle Distributed Ledger Technology through Simulation Analysis
(2021-03-08) Madenouei, Nahid Alimohammadi; Liaskos, Sotirios
During the past few years, blockchain technologies have attracted substantial attention from researchers, engineers, and enterprises. These technologies provide decentralized platforms to validate different types of transactions without relying on a central authority. Since the advent of the popular Bitcoin network [2], [3], various types of protocols, among them DAG-based distributed ledgers, have been offered to address the Bitcoin network's shortcomings in the areas of scalability, throughput, and security. Researchers have also been motivated to study blockchain-based applications across multiple domains with novel designs and capabilities [4]. However, when used in such capacities, failures of blockchain networks (BNs) imply catastrophes that extend beyond individuals, organizations, and countries [5]. In light of such mission criticality, the vision of turning BNs into decentralized transaction processing systems that can sustainably and securely compete with the centralized state of the art poses substantial methodological challenges for researchers [5]. Before being considered for wide adoption, BN protocols and technologies must be assessed with different analytical and empirical methodologies. In this research, a novel DAG-based distributed ledger, the Tangle, has been studied and several of its features have been investigated through simulation analysis. The contributions of this research are two-fold. First, a Tangle simulator has been designed and implemented based on model-oriented design and development principles. Second, three important characteristics of Tangle networks, namely scalability, throughput, and security against double-spending attacks, have been investigated.
Open Access
Single-View 3D Object Shape Estimation
(2021-03-08) Qian, Yiming; Elder, James H.
An accurate single-view 3D reconstruction algorithm would allow researchers to develop better systems in robotics and VR/AR applications. However, single-view 3D reconstruction is an ill-posed problem and thus solvable only for scenes that satisfy strong regularity conditions. Features such as texture variations, haze, colour, shading, known object size and occlusion can provide information about 3D structures. Further, the built environment has repeated patterns and orthogonal straight lines that contain structured information with strong regularities. In this report, I will introduce the research I have done for my PhD on the single-view reconstruction problem in two parts. Part 1 will detail the three research projects I have completed on single-view 3D reconstruction of the built environment: 1) line segment detection [4], 2) single-view geometry-driven road segmentation [5], and 3) single-view 3D reconstruction of Manhattan buildings [158]. Part 2 will detail the research I have done on reconstructing the 3D shape of more general objects from their bounding contours.
Open Access
Thermoreflectance Measurements of Non-Diffusive Transport In Bulk And Nanoscale Materials
(2021-07-06) Shahzadeh, Mohammadreza; Pisana, Simone
Following the trend in miniaturization of devices to the sub-micron scale, thermal management has become increasingly important to device operation. Thermal management becomes even more concerning in nanoscale electronic devices, where there are several interfaces for heat to pass across before reaching a bulk-like heat sink. To mitigate these issues, existing devices are constantly being re-engineered and new materials are being sought to take advantage of their improved performance. Importantly, at small scales or during fast transients, heat transport deviates from the classic diffusive regime and its physics is still being understood. To examine the thermal performance of these structures and materials at different length scales and in different heating dynamics, suitable metrology techniques are needed. In this dissertation, heat transport within bulk and nanoscale materials as well as across interfaces is studied. To achieve this goal, a frequency domain thermoreflectance (FDTR) setup is established and three different extensions to this setup are presented that improve the metrology in nanoscale materials or non-diffusive transport. Having different variations of the FDTR setup enables us to selectively utilize these techniques depending on the sample structure and the questions one is investigating. By measuring different structures, we show that our setups are capable of examining thermophysical properties of different materials ranging from two dimensional to three dimensional materials, from dielectrics to metals, and from thermally isotropic to anisotropic. We, then, turn our attention to non-diffusive heat transport in two different structures: metallic Platinum-Cobalt multilayers, and Tungsten disulfide (WS2) crystal. 
In the case of metallic Platinum-Cobalt multilayers, we show that as the interface density increases and the layer thickness becomes comparable with the electron mean free path in these metals, a deviation from the diffusive theories governing heat transport in electron-mediated multilayers is observed for the first time. Finally, we show that strong non-diffusive heat transport in WS2 can be observed at room temperature as a function of heat spot size. This not only highlights unique transport features in this material, but also points to the susceptibility to misinterpret experimental data if non-diffusive transport is not considered.
Open Access
Ensuring Fairness Despite Differences in Environment
(2021-07-06) Singh, Karan Deep; Edmonds, Jeff; Urner, Ruth
Several fairness definitions have been proposed in the machine learning literature to rectify the issue of demographic groups being treated differently. Given the substantial research in the field, this work aims to provide an entry-level overview of the common definitions and metrics that are essential for a novice reader. In addition, we propose a theorem and examine the population distributions and conditions under which its claim holds, namely that the disadvantaged individual is expected to be more talented than the similarly performing advantaged individual. Finally, this work summarizes six research works and discusses whether the result of our theorem is consistent with the model settings of each, culminating in a discussion of how the authors view the world in terms of a group's talent distribution.
Open Access
Image Color Correction, Enhancement, and Editing
(2021-07-06) Afifi, Mahmoud Nasser Mohammed; Brown, Michael S.
This thesis presents methods and approaches to image color correction, color enhancement, and color editing. To begin, we study the color correction problem from the standpoint of the camera's image signal processor (ISP). A camera's ISP is hardware that applies a series of in-camera image processing and color manipulation steps, many of which are nonlinear in nature, to render the initial sensor image to its final photo-finished representation saved in the 8-bit standard RGB (sRGB) color space. As white balance (WB) is one of the major procedures applied by the ISP for color correction, this thesis presents two different methods for ISP white balancing. Afterwards, we discuss another scenario of correcting and editing image colors, where we present a set of methods to correct and edit WB settings for images that have been improperly white-balanced by the ISP. Then, we explore another factor that has a significant impact on the quality of camera-rendered colors, in which we outline two different methods to correct exposure errors in camera-rendered images. Lastly, we discuss post-capture auto color editing and manipulation. In particular, we propose auto image recoloring methods to generate different realistic versions of the same camera-rendered image with new colors. Through extensive evaluations, we demonstrate that our methods provide superior solutions compared to existing alternatives targeting color correction, color enhancement, and color editing.
Open Access
Multistage Multiscale Inference Network with Visibility Attention for Occluded Person Re-Identification
(2021-07-06) Kim, Yoon Tae; Wildes, Richard
For occluded person re-identification, this thesis presents the Multistage Multiscale Inference Network (MMI-Net), which leverages an inference framework based on multiscale representations with visibility guidance. MMI-Net consists of three sub-networks, i) global, ii) part-based and iii) integrated, to infer person re-identification. The global inference sub-network provides an overall holistic analysis of input images. The part-based sub-network captures more localized information. Both the global and part-based models make use of multiscale representation across multiple processing stages to capture a variety of complementary discriminative image structures. The integrated sub-network aggregates the global and part-based representations to obtain the final fusion of all extracted information. Pose-guided attentional processing is used to provide robustness to occlusion. MMI-Net is unique in its integrated multistage inference architecture that accounts for local and global appearance with attentional processing. In empirical evaluation, MMI-Net outperforms existing methods on multiple occluded person re-identification datasets.
Open Access
Novel Examination of Interpretable Surrogates and Adversarial Robustness in Machine Learning
(2021-07-06) Chowdhury, Sadia; Urner, Ruth
The lack of transparent output behavior is a significant source of mistrust in many of the currently most successful machine learning tools. Concern arises particularly in situations where the data generation changes, for example under marginal shift or under adversarial manipulations. We analyze the use of decision trees (a human-interpretable model) for indicating marginal shift. We then investigate the role of the data generation for the validity of the interpretable surrogate and its implementation as both local and global interpretation methods. We often observed that the decision boundaries of the black-box model were mostly sitting close to the original data manifold. This makes those regions vulnerable to imperceptible perturbations. Hence, we argue that adversarial robustness should be defined as a locally adaptive measure complying with the underlying distribution. We then suggest a definition for an adaptive robust loss, an empirical version of it, and a resulting data-augmentation framework.
Open Access
On Solovay's Theorem and a Proof of Arithmetical Completeness of a New Predicate Modal Logic
(2021-11-15) Hao, Yunge; Tourlakis, George; Edmonds, Jeffrey
This thesis investigates a first-order extension of GL called ML3. We briefly discuss the latter's properties and some of its toolbox: some metatheorems, the conservation theorem, and its semantic completeness (with respect to finite reverse well-founded Kripke models). Applying the Solovay technique to those models, the thesis establishes its main result, namely, that ML3 is arithmetically complete. As expanded below, ML3 is a first-order modal logic that, along with its built-in ability to simulate general classical first-order provability (the modal operator simulating the metamathematical classical provability relation), is also arithmetically complete in the Solovay sense. We also carefully reconstruct the proof of Solovay's Lemmata in our Appendixes, including a complete, mathematically rigorous construction of his graph-walking function h.
Open Access
Learning Sparse Graph Laplacian with K Eigenvector Prior via Iterative GLASSO and Projection
(2021-11-15) Bagheri, Saghar; Cheung, Gene
Learning a suitable graph is an important precursor to many graph signal processing (GSP) tasks, such as graph signal compression and denoising. Previous graph learning algorithms either make assumptions on graph connectivity (e.g., graph sparsity), or make individual edge weight assumptions such as positive edges only. In this thesis, given an empirical covariance matrix computed from data as input, an eigen-structural assumption on the graph Laplacian matrix is considered: the first K eigenvectors of the graph Laplacian are pre-selected, e.g., based on domain-specific criteria, and the remaining eigenvectors are then learned from data. One example use case is image coding, where the first eigenvector is pre-chosen to be constant, regardless of available observed data. Experimental results show that given the first K eigenvectors as a prior, the algorithm in this thesis outperforms competing graph learning schemes using a variety of graph comparison metrics.
Open Access
Comparative Studies of Gesture-Based and Sensor-Based Input Methods for Mobile User Interfaces
(2021-11-15) Garg, Saurabh; MacKenzie, I. Scott
Three user studies were conducted to compare gesture-based and sensor-based interaction methods. The first study compared the efficiency and speed of three scroll navigation methods for touch-screen mobile devices: Tap Scroll (touch-based), Kinetic Scroll (gesture-based), and Fingerprint Scroll (our newly introduced sensor-based method). The second study compared the accuracy and speed of three zoom methods. The first was GyroZoom, which uses the mobile phone's gyroscope sensor. The second was the gesture-based Pinch-to-Zoom method. The third, VolumeZoom, uses volume buttons that were reprogrammed to perform zoom operations. The third study, on text entry, compared a QWERTY-based onscreen keyboard with a novel 3D gesture-based Write-in-Air method, which utilizes webcam sensors. Our key findings from the three experiments are that sensor-based interaction methods are intuitive and provide a better user experience than gesture-based interaction methods. The sensor-based methods were on par with the gesture-based methods in speed and accuracy.
Open Access
Machine Learning Interference Modelling for Cloud-native Applications
(2022-03-03) Baluta, Alexandru; Litoiu, Marin
Modern cloud-native applications use microservice architecture patterns, where fine-grained software components are deployed in lightweight containers that run inside cloud virtual machines. To utilize resources more efficiently, containers belonging to different applications are often co-located on the same virtual machine. Co-location can result in software performance degradation due to interference among components competing for resources. In this thesis, we propose techniques to detect and model performance interference. To detect interference at runtime, we train Machine Learning (ML) models prior to deployment using interfering benchmarks and show that the models can be generalized to detect runtime interference from different types of applications. Experimental results in public clouds show that our approach outperforms existing interference detection techniques by 1.35%-66.69%. To quantify the interference impact, we further propose an ML interference quantification technique. The technique constructs ML models for response time prediction and can dynamically account for changing runtime conditions through the use of a sliding window method. Our technique outperforms baseline and competing techniques by 1.45%-92.04%. These contributions can be beneficial to software architects and software operators when designing, deploying, and operating cloud-native applications.
Open Access
Music-STAR: a Style Translation system for Audio-based Rearrangement
(2022-03-03) Alinoori, Mahshid; Tzerpos, Vassilios
Music style translation has recently gained attention among music processing studies. It aims to generate variations of existing music pieces by altering the style-variant characteristics of the original music piece, while content such as the melody remains unchanged. These alterations could involve timbre translation, reharmonization, or music rearrangement. In this thesis, we plan to address music rearrangement, focusing on instrumentation, by processing waveforms of two-instrument pieces. Previous studies have achieved promising results utilizing time-frequency and symbolic music representations. Music translation on raw audio has also been investigated using single-instrument pieces. Although processing raw audio is more challenging, it embodies more detailed information about the performance, timbre, and dynamics of a music piece. To this end, we introduce Music-STAR, the first audio-based model that can transform the instruments of a multi-track piece into another set of instruments, resulting in a rearranged piece.
Open Access
Fast Similarity Graph Construction via Data Sketching Techniques
(2022-03-03) Marefat, Hoorieh; An, Aijun; Papangelis, Manos
Graphs are mathematical structures used to model objects and their pairwise relationships. Due to their simple but expressive abstract representation, they are commonly used to model various types of relations and processes in technological, social or biological systems and have found numerous applications. A special type of graph is the similarity graph in which nodes represent entities and there is an edge connecting two nodes if the two entities are similar based on some similarity measure. In a typical scenario, raw data of entities are provided in the form of a relational dataset, matrix or a tensor and a similarity graph is built to facilitate graph-based analysis like node importance, node classification, link prediction, community detection, outlier detection, and more. The ability to construct similarity graphs fast is important and with a potential for high impact, thus several approximation techniques have been proposed. In this work, we propose data sketching based methods for fast approximate similarity graph construction. Data sketching techniques are applied on the raw data and are designed to achieve desired error guarantees. They can drastically reduce the size of raw data on which we operate, allowing for faster construction and analysis of similarity graphs, but with approximate results. This is a desirable tradeoff for many applications in diverse domains. Through a thorough experimental evaluation, we demonstrate that our sketching methods outperform sensible baselines and competitor methods proposed for the problem. First, they are much faster than exact methods while maintaining high accuracy in constructing the similarity graph. Furthermore, our methods demonstrate significantly higher accuracy than competitive methods on generic graph analysis tasks. We demonstrate the effectiveness of our methods on different real-world graph applications.
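The definition of a similarity graph given in this abstract can be illustrated with a minimal Python sketch. It shows only the exact, quadratic-time baseline that approximate methods aim to speed up, not the sketching methods proposed in the thesis. The choice of cosine similarity, the threshold value, and the function names `cosine_similarity` and `build_similarity_graph` are assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def build_similarity_graph(vectors, threshold):
    """Return the edge set {(i, j), i < j} of the exact similarity graph.

    Nodes are the indices of `vectors`; two nodes are connected when their
    cosine similarity is at least `threshold`. Every pair is compared, so
    the cost grows quadratically with the number of entities.
    """
    edges = set()
    for i, j in combinations(range(len(vectors)), 2):
        if cosine_similarity(vectors[i], vectors[j]) >= threshold:
            edges.add((i, j))
    return edges


# Hypothetical raw data: three entities described by two features each.
entities = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
graph = build_similarity_graph(entities, threshold=0.9)
```

With these toy vectors only the first two entities point in nearly the same direction, so the graph contains a single edge between nodes 0 and 1.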
Open Access
Design Justice Principles and Do-It-Yourself Assistive Technology: Case Study
(2022-03-03) Poustizadeh, Mana; Baljko, Melanie
In this project, we focus on the Principles of Design Justice, as developed by the Design Justice Network, a community committed to challenging structural inequalities of design. Our thesis research project is aligned with the premise of user-centered design and the situated knowledge in the third paradigm of HCI. We examine some of the current processes for Do-It-Yourself Assistive Technology (DIY-AT) development and deployment using the works of Makers Making Change (MMC). MMC connects the makers of DIY-AT devices to people who need AT devices. We also examine the impacts of the ongoing COVID-19 pandemic on the need for DIY-AT and the challenges it might have caused. Our findings include MMC's positive impact regarding DIY-AT service delivery, its engagement of local makers in making DIY-AT, and a modest job in integrating Design Justice Principles. The findings of our study also suggest an increase in the demand for AT due to the pandemic.
Open Access
Neural-based Knowledge Transfer in Natural Language Processing
(2022-03-03) Wang, Chao; Jiang, Hui
In Natural Language Processing (NLP), neural-based knowledge transfer, which is to transfer out-of-domain (OOD) knowledge to task-specific neural networks, has been applied to many NLP tasks. To further explore neural-based knowledge transfer in NLP, in this dissertation, we consider both structured OOD knowledge and unstructured OOD knowledge, and deal with several representative NLP tasks. For structured OOD knowledge, we study the neural-based knowledge transfer in Machine Reading Comprehension (MRC). In single-passage MRC tasks, to bridge the gap between MRC models and human beings, which is mainly reflected in the hunger for data and the robustness to noise, we integrate the neural networks of MRC models with the general knowledge of human beings embodied in knowledge bases. On the one hand, we propose a data enrichment method, which uses WordNet to extract inter-word semantic connections as general knowledge from each given passage-question pair. On the other hand, we propose a novel MRC model named Knowledge Aided Reader (KAR), which explicitly uses the above extracted general knowledge to assist its attention mechanisms. According to the experimental results, KAR is comparable in performance with the state-of-the-art MRC models, and significantly more robust to noise than them. On top of that, when only a subset (20%-80%) of the training examples are available, KAR outperforms the state-of-the-art MRC models by a large margin, and is still reasonably robust to noise. In multi-hop MRC tasks, to probe the strength of Graph Neural Networks (GNNs), we propose a novel multi-hop MRC model named Graph Aided Reader (GAR), which uses GNN methods to perform multi-hop reasoning, but is free of any pre-trained language model and completely end-to-end. For graph construction, GAR utilizes the topic-referencing relations between passages and the entity-sharing relations between sentences, which is aimed at obtaining the most sensible reasoning clues. 
For message passing, GAR simulates a top-down reasoning and a bottom-up reasoning, which is aimed at making the best use of the above obtained reasoning clues. According to the experimental results, GAR even outperforms several competitors relying on pre-trained language models and filter-reader pipelines, which implies that GAR benefits a lot from its GNN methods. On this basis, GAR can further benefit from applying pre-trained language models, but pre-trained language models can mainly facilitate the within-passage reasoning rather than cross-passage reasoning of GAR. Moreover, compared with the competitors constructed as filter-reader pipelines, GAR is not only easier to train, but also more applicable to the low-resource cases. For unstructured OOD knowledge, we study the neural-based knowledge transfer in Natural Language Understanding (NLU), and focus on the neural-based knowledge transfer between languages, which is also known as Cross-Lingual Transfer Learning (CLTL). To facilitate the CLTL of NLU models, especially the CLTL between distant languages, we propose a novel CLTL model named Translation Aided Language Learner (TALL), where CLTL is integrated with Machine Translation (MT). Specifically, we adopt a pre-trained multilingual language model as our baseline model, and construct TALL by appending a decoder to it. On this basis, we directly fine-tune the baseline model as an NLU model to conduct CLTL, but put TALL through an MT-oriented pre-training before its NLU-oriented fine-tuning. To make use of unannotated data, we implement the recently proposed Unsupervised Machine Translation (UMT) technique in the MT-oriented pre-training of TALL. According to the experimental results, the application of UMT enables TALL to consistently achieve better CLTL performance than the baseline model without using more annotated data, and the performance gain is relatively prominent in the case of distant languages.
Open Access
Misinformation Identification Using Natural Language Processing
(2022-08-08) Ou, Jia Ying; Nguyen, Uyen T.
The popularity of social media has accelerated the speed and scope of fake news propagation, and exacerbated the harm caused by false information. Identifying misinformation is crucial to maintaining a country's political, social, and financial stability and its democracy. In this thesis, we study the problem of misinformation identification using natural language processing (NLP). Given a claim, our approach classifies it as true, partly true or false using a set of news articles whose contents are related to the claim. The set of related articles, collected from reputable sources, serves as the ground truth to assess the validity of the claim. Using this approach to misinformation identification, the contributions of this thesis address the following research problems: 1. We constructed a new large-scale, feature-rich dataset of COVID-19 news and facts for research on COVID-19 misinformation, which is named COVMIS. We provide a comprehensive analysis of the dataset to better understand the data, including claim contents, article contents, publication dates, news sources, and country distribution. We also discuss potential use cases to demonstrate the benefits of the dataset for research on COVID-19-related misinformation and other areas. 2. We conducted two sets of extensive experiments to evaluate several state-of-the-art transformer-based NLP models using the COVMIS dataset. The models that were evaluated are BERT (Bidirectional Encoder Representations from Transformers), DistilBERT, XLNet (Generalized Autoregressive Pretraining for Language Understanding), ALBERT (A Lite BERT), and RoBERTa (Robustly Optimized BERT Pre-training Approach). The first set of experiments shows that BERT performs the best in terms of F1 score.
In the second set of experiments, we evaluated an optimization: instead of inputting all articles related to a claim to classify the claim, we extracted and input only a subset of K sentences (e.g., K = 5) that are the most relevant to the claim. Experimental results show that this optimization improves the performance of the models in terms of accuracy, F1 score, precision and recall, given different values of K. 3. We conducted two sets of extensive experiments on a news classification model based on BERT and evaluated the performance of the model in terms of accuracy, F1 score, precision, and recall. We used two datasets: (i) the general news dataset provided by the Fake News Challenge competition and (ii) the COVMIS dataset mentioned above. The first set of experiments was designed to answer the question of whether narrowing down the domain of knowledge (i.e., COVID-related news vs. general news) will improve the classification performance. Our experimental results show that the classification performance of the model improves significantly when the domain of knowledge of the dataset is narrowed down to a specific area of interest, COVID-19 in this case. The second set of experiments quantified how obsolete training data affect the classification performance. Our experimental results show that the more up-to-date the training data (relative to the test data), the better the classification performance.
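The optimization described in contribution 2, keeping only the K sentences most relevant to a claim, can be sketched in Python as follows. The abstract does not state how relevance is measured, so the word-overlap (Jaccard) score used here is an assumption chosen for simplicity, and the function names `tokenize`, `jaccard` and `top_k_sentences` are hypothetical.

```python
import re


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap of two token sets (0.0 when either set is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def top_k_sentences(claim, sentences, k=5):
    """Return the k sentences whose tokens overlap most with the claim.

    The relevance score is plain word overlap, an assumption standing in
    for whatever relevance measure the thesis actually uses. Ties keep
    their original order because Python's sort is stable.
    """
    claim_tokens = tokenize(claim)
    ranked = sorted(
        sentences,
        key=lambda s: jaccard(claim_tokens, tokenize(s)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical claim and article sentences, for illustration only.
claim = "masks reduce virus spread"
article = [
    "The weather was sunny today.",
    "Studies show masks reduce the spread of the virus.",
    "Masks are sold in many stores.",
]
selected = top_k_sentences(claim, article, k=2)
```

Only the selected sentences, rather than the full articles, would then be passed to the classifier together with the claim.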
Open Access
Graph Attention Networks for Graph Learning and Its Applications
(2022-08-08) Zhu, Runjie; Huang, Jimmy
This thesis investigates the recent development of graph attention network (GAT) models in the following three aspects: (1) GATs on single graph learning via the Knowledge Graph Embeddings (KGE) task, (2) GATs on multiple graph learning via the Cross-lingual Entity Alignment (CEA) task, and (3) GATs on ongoing real-world problems via the COVID-19 node classification task. These three aspects of research complement each other by covering a wide range of graph learning tasks to demonstrate the effectiveness and robustness of GAT-based models. First, GAT has recently demonstrated its strengths in the KGE task. Although GAT has proven promising in achieving state-of-the-art (SOTA) performance in KGE, the performance of current GAT-based models is still largely restrained. In this thesis, we propose a novel bidirectional graph attention network (BiGAT) which leverages GATs to learn hierarchical neighbor propagation in a bidirectional manner. Second, past studies of multiple graph learning for CEA tend to use traditional approaches to find equivalent entities in the counterpart knowledge graph (KG). These traditional methods tend to miss important structural information beyond entities in the modeling process. Many GNN-based models model KGs independently for embedding learning. Moreover, they tend to either underrate the usefulness of pre-aligned links or utilize only a few pre-aligned entities to connect different KGE spaces. These characteristics largely restrain model performance. To address these issues, we propose two novel GAT-based models, Contextual Alignment Enhanced Cross Graph Attention Network (CAECGAT) and Dual Gated Graph Attention Network with Dynamic Iterative Training (DuGa-DIT), to effectively learn embeddings from different KGs, to capture more neighborhood features, and to propagate more significant cross-KG information through pre-aligned seed alignments.
Third, recent studies on graph learning for COVID-19 have shown the possibility of leveraging deep-learning models to classify infected cell nodes. Thus, another important part of this thesis dives into designing an effective GAT-based model for node classification. Our proposed method, graph attention capsule network (GACapNet), has delivered significantly better results than baseline models. Moreover, our study could also indicate predictive features to help close existing knowledge gaps in the pathogenesis of COVID-19 pneumonia.
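For context, the standard single-head graph attention layer that GAT-based models generally build on can be written as follows. This is the generic formulation, not the specific BiGAT, CAECGAT, DuGa-DIT, or GACapNet architectures, whose details the abstract does not give. The symbols are assumptions for illustration: $\mathbf{h}_i$ is the feature vector of node $i$, $\mathbf{W}$ a shared weight matrix, $\mathbf{a}$ a learned attention vector, $\Vert$ concatenation, $\mathcal{N}_i$ the neighborhood of node $i$, and $\sigma$ a nonlinearity.

```latex
\begin{align}
e_{ij} &= \mathrm{LeakyReLU}\!\left(\mathbf{a}^{\top}\left[\mathbf{W}\mathbf{h}_i \,\Vert\, \mathbf{W}\mathbf{h}_j\right]\right) \\
\alpha_{ij} &= \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})} \\
\mathbf{h}_i' &= \sigma\!\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}\, \mathbf{W}\mathbf{h}_j\right)
\end{align}
```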
Open Access
Machine Learning-based Intrusion Detection Under Adversarial Influence: Application to MAC Spoofing Detection in IoT Networks
(2022-08-08) Madani Kochak, Seyed Pooria; Vlajic, Natalija
The Internet of Things (IoT) has brought a greater prevalence of smart objects with higher connectivity between them. Today, millions of such smart devices control critical infrastructure, such as nuclear power plants, and bring a new form of comfort to our homes through smart appliances. As such, IoT devices have become valuable targets for (potentially state-sponsored) adversaries, who most often seek cyberattacks with physical ramifications. Taking over a trusted node's identity (also known as identity spoofing) in an IoT network can enable more sophisticated multi-tier attacks against other nodes/resources in the same network. Thus, detecting identity spoofing attacks must be part of any sound defensive measure when protecting IoT networks. Several learning-based detection schemes have been proposed in the literature that attempt to detect identity attacks (i.e., MAC spoofing) in wireless networks. However, the proposed learning-based methods are highly susceptible to adversarial evasion attacks (one of the main themes studied in this manuscript), and a sophisticated adversary could circumvent detection by identifying "blind spots" in the learning algorithms of the proposed approaches. In this dissertation, we have extensively studied the use of randomization (another major theme) from both defensive and offensive perspectives to add robustness to existing learning-based MAC spoofing detection methods. Specifically, we have proposed a randomization scheme that can be added to existing learning-based detection approaches to increase the adversary's uncertainty when searching for an optimal evasion strategy. Moreover, we have also proposed an adversarial search approach based on active learning that an adversary could use to mount an optimal evasion attack against detection classifiers that utilize randomization.
Finally, we have proposed a novel multi-model MAC spoofing detection system based on deep autoencoders, which has been specifically designed and tested for IoT networks deployed in adversarial settings by taking into account environmental variabilities induced by moving objects.