Computer Science

Permanent URI for this collection

https://hdl.handle.net/10315/38508

Open Access
A System for Plan Recognition in Discrete and Continuous Domains
(2022-08-08) Scheuhammer, Alistair Sterling Benger; Lesperance, Yves
For my thesis I seek to implement a programming framework which can be used to model and solve plan recognition problems. My primary goal for this system is for it to be able to easily handle continuous probability spaces as well as discrete ones. My framework is based primarily on the probabilistic situation calculus developed by Belle and Levesque, and is an extension of a programming language developed by Levesque called Ergo. The system I have built allows one to specify complex domains and dynamic models at a high-level and is written in a language which is user-friendly and easy to understand. It has strong formal foundations, can be used to compare multiple different plan recognition methods, and makes it easier to perform plan recognition in tandem with other forms of reasoning, such as threat assessment, reasoning about action, and planning to respond to the actions performed by the observed agent.
Open Access
A Wait-free Queue with Poly-logarithmic Worst-case Step Complexity
(2023-03-28) Naderibeni, Hossein; Ruppert, Eric
In this work, we introduce a novel linearizable wait-free queue implementation. Linearizability and lock-freedom are standard requirements for designing shared data structures. To the best of our knowledge, all of the existing linearizable lock-free queues in the literature have a common problem in their worst case, called the CAS Retry Problem. We show that our algorithm avoids this problem with the helping mechanism which we use and has a worst-case running time better than prior lock-free queues. The amortized number of steps for an Enqueue or Dequeue in our algorithm is O(log^2 p + log q), where p is the number of processes and q is the size of the queue when the operation is linearized.
Open Access
Active Observers in a 3D World: Human Visual Behaviours for Active Vision
(2022-12-14) Solbach, Markus Dieter; Tsotsos, John K.
Human-like performance in computational vision systems is yet to be achieved. In fact, human-like visuospatial behaviours, a crucial capability for any robotic system whose role is to be a real assistant, are not well understood. This dissertation examines human visual behaviours involved in solving a well-known visual task: the Same-Different Task. It is used as a probe to explore the space of active human observation during visual problem-solving. It asks a simple question: “are two objects the same?”. To study this question, we created a set of novel objects with known complexity to push the boundaries of the human visual system. We wanted to examine these behaviours as opposed to the static, 2D, display-driven experiments done to date. We thus needed to develop a complete infrastructure for an experimental investigation using 3D objects and active, free, human observers. We have built a novel, psychophysical experimental setup that allows for precise and synchronized gaze and head-pose tracking to analyze subjects performing the task. To the best of our knowledge, no other system provides the same characteristics. We have collected detailed, first-of-its-kind data of humans performing a visuospatial task in hundreds of experiments. We present an in-depth analysis of different metrics of humans solving this task; subjects demonstrated up to 100% accuracy for specific settings, and no trial used fewer than six fixations. We provide a complexity analysis that reveals human performance in solving this task is about O(n), where n is the size of the object. Furthermore, we discovered that our subjects used many different visuospatial strategies and showed that they are deployed dynamically. Strikingly, no learning effect was observed that affected the accuracy. With this extensive and unique data set, we addressed its computational counterpart.
We used reinforcement learning to learn the three-dimensional same-different task and discovered crucial limitations which were overcome only if the task was simplified to the point of trivialization. Lastly, we formalized a set of suggestions to inform the enhancement of existing machine learning methods based on our findings from the human experiments and multiple tests we performed with modern machine learning methods.
Open Access
Analyzing Color Imaging Failure on Consumer Cameras
(2022-12-14) Tedla, SaiKiran Kumar; Brown, Michael S.
There are currently many efforts to use consumer-grade cameras for home-based health and wellness monitoring. Such applications rely on users to use their personal cameras to capture images for analysis in a home environment. When color is a primary feature for diagnostic algorithms, the camera requires color calibration to ensure accurate color measurements. Given the importance of such diagnostic tests for the users' health and well-being, it is important to understand the conditions in which color calibration may fail. To this end, we analyzed a wide range of camera sensors and environmental lighting to determine (1) how often color calibration failure is likely to occur and (2) the underlying reasons for failure. Our analysis shows that in well-lit environments, it is rare to encounter a camera sensor and lighting condition combination that results in color imaging failure. Moreover, when color imaging does fail, the cause is almost always attributable to spectrally poor environmental lighting and not the camera sensor. We believe this finding is useful for scientists and engineers developing color-based applications with consumer-grade cameras.
Open Access
Chart Question Answering with Visual and Logical Reasoning
(2022-12-14) Masry, Ahmed; Prince, Enamul Hoque
Charts are very popular for analyzing data. When exploring charts, people often ask complex reasoning questions that involve several logical and arithmetic operations. They also commonly refer to visual features of a chart in their questions. However, most existing datasets do not focus on such complex reasoning questions, as their questions are template-based and answers come from a fixed vocabulary. In this thesis work, we present a large-scale benchmark covering 9.6K human-written questions and 23.1K questions generated from human-written chart summaries. To address the unique challenges in our benchmark involving visual and logical reasoning, we present transformer-based models that combine visual features and the data table of the chart. Moreover, we propose chart-specific pretraining tasks that improve the visual and logical reasoning skills of our models. While our models achieve state-of-the-art results on the previous datasets and our benchmark, the evaluation also reveals several challenges in answering complex reasoning questions.
Open Access
Comparative Studies of Gesture-Based and Sensor-Based Input Methods for Mobile User Interfaces
(2021-11-15) Garg, Saurabh; MacKenzie, I. Scott
Three user studies were conducted to compare gesture-based and sensor-based interaction methods. The first study compared the efficiency and speed of three scroll navigation methods for touch-screen mobile devices: Tap Scroll (touch-based), Kinetic Scroll (gesture-based), and Fingerprint Scroll (our newly introduced sensor-based method). The second study compared the accuracy and speed of three zoom methods. The first was GyroZoom, which uses the mobile phone's gyroscope sensor. The second was the gesture-based Pinch-to-Zoom method. The third, VolumeZoom, uses volume buttons that were reprogrammed to perform zoom operations. The third study, on text entry, compared a QWERTY-based onscreen keyboard with a novel 3D gesture-based Write-in-Air method, which utilizes webcam sensors. Our key findings from the three experiments are that sensor-based interaction methods are intuitive and provide a better user experience than gesture-based interaction methods. The sensor-based methods were on par with the speed and accuracy of the gesture-based methods.
Open Access
Computer Vision for Hockey Video Curation
(2022-12-14) Pidaparthy, Hemanth; Elder, James
Computer vision-based models are being actively investigated for tasks such as ball and player tracking. These insights are useful for both coaches and players to improve performance. Applying computer vision-based solutions for hockey video analysis is challenging because of the small size of the puck, fast and non-smooth movement of the players, and frequent occlusions. In this thesis, I present my research work on computer vision for hockey video curation. I discuss three problems: 1) automatic sports videography, 2) play segmentation of hockey videos and 3) automatic homography estimation. When recording broadcast hockey videos, professional cameramen move a PTZ camera to follow the play. Professional videography is expensive for amateur games and this motivates the development of a low-cost solution for automatic hockey videography. We used a novel method for accurate ground truth of the puck location from wide-field video. We trained a novel deep network regressor to estimate the puck location on each frame. Centered around the predicted puck location, we dynamically cropped the wide-field video to generate a zoomed-in video following the play. The automatic videography system delivers continuous video over the entire game. Typical hockey games that feature 40-60 minutes of active play are played over 60-110 minutes with breaks in play due to warm-ups and fouls. We propose a novel solution for automatically identifying periods of play and no-play, and generate a temporally compressed video that is easier to watch. We combine visual cues from the output of a deep network classifier with auditory cues from the referee's whistle using a hidden Markov model (HMM). Since the PTZ parameters of the camera are constantly varying when recording broadcast hockey videos, the homography changes every frame. Knowing this homography allows for the projection of graphics onto the ice surface. 
We estimate the homography by exploiting the consistency of colours used for markings in ice hockey. We model the colours as a multi-variate Gaussian and then use a two-step approach to search for the homography that aligns the colours of the template image to that of a test image.
Open Access
Design Justice Principles and Do-It-Yourself Assistive Technology: Case Study
(2022-03-03) Poustizadeh, Mana; Baljko, Melanie
In this project, we focus on the Principles of Design Justice, as developed by the Design Justice Network, a community committed to challenging structural inequalities of design. Our thesis research project is aligned with the premise of user-centered design and the situated knowledge in the third paradigm of HCI. We examine some of the current processes for Do-It-Yourself Assistive Technology (DIY-AT) development and deployment using the works of Makers Making Change (MMC). MMC connects the makers of DIY-AT devices to people who need AT devices. We also examine the impacts of the ongoing COVID-19 pandemic on the need for DIY-AT and the challenges it might have caused. Our findings include MMC's positive impact on DIY-AT service delivery, its engagement of local makers in making DIY-AT, and its modest progress in integrating Design Justice Principles. The findings of our study also suggest an increase in the demand for AT due to the pandemic.
Open Access
Efficient Mining of Active Components in a Network of Time Series
(2022-08-08) Shafieesabet, Mahta; Papangelis, Emmanouil
Let a network of time series be a set of nodes assuming an underlying network structure, where each node is associated with a discrete time series. Road networks, the human brain, and online social media are a few examples of domain-specific applications that can be modelled as networks of time series. Now assume that the sequence of time series data points observed on a node determines whether the node is on (active) or off (inactive). Then, at each time step, a set of induced subgraphs can be formed from the subset of active nodes; we call these induced subgraphs active components. In this research, our goal is to efficiently detect and maintain/report the active components over time.
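The definition of active components above can be illustrated with a minimal Python sketch. It assumes that a node counts as active when its current value exceeds a threshold (the abstract does not state the activation rule), and it reads an active component as a connected component of the subgraph induced by the active nodes. All names and the toy data are hypothetical, and the sketch recomputes everything from scratch at each time step, so it is a naive baseline rather than the efficient method the thesis targets.

```python
from collections import deque


def active_at(series, t, threshold=0):
    """Nodes whose value at time step t exceeds the (assumed) threshold."""
    return {node for node, values in series.items() if values[t] > threshold}


def active_components(adjacency, active):
    """Connected components of the subgraph induced by the active nodes."""
    active = set(active)
    seen = set()
    components = []
    for start in sorted(active):
        if start in seen:
            continue
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency.get(node, ()):
                # Only follow edges whose both endpoints are active.
                if neighbour in active and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical path network a-b-c-d-e with two time steps per node.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
series = {"a": [1, 0], "b": [1, 1], "c": [0, 1], "d": [1, 1], "e": [1, 0]}

for t in range(2):
    print(t, active_components(adjacency, active_at(series, t)))
```

At step 0 node c is inactive, which splits the path into two components; at step 1 the active nodes b, c and d form a single component.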
Open Access
Enabling Accessible Charts Through Interactive Natural Language Interface for People with Visual Impairments
(2023-03-28) Alam, Md Zubair Ibne; Prince, Enamul Hoque
Web-based data visualizations have become very popular for exploring data and communicating insights. Newspapers, journals, and reports regularly publish visualizations to tell compelling stories with data. Unfortunately, most visualizations are inaccessible to readers with visual impairments. For many charts on the web, there are no accompanying alternative (alt) texts, and even if such texts exist they do not adequately describe important insights from charts. To address the problem, we first interviewed 15 blind users to understand their challenges and requirements for reading data visualizations. Based on the insights from these interviews, we developed SeeChart, an interactive tool that automatically deconstructs charts from web pages and then converts them to accessible visualizations for blind people by enabling them to hear the chart summary as well as to interact through data points using the keyboard. Our evaluation with 14 blind participants suggests the efficacy of SeeChart in understanding key insights from charts and fulfilling their information needs while reducing their required time and cognitive burden.
Open Access
Enriching Word Representation Learning for Affect Detection and Affect-Aware Recommendations
(2021-03-08) Dehaki, Nastaran Babanejad; An, Aijun; Papangelis, Emmanouil
Affect detection from text aims to identify affective states such as mood, sentiment and emotions in textual data. The main affective tasks, including sentiment analysis, emotion classification and sarcasm detection, have been popular in recent years due to a broad range of relevant applications in various domains. Traditionally, recommendations deal with applications having only two types of entities, users and items, and do not put them into a context when providing recommendations. Recently, contextual recommendations provide more accurate recommendations by considering more contextual information. However, little attention has been paid to affective context and its relation to recommendations. In this dissertation, we first investigate the impact of using affective information on the quality of recommendations, and then seek to improve affect detection in text by enhancing word representation learning. We enrich word representations in two ways: first by effective pre-processing when training word embeddings, and second by incorporating both affective and contextual features deeply into text representations. We demonstrate the benefits of enriched word representations in both affect detection and affect-aware recommendation tasks. This dissertation consists of five contributions. First, we investigate whether, and to what extent, emotion features can improve recommendations. Towards that end, we derive a number of emotion features that can be attributed to both items/users in the domain of news and music. Then, we devise state-of-the-art emotion-aware recommendation models by systematically leveraging these features. Second, we study the problem of pre-processing in word representation learning for affective tasks. Most early models of affective tasks employed pre-trained word embeddings. While pre-processing in affective systems is well-studied, text pre-processing for training word embeddings is not. 
To address this limitation, we conduct a comprehensive analysis of the role of text pre-processing techniques in word representation learning for affective analysis by applying each pre-processing technique first at the embedding training phase, commonly ignored in pre-trained word vector models, then at the downstream task phase. Third, we investigate the usefulness of customized pre-processing for word representation learning for affective tasks. We argue that using numerous text pre-processing techniques at once as a general combination for all affective tasks decreases the performance of affect detection. Therefore, we conduct extensive experiments, showing that an appropriate combination of text pre-processing methods for each affective task can significantly enhance the classifier's performance. The fourth contribution is to study the role of affective and contextual embeddings with deep neural network models for affect detection. Early word embedding methods, such as Word2vec, are non-contextual, meaning that a word has the same embedding vector independent of its context and sense. Contextual embedding techniques, such as BERT, solve this problem but do not incorporate affect information in their word representations. We propose two novel deep neural network models that extend BERT to incorporate both affective and contextual features in text representations. Lastly, we show the usefulness of our proposed affective and contextual embedding models by applying them to affect-aware recommendations.
Open Access
Ensuring Fairness Despite Differences in Environment
(2021-07-06) Singh, Karan Deep; Edmonds, Jeff; Urner, Ruth
Several fairness definitions have been proposed in the machine learning literature to rectify the issue of demographic groups being treated differently. Given the substantial research in the field, this work aims to provide an entry-level overview of the common definitions and metrics that are essential for a novice reader in the field. In addition, we propose a theorem and examine the population distributions and conditions under which its claim holds, namely that the disadvantaged individual is expected to be more talented than the similarly performing advantaged individual. Finally, this work summarizes six research works and discusses whether the result of our theorem is consistent in each work's model settings, culminating in a discussion of how all the authors view the world in terms of a group's talent distribution.
Open Access
Evaluating and Forecasting the Operational Performance of Road Intersections
(2022-12-14) NematiChari, Ali; Papangelis, Emmanouil
Road intersections represent one of the most complex configurations encountered when traversing road networks. It is therefore of vital importance to improve their operational performance, as that can significantly contribute towards the efficiency of the whole transport network. Traditional approaches to improve the efficiency of intersections are based on analysis of static data or expert opinions. However, due to the advancements on Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication technologies, it is possible to enhance safety and improve road intersection efficiency by continuously monitoring traffic conditions and enabling situational awareness of vehicle drivers. Towards this end, we design, develop and evaluate a system for evaluating and forecasting the operational performance of road intersections by mining streams of V2I data. Our system makes use of graph mining and trajectory data mining methods to continuously evaluate a set of well-defined measures of effectiveness (MOEs) for traffic operations at different levels of road network abstraction. In addition, the system enables interactive analysis and exploration of the various MOEs. The system architecture and methods are general and can be used in various settings requiring continuous monitoring and/or forecasting of the road network state.
Open Access
Exploiting Reward Machines with Deep Reinforcement Learning in Continuous Action Domains
(2023-03-28) Sun, Haolin; Lesperance, Yves
Deep reinforcement learning can solve real-world robot control problems, such as autonomous driving and robotic arm manipulation. In deep reinforcement learning, an agent does not know the problem description and learns the optimal solution through trial-and-error. This method brings two major challenges when solving real-world problems: partial observability and learning efficiency. In this thesis, we address these two challenges and extend previous work. First, we use reward machines to address the problem of partial observability. Then, we focus on finding the existing cutting-edge deep reinforcement learning algorithms and integrating them with reward machines to enhance the learning efficiency. To test the performance of all the algorithms, we proposed a series of different tasks that can be used to mimic real-world robot control problems. Finally, based on the test results, we compare the performance of all the algorithms and analyze their advantages and disadvantages.
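For readers unfamiliar with reward machines, the idea can be sketched as a small finite-state machine that consumes high-level events and emits rewards. Because the machine state records which sub-goals have already been reached, appending it to the agent's observation is one way to supply the memory that a partially observable task lacks. The class, the event names, and the two-stage pick-then-deliver task below are hypothetical illustrations and are not taken from the thesis.

```python
class RewardMachine:
    """Finite-state machine mapping (state, event) to (next state, reward)."""

    def __init__(self, initial, transitions, terminal):
        self.initial = initial
        self.transitions = transitions  # {(state, event): (next_state, reward)}
        self.terminal = set(terminal)
        self.state = initial

    def reset(self):
        self.state = self.initial
        return self.state

    def step(self, event):
        # Events with no matching transition leave the state unchanged.
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return next_state, reward, next_state in self.terminal


# Hypothetical task: pick up an object (u0 -> u1), then deliver it (u1 -> u2).
machine = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "picked"): ("u1", 0.0),
        ("u1", "delivered"): ("u2", 1.0),
    },
    terminal={"u2"},
)

machine.reset()
trace = [machine.step(event) for event in ["delivered", "picked", "delivered"]]
print(trace)
```

In this sketch, delivering before picking up earns nothing, which is the kind of history-dependent reward that a memoryless reward function over raw observations cannot express.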
Open Access
Exploring Low-Dimensional Structures in Images Using Deep Fourier Machines
(2023-03-28) Asadi, Behnam; Jiang, Hui
The ground-breaking results achieved by Deep Generative Models, when given merely a dataset representing the desired distribution of generated images, have caught the interest of scholars. In this work, we introduce a novel structure designed for image generation utilizing the idea behind Fourier Series and Deep Learning function composition. By composing low-dimensional structures, we first compress a high-dimensional image, and then we use this latent space to generate fake images. Our compression algorithm gives comparable results to the JPEG algorithm and even, in some cases, outperforms it. Also, our image generation model can generate decent fake images on MNIST and CIFAR-10 datasets and can surpass the first generation of Variational Autoencoders.
Open Access
Exploring the Impact of Immersion on Situational Awareness and Trust in Remotely Monitored Maritime Autonomous Surface Ships
(2023-03-28) Gregor, Alexander William Heinz; Allison, Robert
This thesis examines how Situational Awareness (SA) and Trust, along with some exploratory variables, were affected by different immersion levels in maritime remote monitoring. To examine this, a simulated Shore Control Centre (SCC) interface for Maritime Autonomous Surface Ships (MASS) was constructed, which had an autonomous container ship traversing the arctic with robotic aids. Three query sets were asked per simulation run, which facilitated tracking how SA, Trust, and Motion Sickness (MS) evolved over time. Three different virtual reality (VR) interfaces were used: Non-Immersive VR (NVR), Semi-immersive VR (SVR), and Immersive VR (IVR). The simulation and query sets were administered in a counterbalanced within-subjects user study with 39 participants. The results illustrated various trade-offs, with NVR showing higher user preference, SVR showing signs of higher SA, and IVR showing moderate Trust but increased MS. Understanding these trade-offs between immersion levels is a requisite step for designing future SCCs.
Open Access
Exploring the Scalability, Throughput and Security Characteristics of the Tangle Distributed Ledger Technology through Simulation Analysis
(2021-03-08) Madenouei, Nahid Alimohammadi; Liaskos, Sotirios
During the past few years, blockchain technologies have attracted substantial attention from researchers, engineers, and enterprises. These technologies provide decentralized platforms to validate different types of transactions without relying on a central authority. Since the advent of the popular Bitcoin network [2], [3], various types of protocols, among them DAG-based distributed ledgers, have been offered to improve the Bitcoin network's shortcomings in the areas of scalability, throughput, and security. Researchers have also been motivated to study blockchain-based applications across multiple domains with novel designs and capabilities [4]. However, when used in such capacities, failures of blockchain networks (BNs) imply catastrophes that extend beyond individuals, organizations, and countries [5]. In light of such mission criticality, the vision of turning BNs into decentralized transaction processing systems that can sustainably and securely compete with the centralized state of the art poses substantial methodological challenges for researchers [5]. Before being considered for wide adoption, BN protocols and technologies must be assessed with different analytical and empirical methodologies. In this research, a novel DAG-based distributed ledger, the Tangle, has been studied and several of its features have been investigated through simulation analysis. Contributions of this research are two-fold. First, a Tangle simulator has been designed and implemented based on model-oriented design and development principles. Second, three important characteristics of Tangle networks, namely scalability, throughput, and security against double-spending attacks, have been investigated.
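To make the DAG structure concrete, here is a toy Python sketch of Tangle-style growth. It is not the simulator described in the abstract. It assumes that each new transaction approves up to two currently unapproved transactions (tips) chosen uniformly at random, and it models network delay crudely by letting all transactions that arrive in the same step see the same snapshot of tips. The function name, parameters, and batching model are hypothetical.

```python
import random


def simulate_tangle(steps, arrivals_per_step=3, seed=0):
    """Grow a toy Tangle; return (approvals per transaction, current tips)."""
    rng = random.Random(seed)
    approves = {0: []}  # transaction id -> ids it approves; 0 is the genesis
    tips = {0}          # transactions not yet approved by any other
    next_id = 1
    for _ in range(steps):
        # Snapshot: arrivals within one step cannot see each other.
        visible = sorted(tips)
        newly_approved = set()
        new_ids = []
        for _ in range(arrivals_per_step):
            # Two uniform draws; they may coincide, giving a single approval.
            chosen = {rng.choice(visible), rng.choice(visible)}
            approves[next_id] = sorted(chosen)
            newly_approved |= chosen
            new_ids.append(next_id)
            next_id += 1
        tips -= newly_approved
        tips.update(new_ids)
    return approves, tips


approves, tips = simulate_tangle(steps=20, arrivals_per_step=3, seed=1)
print(len(approves), "transactions,", len(tips), "tips")
```

Raising `arrivals_per_step` in this model increases the number of transactions that attach to the same stale tips, which gives a rough feel for how arrival rate and delay affect the tip count that scalability studies track.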
Open Access
Fan-Fiction and AO3 Free-Form Tagging Practice: Innovating Open-Source Tools for Tag Network Visualization and Analysis
(2022-12-14) Bestard Lorigados, Elias; Baljko, Melanie
The thesis focuses on obtaining, representing, analyzing, and visualizing tag data within the Archive of Our Own (AO3) fan fiction archive, with a focus on the "Additional Tags" category. The work was undertaken in four components. First, network nomenclature was defined and formally-defined network representations were developed for the AO3 Additional Tag structure. Second, computational techniques were developed to scrape the data from AO3, both in its current state and in its archival state (from the Wayback Machine) and to store it in a format suitable for subsequent processing. Third, techniques were developed to use the scraped data in order to computationally instantiate and visualize the network structures that were developed in the first component. Finally, in the fourth component, a set of network analysis techniques were developed, which were used to extract quantitative characteristics of the networks, to identify relevant changes over time, and to identify co-creation actions between different users in the AO3 community.
Open Access
Fast Similarity Graph Construction via Data Sketching Techniques
(2022-03-03) Marefat, Hoorieh; An, Aijun; Papangelis, Manos
Graphs are mathematical structures used to model objects and their pairwise relationships. Due to their simple but expressive abstract representation, they are commonly used to model various types of relations and processes in technological, social or biological systems and have found numerous applications. A special type of graph is the similarity graph in which nodes represent entities and there is an edge connecting two nodes if the two entities are similar based on some similarity measure. In a typical scenario, raw data of entities are provided in the form of a relational dataset, matrix or a tensor and a similarity graph is built to facilitate graph-based analysis like node importance, node classification, link prediction, community detection, outlier detection, and more. The ability to construct similarity graphs fast is important and with a potential for high impact, thus several approximation techniques have been proposed. In this work, we propose data sketching based methods for fast approximate similarity graph construction. Data sketching techniques are applied on the raw data and are designed to achieve desired error guarantees. They can drastically reduce the size of raw data on which we operate, allowing for faster construction and analysis of similarity graphs, but with approximate results. This is a desirable tradeoff for many applications in diverse domains. Through a thorough experimental evaluation, we demonstrate that our sketching methods outperform sensible baselines and competitor methods proposed for the problem. First, they are much faster than exact methods while maintaining high accuracy in constructing the similarity graph. Furthermore, our methods demonstrate significantly higher accuracy than competitive methods on generic graph analysis tasks. We demonstrate the effectiveness of our methods on different real-world graph applications.
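The general idea of building a similarity graph from sketches rather than raw data can be illustrated with MinHash, a well-known sketch for Jaccard similarity between sets. The abstract does not say which sketching techniques the thesis uses, so MinHash is chosen here only as a familiar example. The function names, the threshold, and the toy entities are hypothetical, and each entity is assumed to be a non-empty set of items.

```python
import hashlib
from itertools import combinations


def minhash_signature(items, num_hashes=64):
    """Fixed-size sketch: the minimum hash of the set under each seeded hash."""
    signature = []
    for seed in range(num_hashes):
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(
                        f"{seed}:{item}".encode(), digest_size=8
                    ).digest(),
                    "big",
                )
                for item in items
            )
        )
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions estimates the Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def similarity_graph(entities, threshold=0.5, num_hashes=64):
    """Edges between entities whose estimated similarity meets the threshold."""
    signatures = {
        name: minhash_signature(items, num_hashes)
        for name, items in entities.items()
    }
    edges = set()
    for u, v in combinations(sorted(signatures), 2):
        if estimated_jaccard(signatures[u], signatures[v]) >= threshold:
            edges.add((u, v))
    return edges


# Hypothetical entities described by sets of tokens.
entities = {
    "x": {"apple", "banana", "cherry"},
    "y": {"apple", "banana", "cherry"},
    "z": {"plum", "quince", "raisin"},
}
print(similarity_graph(entities))
```

The comparison step works on 64 integers per entity regardless of how large the original sets are, which is where the speed-for-accuracy trade-off described above comes from. This sketch still compares all pairs of signatures; practical systems usually add an indexing step to avoid that.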
Open Access
Generative Adversarial Network (GAN) for Medical Image Synthesis and Augmentation
(2023-03-28) Liang, Zhaohui; Huang, Jimmy
Medical image processing aided by artificial intelligence (AI) and machine learning (ML) significantly improves medical diagnosis and decision making. However, the difficulty of accessing well-annotated medical images has become one of the main constraints on further improving this technology. The generative adversarial network (GAN) is a DNN framework for data synthesis, which provides a practical solution for medical image augmentation and translation. In this study, we first perform a quantitative survey of the studies on GAN for medical image processing published since 2017. Then a novel adaptive cycle-consistent adversarial network (Ad CycleGAN) is proposed. We use a malaria blood cell dataset (19,578 images) and a COVID-19 chest X-ray dataset (2,347 images), respectively, to test the new Ad CycleGAN. The quantitative metrics include mean squared error (MSE), root mean squared error (RMSE), peak signal-to-noise ratio (PSNR), universal image quality index (UIQI), spatial correlation coefficient (SCC), spectral angle mapper (SAM), visual information fidelity (VIF), Fréchet inception distance (FID), and the classification accuracy of the synthetic images. The CycleGAN and variational autoencoder (VAE) are also implemented and evaluated for comparison. The experiment results on malaria blood cell images indicate that the Ad CycleGAN generates more valid images compared to CycleGAN or VAE. The synthetic images by Ad CycleGAN or CycleGAN have better quality than those by VAE. The synthetic images by Ad CycleGAN have the highest accuracy of 99.61%. In the experiment on COVID-19 chest X-ray, the synthetic images by Ad CycleGAN or CycleGAN have higher quality than those generated by VAE. However, the synthetic images generated through the homogenous image augmentation process have better quality than those synthesized through the image translation process. 
The synthetic images by Ad CycleGAN have higher accuracy of 95.31% compared to the accuracy of the images by CycleGAN of 93.75%. In conclusion, the proposed Ad CycleGAN provides a new path to synthesize medical images with desired diagnostic or pathological patterns. It is considered a new approach of conditional GAN with effective control power upon the synthetic image domain. The findings offer a new path to improve the deep neural network performance in medical image processing.
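Three of the image-quality metrics listed above have simple closed forms, shown here as a minimal pure-Python sketch using the standard definitions of MSE, RMSE, and PSNR. The function names are hypothetical, the images are assumed to be flattened into equal-length lists of pixel intensities, and the default peak value of 255 assumes 8-bit images. This is not the evaluation code used in the thesis.

```python
import math


def mse(reference, synthetic):
    """Mean squared error between two equal-length pixel sequences."""
    if not reference or len(reference) != len(synthetic):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - s) ** 2 for r, s in zip(reference, synthetic)) / len(reference)


def rmse(reference, synthetic):
    """Root mean squared error, in the same units as the pixel values."""
    return math.sqrt(mse(reference, synthetic))


def psnr(reference, synthetic, max_value=255.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical inputs."""
    error = mse(reference, synthetic)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Hypothetical 2x2 images, flattened row by row.
reference = [0, 0, 0, 0]
synthetic = [0, 0, 0, 10]
print(mse(reference, synthetic), rmse(reference, synthetic), psnr(reference, synthetic))
```

Lower MSE and RMSE and higher PSNR indicate a synthetic image closer to its reference; the remaining metrics in the list (UIQI, SCC, SAM, VIF, FID) need more machinery and are not sketched here.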