Research Projects (As of Feb 2022)
Driving Behaviour Study using Human-Machine Interactions
(a) Driving Behaviour Study in Indian Traffic Scenario using Bus Driving Simulator (Ongoing)
(b) Understanding Drivers' Stress and Interactions with Vehicle Systems
Today, and probably for a long time to come, humans will remain an integral part of vehicles for driving tasks. Therefore, it is essential to understand how vehicles and drivers interact with each other and how drivers' behavior and physical and mental states affect vehicle performance and traffic safety. In this work, the relationship between driver and vehicle was explored in real-world driving conditions by analyzing large-scale naturalistic data collected from cars and drivers. We analyzed different types of driver-vehicle interactions during driving, investigated the effect of different driving conditions on drivers' stress, and explored the relationship between driver and vehicle in different driving conditions. The findings from this work could be used to help manage comfort-related in-vehicle intervention systems and could provide a continuous measure of how different external conditions (traffic, road, weather, etc.) affect drivers.
Milardo, S., Rathore, P., Santi, P., Ratti, C. "Understanding Drivers' Stress and Interactions with Vehicle Systems Through Naturalistic Data Analysis", IEEE Transactions on Intelligent Transportation Systems, 2021
Driving Behaviour Analysis
(a) Driving Behaviour Analysis of Professional Bus Drivers in Indian Traffic Scenario (Ongoing)
(b) Driving behaviour classification: (i) an unsupervised (explainable) approach; and (ii) a supervised (AutoML-based) approach
Traditional driving behaviour recognition algorithms leverage hand-crafted features extracted from raw driving data, and then apply user-defined machine learning models to identify driving behaviours. However, such solutions are limited by the set of selected features and by the chosen model, requiring extensive knowledge of the analyzed signals to perform reasonably.
In this work, two data-driven driving behaviour recognition frameworks are developed for professional drivers: (i) a simple yet efficient unsupervised, aggregation-based approach, and (ii) a supervised approach that combines automatic feature extraction and feature selection with a deep neural network architecture obtained using Automated Machine Learning (AutoML).
(i) Unsupervised (Explainable ML) approach
(ii) Supervised (AutoML) approach
Fake Reviewer Groups Detection from Digital Market
Online product reviews have become increasingly important in digital consumer markets, where they play a crucial role in most consumers' purchasing decisions. Unfortunately, spammers often exploit online reviews by writing fake reviews to promote or demote certain products. Most previous studies have focused on detecting fake reviews and individual fake reviewer IDs. However, to target a particular product, fake reviewers work collaboratively in groups and/or create multiple fake IDs to write reviews and control the sentiment around the product. This work addresses the problem of detecting such fake reviewer groups.
We proposed a top-down framework for detecting candidate fake reviewer groups, based on the DeepWalk approach applied to reviewers' graph data and a (modified) semi-supervised clustering method that can incorporate partial background knowledge.
Our experimental results demonstrated that the proposed approach is able to identify the candidate spammer groups with reasonable accuracy. This approach can also be extended to detect groups of opinion spammers in social media (e.g. fake comments or fake postings) with temporal affinity, semantic characteristics, and sentiment analysis.
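The random-walk stage of a DeepWalk-style pipeline can be sketched as follows. This is a minimal illustration, not the framework from this work: the edge rule (two reviewers are linked when they reviewed the same product), the toy review list, and the function names `build_reviewer_graph` and `random_walks` are all assumptions made for the example. In DeepWalk, the resulting walks would then be passed to a skip-gram model to learn reviewer embeddings, which a clustering method could group; those later stages are omitted here.

```python
import random
from collections import defaultdict


def build_reviewer_graph(reviews):
    """Link two reviewers when they reviewed the same product (assumed edge rule)."""
    by_product = defaultdict(set)
    for reviewer, product in reviews:
        by_product[product].add(reviewer)
    graph = defaultdict(set)
    for reviewers in by_product.values():
        for a in reviewers:
            for b in reviewers:
                if a != b:
                    graph[a].add(b)
    return graph


def random_walks(graph, walk_length=5, walks_per_node=2, seed=0):
    """Generate truncated uniform random walks starting from every node."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = sorted(graph[walk[-1]])
                if not neighbours:
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical (reviewer, product) pairs: r1-r3 co-review p1, r4-r5 co-review p2.
reviews = [("r1", "p1"), ("r2", "p1"), ("r3", "p1"),
           ("r4", "p2"), ("r5", "p2"), ("r1", "p3"), ("r2", "p3")]
graph = build_reviewer_graph(reviews)
walks = random_walks(graph)
```

Because reviewers that share products are connected, walks started inside one group tend to stay inside it, which is what makes the learned embeddings useful for finding candidate groups.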
Big Data Cluster Analysis
Every day, an abundant amount of data is generated from various sources such as IoT networks, smartphones, and social network activities. Making sense of such an unprecedented amount of data is essential for many businesses, services, and applications. Currently, there is little domain expertise to automate this big data analysis, and traditional supervised machine learning techniques suffer from a lack of labelled training data in this context. The aim is to develop scalable and efficient unsupervised algorithms to manage and extract actionable information from big data.
Cluster analysis is a useful unsupervised approach to discover the underlying groups and useful patterns in the data. Cluster analysis for any data consists of three problems: (P1) cluster assessment, which asks "Do the data have clusters? If yes, how many?"; (P2) clustering, i.e., partitioning the data into clusters; and (P3) cluster validity, which asks "Are the clusters found useful? Is there a better partition we did not find?" Traditional cluster analysis algorithms are not suitable for big data owing to its volume, variety, and velocity.
A suite of novel scalable algorithms was developed to solve each of the three problems of cluster analysis, namely cluster assessment, clustering, and cluster validity, for big data that may be high-dimensional, anomalous, and streaming.
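To make the cluster validity question (P3) concrete, the sketch below computes the classic Dunn's index: the smallest distance between points in different clusters divided by the largest cluster diameter, so that larger values indicate compact, well-separated clusters. This is the exact, quadratic-time textbook definition with Euclidean distance, not the scalable approximation developed in this work; the function name `dunn_index` and the toy points are assumptions for illustration.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Classic Dunn's index: min inter-cluster distance / max cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("Dunn's index needs at least two clusters")
    # Largest distance between two points of the same cluster.
    max_diameter = max(
        (math.dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points of different clusters.
    min_separation = min(
        math.dist(a, b)
        for ca, cb in combinations(clusters.values(), 2)
        for a in ca
        for b in cb
    )
    return math.inf if max_diameter == 0 else min_separation / max_diameter


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = dunn_index(points, [0, 0, 0, 1, 1, 1])  # matches the two visible groups
bad = dunn_index(points, [0, 1, 0, 1, 0, 1])   # mixes the two groups
```

On this toy data the partition that matches the two groups scores well above 1, while the mixed partition scores below 1. Every pairwise distance is evaluated, which is why the exact index becomes impractical for very large datasets.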
Rathore, P., Ghafoori, Z., Bezdek, J. C., Palaniswami, M., Leckie, C. (2018). Approximating Dunn's cluster validity indices for partitions of big data. IEEE Transactions on Cybernetics, (99), 1-13.
Rathore, P., Kumar, D., Bezdek, J. C., Rajasegarar, S., Palaniswami, M. (2018). A rapid hybrid clustering algorithm for large volumes of high dimensional data. IEEE Transactions on Knowledge and Data Engineering, 31(4), 641-654.
Rathore, P., Bezdek, J. C., Erfani, S. M., Rajasegarar, S., Palaniswami, M. (2017). Ensemble fuzzy clustering using cumulative aggregation on random projections. IEEE Transactions on Fuzzy Systems, 26(3), 1510-1524.
Structure Assessment in Streaming Data
The widespread use of Internet of Things (IoT) technologies, smartphones, and social media services generates huge amounts of data streaming at high velocity. Automatic interpretation of these rapidly arriving data streams is required for the timely detection of interesting events that usually emerge in the form of clusters. At present, there is no technique on offer for visual assessment of evolving cluster structure in high-velocity, massive data streams.
The visual assessment of cluster tendency (VAT) model produces a record of structural evolution in the data stream by building a cluster heat map of the entire processing history in the stream. Existing VAT-based algorithms for streaming data are not suitable for high-velocity, high-volume streams because their memory requirements grow and their processing slows as the accumulated data increases. The scalable iVAT (siVAT) algorithm can handle big batch data, but for streaming data it needs to be (re)applied every time a new datapoint arrives, which is not feasible because of the associated computational complexity. The aim is to develop an online algorithm for tracking evolving cluster structures in high-velocity, high-dimensional data streams. An incremental version of the scalable iVAT algorithm is developed for change detection and structural assessment in high-velocity data streams.
Rathore, P., Kumar, D., Bezdek, J. C., Rajasegarar, S., Palaniswami, M. (2020). "Visual Structure Assessment and Anomaly Detection for High-Velocity Data Streams", (minor revision) in IEEE Transactions on Cybernetics.
The developed algorithm is illustrated with a 2D synthetic dataset which evolves significantly over time. The algorithm produces a reordered dissimilarity image (RDI), or cluster heat map (a square digital image), for cluster assessment, which is updated after every new chunk of a pre-specified number of datapoints. The intensity of each pixel in an RDI reflects the dissimilarity between the corresponding row and column objects. A "useful" RDI highlights potential clusters as a set of "dark blocks" along the diagonal of the image. This video demonstrates the algorithm's ability to visualize changing cluster structure in streaming data.
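How a reordered dissimilarity image arises can be sketched with the classic batch VAT reordering, which visits objects in the order a Prim-style minimum spanning tree would add them. This is only the basic batch procedure on a small in-memory matrix, not the incremental streaming algorithm described above; the function names `vat_order` and `reorder`, and the toy points, are assumptions for illustration.

```python
import math


def vat_order(D):
    """Return the VAT ordering of objects for a square dissimilarity matrix D."""
    n = len(D)
    # Start from an object that takes part in the largest dissimilarity.
    start = max(range(n), key=lambda i: max(D[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    # Smallest dissimilarity from each unvisited object to the visited set.
    nearest = {j: D[start][j] for j in remaining}
    while remaining:
        j = min(remaining, key=lambda q: (nearest[q], q))
        order.append(j)
        remaining.remove(j)
        for q in remaining:
            nearest[q] = min(nearest[q], D[j][q])
    return order


def reorder(D, order):
    """Permute rows and columns of D to obtain the reordered matrix."""
    return [[D[i][j] for j in order] for i in order]


# Two well-separated groups, deliberately interleaved in the input order.
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
D = [[math.dist(p, q) for q in points] for p in points]
order = vat_order(D)
rdi = reorder(D, order)
```

After reordering, the members of each group occupy consecutive rows and columns, so the small within-group dissimilarities form two blocks along the diagonal of `rdi`. When low values are rendered as dark pixels, these are the "dark blocks" that indicate potential clusters.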
This video shows the sequence of RDIs after every new chunk of KDD Cup'99 (simulated) data streams.
Large-Scale Trajectory Prediction
The widespread use of global positioning system (GPS) navigation systems and vehicles enabled with wireless communication technology has resulted in huge volumes of spatio-temporal data, especially in the form of trajectories. These data often contain a great deal of information, which gives rise to many location-based services (LBSs) and applications such as vehicle navigation, traffic management, and location-based recommendations. One key operation in such applications is the route prediction of moving objects. Vehicle route prediction allows certain services to improve their quality: for example, if the routes of vehicles are known in advance, intelligent transportation systems (ITSs) can provide route-specific traffic information to drivers, such as forecasting traffic conditions and routing the driver to avoid traffic jams.
Most trajectory prediction approaches in the literature use only synthetic or small to medium size real trajectory datasets because they are not scalable. The aim is to develop a framework for large-scale trajectory prediction which can be used for road networks of major cities. A scalable framework was developed for short-term and long-term trajectory prediction, based on our novel big data clustering algorithm and Markov models, which can utilize millions of trajectories in a dense road network. The developed framework was tested on two real-life, large-scale taxi trajectory datasets from the Beijing and Singapore road networks.
Rathore, P., Kumar, D., Rajasegarar, S., Palaniswami, M., Bezdek, J. C. (2019). "A Scalable Framework for Trajectory Prediction," in IEEE Transactions on Intelligent Transportation Systems.
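The Markov-model ingredient can be illustrated with a first-order chain over road segments: transition counts are learned from past trajectories, and a route is predicted by repeatedly following the most frequent next segment. This is a simplified sketch that leaves out the clustering stage of the framework; the segment IDs, the toy trajectories, and the function names `fit_transitions` and `predict_route` are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Count how often each road segment is followed by each other segment."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for current, following in zip(trajectory, trajectory[1:]):
            counts[current][following] += 1
    return counts


def predict_route(counts, start, steps):
    """Follow the most frequent transition for up to `steps` segments."""
    route = [start]
    for _ in range(steps):
        options = counts.get(route[-1])
        if not options:
            break  # no observed continuation from this segment
        # Ties are broken alphabetically so the result is deterministic.
        route.append(max(sorted(options), key=options.__getitem__))
    return route


# Hypothetical trajectories, each a sequence of road-segment IDs.
trajectories = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "E"],
    ["F", "B", "C", "D"],
    ["A", "B", "G"],
]
counts = fit_transitions(trajectories)
short_term = predict_route(counts, "A", steps=1)
long_term = predict_route(counts, "A", steps=10)
```

Here `short_term` is `["A", "B"]` and `long_term` is `["A", "B", "C", "D"]`, since the prediction stops at `D`, which has no observed continuation. Using one step corresponds to short-term prediction and several steps to long-term prediction in this toy setting.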
This figure presents the main steps of our trajectory prediction framework for large-scale trajectory data. For more details, please refer to the publication (mentioned above).
Sensor Measurements Estimation in Internet of Things
In modern IoT deployments for continuous monitoring applications, many inexpensive sensors are used along with relatively few expensive high-precision sensors to reduce deployment costs. Generally, the low-cost, low-precision sensor nodes have limited memory and processing power. Most techniques for sensor drift detection are not suitable for modern IoT deployments as they do not consider the measurement errors/uncertainties present in low-precision sensor measurements. We developed an automatic sensor drift detection and correction technique that leverages a Bayesian maximum entropy-based estimation method, which incorporates the measurement errors/uncertainties of low-precision sensors to estimate drift, together with Kalman filtering to track the estimated drift and correct the sensor measurements. We implemented the proposed technique in both centralized and distributed frameworks to facilitate in-network sensor drift detection/correction in real time. We extended this work to both smooth and abrupt drift detection using Interacting Multiple Models. The following figures present the block diagram of the developed sensor drift detection/correction method and demonstrate how it automatically detects/corrects the sensor drift (from the Docklands deployment) in a distributed manner.
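The Kalman-filtering step can be sketched as a scalar filter that tracks a slowly changing drift. The sketch assumes a random-walk drift model and takes the "observed drift" as the difference between a sensor reading and a reference estimate of the true value. In the technique above that reference comes from the Bayesian maximum entropy estimator; here it is replaced by a known constant purely for illustration. The function name `track_drift`, the noise variances, and the simulated readings are all assumptions.

```python
import random


def track_drift(observed_drift, process_var=1e-2, measurement_var=0.25):
    """Scalar Kalman filter with a random-walk drift model (assumed model)."""
    estimate, variance = 0.0, 1.0  # assumed initial drift and uncertainty
    estimates = []
    for z in observed_drift:
        variance += process_var                      # predict: drift may wander
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)            # update with the new observation
        variance *= 1.0 - gain
        estimates.append(estimate)
    return estimates


# Simulated low-precision sensor: constant true value, slow linear drift, noise.
rng = random.Random(0)
true_value = 20.0
readings = [true_value + 0.01 * k + rng.gauss(0.0, 0.5) for k in range(200)]
reference = [true_value] * len(readings)  # stand-in for the estimated true value
observed = [r - ref for r, ref in zip(readings, reference)]
drift = track_drift(observed)
corrected = [r - d for r, d in zip(readings, drift)]


def mean_abs_error(values, last=50):
    """Mean absolute deviation from the true value over the last samples."""
    return sum(abs(v - true_value) for v in values[-last:]) / last
```

Comparing `mean_abs_error(corrected)` with `mean_abs_error(readings)` shows how much of the accumulated drift is removed in this simulation. A random-walk model suits smooth drift; abrupt changes would need a different model, which is the role of the Interacting Multiple Models extension mentioned above.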
Rathore, P., Kumar, D., Rajasegarar, S., Palaniswami, M. (2017). Maximum entropy-based auto drift correction using high-and low-precision sensors. ACM Transactions on Sensor Networks (TOSN), 13(3), 24.
Rathore, P., Kumar, D., Rajasegarar, S., & Palaniswami, M. (2018, February). Bayesian maximum entropy and interacting multiple model based automatic sensor drift detection and correction in an IoT environment, in IEEE 4th World Forum on Internet of Things (WF-IoT) (pp. 598-603).
Internet of Things for Urban Microclimate Monitoring
As part of the 'Internet of Things (IoT) for Creating Smart Cities' project, the University of Melbourne, ARUP and the City of Melbourne council (CoM) carried out a pilot deployment of IoT networks in Melbourne city for monitoring environmental parameters. One of the aims of the research was to develop new systems and algorithms that can help city administrators remotely monitor, understand and interpret real-time information on urban environments. The environmental sensors, measuring light levels, humidity and temperature, were deployed at Fitzroy Gardens and the Library at the Docklands. The collected data were analyzed to better understand the impact of canopy cover on urban cooling.
A subset of data from this deployment can be downloaded from here. Please cite the following two papers when using the data:
Gubbi J., Buyya R., Marusic S., Palaniswami M. Internet of Things (IoT): A Vision, Architectural Elements, and Future Directions, Future Generation Computer Systems, Volume 29, No. 7, Pages: 1645-1660, Elsevier Science, Amsterdam, The Netherlands, 2013.
Rathore, P., Rao, A. S., Rajasegarar, S., Vanz, E., Gubbi, J., Palaniswami, M. (2017). Real-time urban microclimate analysis using Internet of Things. IEEE Internet of Things Journal, 5(2), 500-511.
Geovisualization Tools for Monitoring Real-Time Data: An interactive geovisualization framework, combining two computational intelligence methods, was developed to analyze complex patterns in the urban microclimate data obtained from our IoT deployment. This framework facilitates interactive spatio-temporal estimation and anomaly detection, producing a detailed spatio-temporal map from which unusual patterns in the environmental data can be identified. The Melbourne city council's urban forest team uses this framework for urban microclimate data analysis.
Rathore, P., Rao, A. S., Rajasegarar, S., Vanz, E., Gubbi, J., Palaniswami, M. (2017). Real-time urban microclimate analysis using Internet of Things. IEEE Internet of Things Journal, 5(2), 500-511.
The three links below provide video demonstrations of our early visualization application, created to view and analyse the real-time data collected from the above deployment.
Visualization-1: In this visualization video, information about the tree species, canopy cover percentage and the numeric values of the sensor readings (humidity, temperature and light) is displayed. This helps to relate different canopy cover and tree species to the climate parameters. Canopy coverage is also shown using a shadow of the tree that changes its width/size depending on the percentage of coverage. A sliding bar lets the user vary the date/time manually and visualize the microclimate at that instant. Click on the image to watch the video.
Visualization-2: Air temperature and relative humidity are also affected by wind speed and direction. More sensor nodes can be deployed with additional sensors for measuring wind speed and direction. This video shows the visualization of real-time wind speed and direction measurements (shown as the direction and speed of tree leaf movement), which may help to simulate the tree microclimate in more detail. Click on the image to watch the video.