Rapid developments in computer vision over the last few decades have been transforming many traditional fields in engineering and science, especially in diagnosing problems from visual images. Compared with traditional instrumentation arrays and manual inspection, using computer vision to inspect, monitor, and assess infrastructure conditions and to analyze traffic dynamics has brought significant gains in both effectiveness and efficiency. Therefore, to construct the next generation of intelligent civil and transportation infrastructure, this dissertation develops a comprehensive computer-vision-based sensing and fusion framework for structural health monitoring and intelligent transportation systems.

First, the dissertation presents a context-aware deep convolutional semantic segmentation network to effectively detect concrete cracks in structural infrastructure under various conditions. Specifically, a pixel-wise deep semantic segmentation network is applied to segment cracks in images of arbitrary size without retraining the prediction network. Moreover, a context-aware fusion algorithm that leverages local cross-state and cross-space constraints is proposed to fuse the predictions of image patches. Compared with conventional deep convolutional semantic segmentation networks, the proposed method supports training with samples of different sizes and shows promising generalization when the number of training samples is limited. In the testing phase, the proposed method improves on the state-of-the-art Boundary F1 (BF) score by an average of 2.77% across all three concrete crack datasets used in the experiments, namely the CrackForest Dataset (CFD), the Tomorrows Road Infrastructure Monitoring, Management Dataset (TRIMMD), and the Customized Field Test Dataset (CFTD).
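The idea of segmenting an image of arbitrary size from fixed-size patches can be sketched as follows. This is only an illustration under stated assumptions: the names `tile_starts`, `fuse_patch_predictions`, and `predict_patch` are hypothetical, and plain averaging of overlapping predictions is swapped in for the context-aware fusion with cross-state and cross-space constraints, whose details the abstract does not give.

```python
def tile_starts(length, patch, stride):
    """Start offsets so that fixed-size patches cover the whole axis.

    Assumes stride <= patch, so consecutive patches touch or overlap.
    """
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)  # last patch sits flush with the border
    return starts


def fuse_patch_predictions(image, predict_patch, patch=4, stride=2):
    """Average overlapping per-patch crack probabilities into a full-size map.

    `image` is a list of rows of pixel values. `predict_patch` is a
    hypothetical stand-in for the trained network: it takes a patch and
    returns a same-sized grid of crack probabilities.
    """
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for top in tile_starts(height, patch, stride):
        for left in tile_starts(width, patch, stride):
            tile = [row[left:left + patch] for row in image[top:top + patch]]
            probs = predict_patch(tile)
            for dy, prob_row in enumerate(probs):
                for dx, p in enumerate(prob_row):
                    total[top + dy][left + dx] += p
                    count[top + dy][left + dx] += 1
    # Simple averaging; the dissertation's fusion rule is more elaborate.
    return [[total[y][x] / count[y][x] for x in range(width)]
            for y in range(height)]
```

Because the patch size is fixed, the same predictor can be reused on images of any size; only the tiling changes.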

Second, this dissertation presents a hybrid inertial vision-based displacement measurement system that measures three-dimensional structural displacements of civil infrastructure using a monocular charge-coupled device camera, a stationary calibration target, and an attached tilt sensor. The system does not require the camera to be stationary during the measurements: camera movements, i.e., rotations and translations, are compensated by using a stationary calibration target in the camera's field of view. The attached tilt sensor is further used to refine the camera movement compensation and to better infer the global three-dimensional structural displacements. Specifically, the proposed system with the attached tilt sensor achieves an average Root Mean Square Error (RMSE) of 1.440 mm on in-plane structural translations and an average RMSE of 2.904 mm on out-of-plane structural translations.
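As a rough illustration of the compensation idea and of the reported error metric, the sketch below subtracts the apparent motion of the stationary target from the measured structure motion and then computes the RMSE against reference values. It is a deliberately simplified, translation-only sketch: the function names `compensate` and `rmse` are hypothetical, and the system described above also handles camera rotations and uses the tilt sensor, neither of which is modeled here.

```python
import math


def compensate(structure_track, target_track):
    """Remove camera-induced motion from the measured structure motion.

    Both arguments are lists of per-frame displacement tuples in the same
    units. Since the calibration target is stationary, any motion it appears
    to have is attributed to the camera and subtracted (translation-only
    assumption).
    """
    return [tuple(s - t for s, t in zip(s_pt, t_pt))
            for s_pt, t_pt in zip(structure_track, target_track)]


def rmse(estimates, references):
    """Root Mean Square Error between two equal-length sequences."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))
```

In this sketch, the RMSE would be computed separately per axis, for example once for in-plane and once for out-of-plane translations.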

Finally, this dissertation presents a context-aware traffic surveillance system that integrates sensor information from autonomous vehicles to improve the performance of night time vehicle detection and tracking. The sensor information is treated as low-rate context that records the distances and orientations of each autonomous vehicle relative to its neighboring vehicles. A key element of the proposed method is a vehicle pairing framework that represents vehicles based on this low-rate sensor information and the detected vehicle taillights. Experiments are conducted on real traffic videos, and the proposed system attains 0.6319 in Multiple Object Tracking Accuracy (MOTA), which represents a 26.1% increase over state-of-the-art systems.
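For readers unfamiliar with the metric, MOTA is commonly defined (in the CLEAR MOT evaluation framework) as one minus the ratio of all tracking errors to the number of ground-truth objects. The sketch below assumes that standard definition, since the abstract does not spell out the formula; the function and parameter names are illustrative.

```python
def mota(false_negatives, false_positives, id_switches, num_ground_truth):
    """Multiple Object Tracking Accuracy under the standard CLEAR MOT definition.

    All arguments are totals summed over every frame of the sequence.
    The value can be negative when errors outnumber ground-truth objects.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth
```

For example, 10 missed vehicles, 5 false detections, and 5 identity switches over 100 ground-truth vehicle instances give a MOTA of 0.8.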

As an extension of the traffic surveillance system that integrates sensor information from autonomous vehicles, this dissertation proposes the first night time framework that combines vehicle headlights and taillights to localize vehicle contours. The framework includes a novel multi-camera vehicle representation that groups and reconstructs vehicle headlights and taillights according to the mutual geometric distances between different vehicle components. The vehicle contour representation removes duplicated vehicle lights and compensates for vehicle lights missed in the detection process. Vehicle headlight alignment and contour adjustment are used to further refine the vehicle contours. The proposed multi-camera system considers typical four-wheel vehicles, e.g., cars and SUVs, and might not be able to handle large trucks. Experiments are conducted on night time traffic videos under various scenarios, and the proposed system attains an average of 0.896 in Multiple Object Tracking Accuracy (MOTA) and an average of 0.904 in Jaccard Coefficient (JC), which indicate increases of 19.2% and 15.9%, respectively, over state-of-the-art approaches.
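The Jaccard Coefficient measures the overlap between a predicted region and its ground truth as intersection over union. The sketch below assumes axis-aligned bounding boxes given as `(x1, y1, x2, y2)`; this is an illustrative simplification, as the abstract does not state how the vehicle contours are represented when the coefficient is computed.

```python
def jaccard(box_a, box_b):
    """Jaccard Coefficient (intersection over union) of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A value of 1.0 means the predicted and ground-truth boxes coincide, and 0.0 means they do not overlap.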

Comprehensive numerical experiments conducted in simulated and field environments show that the context-aware deep convolutional semantic segmentation network, the hybrid inertial vision-based structural displacement measurement system, and the intelligent night time traffic surveillance system all perform well in practical structural health monitoring and intelligent transportation applications. Although the proposed framework already achieves promising performance, placing physical calibration targets at a vantage point before the monitoring process and handling limited lighting conditions at night remain challenges.

Therefore, the future research directions for the proposed sensing and fusion framework for structural health monitoring and intelligent transportation systems at night are as follows:

  • Design a target-free three-dimensional structural displacement measurement system.
  • Design a drone-based mobile system for structural health monitoring.
  • Design an intelligent vision-based system using other types of sensors, e.g., LiDAR, Time-of-Flight sensors.

Following these directions, the framework proposed in this dissertation would be extended into an intelligent multi-modal sensing system that leverages multiple cutting-edge technologies, including computer vision, sensor fusion, robotics, and wireless communications.

Degree Date

Spring 5-15-2021

Document Type


Degree Name



Electrical and Computer Engineering


Dr. Dinesh Rajan

Subject Area

Civil Engineering, Computer Science, Electrical, Electronics Engineering


Computer Vision, Machine Learning, Structural Health Monitoring, Intelligent Transportation System, Image Processing, Video Processing, Sensor Fusion

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License