A Smart City Research Testbed for Improving City Quality of Life Through City Scale Computing
Platform Pittsburgh is a collaborative effort between researchers at Carnegie Mellon University, the City of Pittsburgh, and other partners with the goal of creating a "living laboratory" for conducting smart city research and analytics. As Pittsburgh continues to adopt state-of-the-art technology, this project harnesses the power of edge computing and visual data. The goal is to lay the groundwork for a testbed on which to develop applications and produce statistics that improve quality of life. We hope to foster interdisciplinary collaborations, along with feedback from the community, to solve real-world challenges in transportation mobility and safety.
An integrated approach will be taken that uses computer vision, machine learning, and simulations to produce data for city planners, traffic engineers, and decision makers. Research will be directed towards applications that improve efficiency, health, safety, and overall quality of life, such as:
Transportation and City Dynamics
Notable "event" counting: bike near bus, near collisions, pedestrian unexpectedly entering street
Detailed statistics of human and vehicle behavior at intersections
External validation of autonomous vehicle positioning/decision making
Open parking space identification
Climate and Environmental Monitoring
Air-quality estimation from video data
Per-vehicle pollution estimation (based on identification of exhaust)
Road condition monitoring
Infrastructure to Vehicle Communication
3D localization and reconstruction
Intersection statistics and behavior
Smarter transportation (autonomous and connected vehicles)
Establish Responsible Privacy and Data Usage Policy
In addition to creating new algorithms and software for civic use, one of the outputs is to establish responsible guidelines for capture, use, and retention of urban video data.
Throughout the project, we will be working with privacy, public policy, and ethics experts at CMU and in the community to establish these guidelines.
We aim to build a scalable platform for video analytics. It will serve as a testbed for computer vision research and cloud-to-edge computing systems. The main target applications are object detection (pedestrians, vehicles, near misses, etc.) and air quality assessment (detection of pollution due to exhaust, construction, smoke, industrial output, etc.).
Each compute box contains at least one NVIDIA Tegra TX-1 or TX-2, or an Intel NUC. In some instances, both kinds of compute unit are used together as a heterogeneous system. Data can be stored locally on hard drives. Each deployment has at least one 4-megapixel or 12-megapixel Gigabit Ethernet camera with high-quality lens optics. Images are captured as raw data to maximize quality for best algorithm performance.
Real-time video ingestion and analytics are performed with Streamer, a C++ platform that supports Caffe and TensorFlow backends. A pipeline-based programming model defines complex operations on multiple camera streams. Support is provided for multiple camera operation modes.
There are currently four deployments within the City of Pittsburgh, with a total of 18 cameras. Three deployments are on Carnegie Mellon University's campus and one is at the intersection of 5th Ave and Craig Street. The cameras capture the data needed to develop algorithms for object detection and air quality assessment.
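To make the idea of a pipeline-based programming model concrete, here is a minimal Python sketch. It is not Streamer's API (Streamer is a C++ platform, and its interfaces are not described on this page); the stage functions, the `Frame` record, and the camera names are all hypothetical. It only illustrates how small processing stages can be composed once and then applied to frames from several camera streams.

```python
from typing import Callable, Dict, Iterable, List

Frame = Dict[str, object]         # hypothetical frame record: {"camera": ..., "pixels": ...}
Stage = Callable[[Frame], Frame]  # a stage maps one frame record to another


def make_pipeline(stages: List[Stage]) -> Stage:
    """Compose stages left to right into a single callable."""
    def run(frame: Frame) -> Frame:
        for stage in stages:
            frame = stage(frame)
        return frame
    return run


def downscale(frame: Frame) -> Frame:
    # Toy preprocessing: keep every second pixel of a 1-D "image".
    return {**frame, "pixels": frame["pixels"][::2]}


def count_bright(frame: Frame) -> Frame:
    # Toy analytics: count pixels brighter than a fixed threshold.
    return {**frame, "bright": sum(1 for p in frame["pixels"] if p > 128)}


def process_streams(streams: Dict[str, Iterable[List[int]]], pipeline: Stage) -> List[Frame]:
    """Apply the same pipeline to every frame of every named camera stream."""
    results = []
    for camera, frames in streams.items():
        for pixels in frames:
            results.append(pipeline({"camera": camera, "pixels": pixels}))
    return results


pipeline = make_pipeline([downscale, count_bright])
streams = {
    "cam_a": [[0, 200, 255, 10]],
    "cam_b": [[130, 0, 129, 0], [0, 0, 0, 0]],
}
results = process_streams(streams, pipeline)
```

In a real deployment the stages would be camera capture, decoding, and neural network inference rather than list slicing, but the composition pattern is the same.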
Intersections are of primary interest to this project because approximately 2.5 million accidents occur at intersections, accounting for nearly 40% of all accidents nationwide. Initial deployments will be at intersections around the Carnegie Mellon University campus in Oakland because of the high volume of vehicle, pedestrian, and bicycle traffic.
Each intersection will be instrumented with up to 8 cameras in order to capture a full view of the intersection. The number of cameras will vary according to the width of the intersection. Cameras are mounted on traffic signal poles to provide the best view of the intersection. A cabinet containing electronic equipment is mounted on a nearby pole.
Detecting vehicles is a critical step in estimating CO2 emissions. We trained deep learning models using data collected from the 5th/Craig intersection. To train the models, labeled data was provided by our partner Zensors. Zensors provides a web backend that ingests camera streams. Users submit questions to be answered about the video stream, and a cohort of people manually labels video frames to answer the questions. The labeled data is then provided through an easy-to-use interface.
The user interface provided by Zensors. Video streams are listed along with answers to questions. The interface can be expanded to view temporal data.
Initial algorithm developed for detecting moving objects within the intersection.
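This page does not say which technique the initial algorithm used. As an illustration only, one common baseline for finding moving objects is frame differencing: pixels whose brightness changes by more than a threshold between consecutive frames are marked as moving. The sketch below assumes grayscale frames stored as lists of rows; the function names and the threshold value are hypothetical.

```python
def moving_mask(prev, curr, threshold=25):
    """Mark pixels whose grayscale value changed by more than `threshold`."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) enclosing all set pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Two toy 4x4 frames: a bright object appears in the middle of the second one.
prev_frame = [[0] * 4 for _ in range(4)]
curr_frame = [row[:] for row in prev_frame]
curr_frame[0][0] = 10  # small change (sensor noise), below the threshold
for r, c in [(1, 1), (1, 2), (2, 2)]:
    curr_frame[r][c] = 200

mask = moving_mask(prev_frame, curr_frame)
box = bounding_box(mask)
```

Frame differencing is sensitive to lighting changes and camera shake, which is one reason learned detectors are typically preferred for production use.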
Detection of vehicles (green box) during the daytime.
Detection and tracking of vehicles during the nighttime.
The color of each box is unique to the vehicle it tracks.
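Giving each vehicle its own box color requires a stable identity per vehicle across frames. The tracker used here is not described on this page; the sketch below shows one simple, widely used approach, greedy matching of detections to existing tracks by intersection over union (IoU), under the assumption that boxes are `(x1, y1, x2, y2)` tuples. The function names, the IoU threshold, and the color scheme are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def update_tracks(tracks, detections, next_id, min_iou=0.3):
    """Greedily match detections to tracks; unmatched detections get new IDs.

    tracks maps track ID -> last known box. Returns (updated_tracks, next_id).
    """
    updated = {}
    unmatched = dict(tracks)
    for det in detections:
        best_id, best_score = None, min_iou
        for track_id, box in unmatched.items():
            score = iou(box, det)
            if score >= best_score:
                best_id, best_score = track_id, score
        if best_id is None:
            best_id, next_id = next_id, next_id + 1  # start a new track
        else:
            del unmatched[best_id]                   # each track matches once
        updated[best_id] = det
    return updated, next_id


def color_for(track_id):
    """Deterministic RGB color per track ID (distinct for IDs 0 to 255)."""
    return ((track_id * 97) % 256, (track_id * 57) % 256, (track_id * 17) % 256)


tracks, next_id = update_tracks({}, [(0, 0, 10, 10), (50, 50, 60, 60)], 0)
tracks, next_id = update_tracks(
    tracks, [(51, 50, 61, 60), (1, 0, 11, 10), (100, 100, 110, 110)], next_id
)
```

After the second frame, the two original vehicles keep their IDs (and therefore their colors) even though the detections arrive in a different order, and the newly appearing vehicle receives a fresh ID.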
The generality of the detection algorithms was tested with street cams in Jackson Hole, Wyoming.
Note that there are many occlusions from this camera angle, yet
cars are still detected accurately.
In addition to being detected, vehicles are tracked and their direction of travel is estimated.
The travel direction with respect to the camera is denoted in the upper
right corner of the video as a percentage of total vehicles.
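As a rough illustration of how such a direction summary could be computed, the sketch below classifies each vehicle track by its net displacement in the image and reports each direction as a percentage of all tracked vehicles. The track format (a list of `(x, y)` box centers in image coordinates) and the four direction labels are assumptions made for this example, not details taken from the system shown in the video.

```python
from collections import Counter


def travel_direction(track):
    """Classify a track [(x, y), ...] by its net displacement in the image."""
    (x0, y0), (x1, y1) = track[0], track[-1]
    dx, dy = x1 - x0, y1 - y0
    if abs(dx) >= abs(dy):
        return "right" if dx >= 0 else "left"
    return "down" if dy > 0 else "up"  # image y grows downward


def direction_percentages(tracks):
    """Return {direction: percentage of all tracks} for the given tracks."""
    counts = Counter(travel_direction(t) for t in tracks)
    total = sum(counts.values())
    return {direction: 100.0 * n / total for direction, n in counts.items()}


tracks = [
    [(0, 0), (10, 1)],     # mostly rightward
    [(5, 5), (20, 4)],     # mostly rightward
    [(10, 10), (0, 10)],   # leftward
    [(3, 0), (3, 9)],      # downward in the image
]
summary = direction_percentages(tracks)
```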
Fast and accurate 3D reconstruction of multiple dynamic rigid objects (e.g., vehicles) observed from wide-baseline, uncalibrated, and unsynchronized cameras is challenging. On one hand, feature tracking works well within each view, but features are hard to match across multiple cameras because of occlusions or limited overlap in fields of view. On the other hand, advances in deep learning have resulted in strong detectors that work across different viewpoints but are still not precise enough for triangulation-based reconstruction. In this work, we develop a framework that fuses single-view feature tracks with multiview detected part locations to significantly improve the detection, localization, and reconstruction of moving vehicles, even in the presence of strong occlusions. We demonstrate our framework at a busy traffic intersection by reconstructing over 40 vehicles passing within a 3-minute window. We evaluate the different components within our framework and compare them to alternative approaches such as reconstruction using tracking-by-detection.
Using vehicle detection and emission models, air quality can be assessed from multiple video sources. The videos below show example captured videos and the 3D trajectories of detected vehicles.
Reconstruction at the 5th Ave and Craig Street intersection
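For readers unfamiliar with triangulation, the sketch below shows the basic two-view step in its simplest form, the midpoint method: given two camera centers and the viewing rays through a matched image point, the 3D point is estimated as the midpoint of the shortest segment joining the rays. This is not the framework described above. It assumes calibrated cameras with known centers and ray directions, which the system described here does not require, and all names are hypothetical. It also shows why imprecise detections are a problem: small errors in the ray directions shift the recovered point.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(c1, d1, c2, d2):
    """Return the point midway between the closest points of two rays.

    c1, c2 are camera centers and d1, d2 are ray directions through the
    matched feature, all 3-tuples in a shared world frame (assumed known).
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))


# Two cameras 4 units apart, both looking at the point (2, 0, 5).
point = triangulate_midpoint((0, 0, 0), (2, 0, 5), (4, 0, 0), (-2, 0, 5))

# Rays that do not intersect: the estimate falls halfway between them.
skew = triangulate_midpoint((0, 0, 0), (1, 0, 0), (0, 1, 1), (0, 1, 0))
```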
Traffic Analysis Software
To make the platform easy for researchers to use, we developed software that runs on a distributed system and performs computer vision based traffic analysis through an easy-to-control web interface. The front end allows a user to 1) record video from a selected camera, 2) play the video, 3) select an algorithm to execute, and 4) view the results of processing. The back end is a modular framework that manages user interaction, video recording, display, processing, and progress tracking.
Computer vision algorithms were developed from scratch in C so that they could be optimized for the hardware and software stack. The algorithms that were developed include:
Contour detection and Clustering
3D pose estimation of vehicle
Frequently Asked Questions
What algorithms are currently running?
At this time, the cameras are not always on. They are being turned on periodically to develop and test system software that distributes computer vision algorithms across servers located at CMU to processing nodes next to the cameras, as well as to test computer vision techniques for learning robust, camera-viewpoint specific pedestrian and vehicle detectors. We are running no analyses that attempt to recover personally identifiable patterns of individuals captured in the video.
Is audio being recorded?
Absolutely not. No audio will ever be recorded by the CMU Urban Video Analytics prototype. It is illegal to record audio in the state of Pennsylvania without consent from all parties.
Where does the data go, what is your data management policy?
Right now the project is in its initial hardware/software development stages. We are not constantly recording data. Data that is recorded by the cameras is stored on secure CMU servers managed by the PIs. We are working to establish a clear data-management policy that will be in place and documented in detail on this web site prior to commencing larger-scale, continuous capture deployments in the city blocks around CMU.
What types of analyses will be performed on the acquired video feeds?
The video streams captured by the cameras will be used to develop applications and systems that support urban computing concerns.
How will privacy be maintained?
Only de-identified data will be made publicly available. For example, faces and license plates will be blurred. Most data will be stored for short periods of time on secure servers. There will not be any attempt to identify people, vehicles, etc.
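As a simple illustration of what blurring a region involves, the sketch below applies a box blur inside a rectangle of a grayscale image, leaving the rest of the image untouched. It is an example only: the image format (a list of rows of integers), the function name, and the blur radius are assumptions, and the rectangle is supplied by hand here, whereas a real system would also need a detector to locate the faces and license plates.

```python
def blur_region(image, top, left, bottom, right, radius=1):
    """Return a copy of `image` with a box blur applied inside a rectangle.

    The rectangle covers rows top..bottom and columns left..right, inclusive.
    Each pixel inside it becomes the integer mean of its neighborhood in the
    original image, clipped at the image borders.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the input image unmodified
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            r0, r1 = max(0, r - radius), min(height - 1, r + radius)
            c0, c1 = max(0, c - radius), min(width - 1, c + radius)
            window = [
                image[rr][cc]
                for rr in range(r0, r1 + 1)
                for cc in range(c0, c1 + 1)
            ]
            out[r][c] = sum(window) // len(window)
    return out


# A 3x3 image with a single bright pixel standing in for identifying detail.
image = [
    [0, 0, 0],
    [0, 90, 0],
    [0, 0, 0],
]
blurred = blur_region(image, top=1, left=1, bottom=1, right=1)
```

A larger radius, or replacing the region with a solid fill, removes more detail; the choice depends on how strong the de-identification needs to be.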
City of Pittsburgh (Department of Mobility and Infrastructure)
Pennsylvania Department of Transportation
National Science Foundation
University Transportation Center
For further information or questions, contact us at platform-pgh _AT_ cs.cmu.edu (replace _AT_ with @).