Google’s AI Research: Part 3
This article is particularly relevant in the wake of the earthquakes that recently struck Gaziantep, Turkey, and Syria. Satellite imagery has been used for disaster relief for years, going back to the 2010 earthquake in Haiti. Its impact there was limited by the lengthy, manual process required to identify and label damaged structures – it could take days before researchers could find the people who needed help the most.
Using machine learning models, we can detect changes to buildings in satellite images. In step one we take our satellite image and apply bounding boxes to locate where the boundaries of the buildings actually are. Next we compare the before and after photos.
Both images are fed through a convolutional neural network (CNN), which compares the two pictures and outputs a score between 0 and 1 reflecting how much the building has changed.
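To make the idea concrete, here is a toy numpy sketch of before/after change scoring: a hand-rolled convolution over the pixel-wise difference, pooled and squashed through a sigmoid into a 0-to-1 score. This is an illustration of the concept only, not the actual CNN Google trained; the kernel, pooling, and scoring choices here are all made up for the example.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution of a single-channel image with a small kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def change_score(before, after, kernel):
    """Toy change detector: convolve the pixel-wise difference and squash
    the pooled response into a 0-1 score with a sigmoid."""
    diff = np.abs(after - before)          # where do the two images disagree?
    features = conv2d(diff, kernel)        # one hand-rolled conv layer
    pooled = features.mean()               # global average pooling
    return 1.0 / (1.0 + np.exp(-pooled))   # sigmoid -> score in (0, 1)

rng = np.random.default_rng(0)
before = rng.random((16, 16))
after_same = before.copy()                 # undamaged: identical image
after_changed = before.copy()
after_changed[4:12, 4:12] = 0.0            # a "collapsed" block of pixels

kernel = np.ones((3, 3))                   # trivial averaging filter
print(change_score(before, after_same, kernel))     # 0.5 exactly (zero difference)
print(change_score(before, after_changed, kernel))  # pushed above 0.5
```

With identical images the difference map is all zeros, so the pooled response is 0 and the sigmoid returns exactly 0.5; any real damage pushes the score upward.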
One issue in developing the algorithm is that satellite images are not always taken by the same satellite, and are certainly not always taken under the same conditions (time of day, weather, etc.). This means the images need some standardization, and researchers have built a pipeline of transformations to turn the raw data into a uniform, usable dataset.
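The article doesn't detail that pipeline, but a minimal sketch of two plausible standardization steps might look like this: resampling every image to one fixed resolution and rescaling intensities to [0, 1]. Both steps are assumptions for illustration; a real pipeline would use georeferenced reprojection and radiometric calibration.

```python
import numpy as np

def resample(img, size):
    """Nearest-neighbour resample to a fixed size so images from different
    satellites share one resolution (hypothetical choice for this sketch)."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def normalize(img):
    """Rescale intensities to [0, 1] to smooth over lighting and exposure
    differences between acquisitions."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

def standardize(img, size=64):
    """Full (toy) pipeline: cast to float, resample, normalize."""
    return normalize(resample(img.astype(float), size))

# Two images with different shapes and intensity ranges come out uniform.
a = standardize(np.random.default_rng(1).random((100, 120)) * 255)
b = standardize(np.random.default_rng(2).random((80, 90)) * 10)
assert a.shape == b.shape == (64, 64)
```

After standardization, every before/after pair has the same shape and value range, so the comparison model never has to compensate for which satellite took the picture.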
Obviously, a lack of natural disasters to observe is a good thing for humanity, but it makes collecting a large dataset difficult. To build one, researchers had to look through previous natural disasters and collaborate with experts to manually label huge sets of images. In some cases Google crowdsources this kind of work to everyone (think of captchas that prompt you to select all squares containing a motorcycle), but for this project they only wanted experts with relevant prior experience.
Next the model has to make predictions on new data, based on what it has learned from previous events. Here we can look at a confusion matrix to get a better idea of the possible outcomes:
Four possible outcomes for the prediction here:
The AI predicts the building is destroyed, and it is destroyed: True Positive
The AI predicts the building is destroyed, but it’s not: False Positive
The AI predicts the building is not destroyed, but it is destroyed: False Negative
The AI predicts the building is not destroyed, and it’s not: True Negative
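The four outcomes above can be tallied directly from a list of predictions and true labels. This is a generic confusion-matrix count, not code from the project; `1` means destroyed and `0` means not destroyed.

```python
def confusion_counts(predicted, actual):
    """Tally the four confusion-matrix outcomes (1 = destroyed, 0 = intact)."""
    tp = fp = fn = tn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1        # predicted destroyed, actually destroyed
        elif p and not a:
            fp += 1        # predicted destroyed, actually intact
        elif not p and a:
            fn += 1        # predicted intact, actually destroyed
        else:
            tn += 1        # predicted intact, actually intact
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

preds = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(preds, truth)
print(counts)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}

# Accuracy is the fraction of correct calls (TP + TN) over all predictions.
accuracy = (counts["TP"] + counts["TN"]) / len(preds)  # 4/6
```

In a rescue setting the four cells are not equally costly: a false negative (a destroyed building marked intact) means a site that may never be searched, which is why accuracy alone isn't the whole story.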
The model runs through some data and makes a series of predictions; then the true labels (destroyed vs. not destroyed) are revealed, giving you an idea of the model's accuracy:
Note that these ROC values don't correspond to raw prediction accuracies – an ROC curve measures something different, plotting the true-positive rate against the false-positive rate as the decision threshold is varied.
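A small example makes the distinction concrete: an ROC curve is traced by sweeping the decision threshold over the model's 0-to-1 scores and recording a (false-positive rate, true-positive rate) point at each cut-off. The scores and labels below are made up for illustration.

```python
def roc_points(scores, labels):
    """Sweep the decision threshold over every distinct score and record
    (false-positive rate, true-positive rate) pairs."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thresh in sorted(set(scores), reverse=True):
        preds = [s >= thresh for s in scores]              # flag as "destroyed"
        tp = sum(p and l for p, l in zip(preds, labels))   # correctly flagged
        fp = sum(p and not l for p, l in zip(preds, labels))
        points.append((fp / neg, tp / pos))
    return points

scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]   # model's damage scores
labels = [1,   1,   0,   1,   0,   0]     # 1 = truly destroyed
print(roc_points(scores, labels))
```

A strict threshold catches fewer destroyed buildings but raises fewer false alarms; a loose one flags everything. The curve shows every trade-off at once, which is why it sits on a different scale from a single accuracy number.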
A higher "prediction" score means the building has sustained some damage, while a lower one indicates it's unscathed. Large-scale machine learning models can help disaster relief efforts, allowing search and rescue teams to better plan their rescues.
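In practice, that score becomes useful once a cut-off is chosen and buildings are ranked by it. A minimal sketch, assuming a hypothetical 0.5 threshold and made-up building records, of how scores could translate into a triage list:

```python
DAMAGE_THRESHOLD = 0.5  # hypothetical cut-off; real teams would tune this

def triage(buildings):
    """Rank buildings by predicted damage so the most likely destroyed
    structures appear first for search-and-rescue planning."""
    return sorted(buildings, key=lambda b: b["score"], reverse=True)

sites = [
    {"id": "A", "score": 0.12},
    {"id": "B", "score": 0.91},
    {"id": "C", "score": 0.55},
]
flagged = [b["id"] for b in triage(sites) if b["score"] >= DAMAGE_THRESHOLD]
print(flagged)  # ['B', 'C']
```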