I’ve noticed that many of the broad-scale analyses I run in Google Earth Engine lead to error messages like “Too many pixels in the region” or “Computation timed out.” For quick visualization, the usual workaround is to set a scaling factor that keeps computing requirements on Google’s end from going nuts. That means that when we look at graphs through ui.Chart.image, everything is analyzed on a scaled-down version of the original raster product.
So, how do we get Google to process and export large RGB datasets without running into scaling issues along the way? In this case, less is more, every time. ui.Chart and Map.addLayer() are where we run into problems. They’re great functions for checking your progress along the way, but once you know you have what you want, get rid of every step that doesn’t feed directly into the data pipeline of finding imagery, processing imagery, and exporting imagery. Even when mapping steps don’t cause errors, they take a long time for large datasets.
Perhaps the single largest consideration you’ll make any time you work with raster datasets is the resolution of your data. In this case, we’re making plots for presentations. A 4k monitor or projector is 3840 pixels wide by 2160 pixels tall, so there’s really no reason to use data at a finer scale than this if we’re only making a map to give our audience some context. That means that, throughout the data pipeline below, I keep the pixel count around 10 million (3840x2160 is just over 8 million pixels). Don’t bother downloading higher-resolution data than that for your ROI if it’s going to end up in a slideshow - but note that for poster printing you may want finer data.
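To make that pixel budget concrete, here’s a back-of-the-envelope calculation in plain JavaScript (this is not Earth Engine code, and the ROI dimensions are illustrative, not measured):

```javascript
// Estimate how many pixels an export will contain at a given scale.
// widthMeters / heightMeters: rough bounding-box size of the ROI;
// scale: export resolution in meters per pixel.
function estimatePixels(widthMeters, heightMeters, scale) {
  return Math.ceil(widthMeters / scale) * Math.ceil(heightMeters / scale);
}

// An illustrative ~60 km x 40 km bounding box:
var atNative = estimatePixels(60000, 40000, 30); // Landsat's native 30 m
var atCoarse = estimatePixels(60000, 40000, 90); // a coarser 90 m export

console.log(atNative); // 2668000
console.log(atCoarse); // 296815
```

Coarsening the scale by 3x cuts the pixel count by roughly 9x, which is why picking the scale up front matters more than any other export setting.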
Let’s take Yolo County as an example. If I want to use a high-definition image of Yolo County in a presentation (and I do), the workflow should include: defining the region of interest, masking out clouds and water, reducing the image collection to a single composite, exporting the RGB image at an appropriate scale, and plotting the result in R.
Disclaimer: You need to be a registered Google Earth Engine user to do this. The request system is, in my experience, very quick to respond, but be aware that registration may take a few days.
Note: All code for this workflow can be found pre-compiled (and data pre-downloaded) at this link. Here’s a link to the full script in the Earth Engine code editor.
So, to take things step-by-step: first, we’ll be using JavaScript. It was new to me when I started using Earth Engine, and I found Google’s tutorials super helpful for navigating a new language. Begin by declaring the region of interest (in this case, Yolo County).
ROI definition
// This ROI is for Yolo county.
var ROI = ee.FeatureCollection("TIGER/2018/Counties")
  .filter(ee.Filter.eq("NAME", "Yolo"));
// Choose the mapping space
Map.centerObject(ROI, 9);
Map.addLayer(ROI, {}, 'Region of interest');
Then, define functions to identify clouds and to calculate the normalized difference water index (NDWI), which will be used as a mask.
Function definitions
// Function to mask clouds using the pixel quality band of Landsat 8.
// See https://www.usgs.gov/land-resources/nli/landsat/landsat-collection-1-level-1-quality-assessment-band
var maskClouds = function(image) {
  var qa = image.select('pixel_qa');
  // Check that the cloud-shadow (bit 3) and cloud (bit 5) bits are off.
  var mask1 = qa.bitwiseAnd(1 << 3).eq(0);
  var mask2 = qa.bitwiseAnd(1 << 5).eq(0);
  return image.updateMask(mask1).updateMask(mask2);
};
// This function adds an NDWI (water index) band to a Landsat 8 image.
var addNDWI = function(image) {
  return image
    .addBands(image
      .normalizedDifference(['B3', 'B5'])
      .rename('NDWI'));
};
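As a sanity check on the bitwise logic above, here is the same test sketched in plain JavaScript (not Earth Engine code). In the Collection 1 pixel_qa band, bit 3 flags cloud shadow and bit 5 flags cloud:

```javascript
// qa: integer value of a pixel_qa pixel; bit: bit position to test.
function bitIsSet(qa, bit) {
  return (qa & (1 << bit)) !== 0;
}

// Mirrors maskClouds: keep the pixel only if both bits are off.
function keepPixel(qa) {
  return !bitIsSet(qa, 3) && !bitIsSet(qa, 5);
}

console.log(keepPixel(0));      // true: clear pixel
console.log(keepPixel(1 << 5)); // false: cloud bit set
console.log(keepPixel(1 << 3)); // false: cloud-shadow bit set
```

Earth Engine’s bitwiseAnd(1 << n).eq(0) is doing exactly this test, just across every pixel of the image at once.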
Now, we can move on to accessing the data we want. Here, I map the functions from up above to the image collection, removing data outside the region of interest, masking out cloudy images, and creating an NDWI band for each image.
Data acquisition
// Load the Landsat 8 surface reflectance collection for 2020,
// restricted to the ROI, with clouds masked and NDWI added.
var raw = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')
  .filterDate("2020-01-01", "2020-12-31")
  .filterBounds(ROI)
  .map(maskClouds)
  .map(addNDWI);
// Print info about the collection to the console
print("Raw image collection", raw);
// Add raw ImageCollection to mapping panel
// Note that plotting an ImageCollection only shows the
// first image from the collection at each location.
Map.addLayer(raw,
  {bands: ['B4', 'B3', 'B2'], min: 0, max: 2000},
  'Raw image collection',
  false);
Calculate the target pixel value and mask out regions classified as water. Here, I want the median value of each pixel across the time series, producing a median composite of the region. Earth Engine has a good tutorial on creating a greenest-pixel composite, which also requires an NDVI band, but I don’t want a greenest-pixel composite so I won’t be using it here. For more on compositing, check out this link.
Data wrangling
//// Image reduction (temporal)
//// Calculate median value across time series within ROI
var medImg = raw.median();
// Print to console
print("Median image", medImg);
// Visualize in mapping panel
Map.addLayer(medImg, {bands: ['B4', 'B3', 'B2'], min: 0, max: 2000}, 'Median raw image', false);
//// Image masking (spatial)
var medImgClip = medImg.clip(ROI);
//// Select and visualize water mask
var ndwi = medImgClip.select("NDWI");
// Define viz parameters
var ndwiViz = {min: -1, max: 1, palette: ['00FFFF', '0000FF']};
var rgbViz = {bands: ['B4', 'B3', 'B2'], min: 0, max: 2000};
// Visualize NDWI
Map.addLayer(ndwi, ndwiViz, 'NDWI', false);
// Generate a binary land/water mask: 1 where NDWI <= 0 (land), 0 where water
var waterMask = ndwi.lte(0);
print(waterMask);
// Remove areas from RGB image that are "watery"
var landOnly = medImgClip.updateMask(waterMask);
Map.addLayer(landOnly, rgbViz, 'RGB, Land only', false);
// Remove areas from NDWI image that are land
var waterOnly = ndwi.updateMask(ndwi.gte(0));
Map.addLayer(waterOnly, ndwiViz, 'Water mask', false);
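For intuition, here is what median() and the NDWI threshold are doing per pixel, sketched in plain JavaScript (not Earth Engine code; the reflectance values are made up for illustration):

```javascript
// raw.median() reduces the time series: for each pixel, take its
// value in every unmasked image and return the median.
function pixelMedian(values) {
  var sorted = values.slice().sort(function(a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// An undetected cloud shows up as a bright outlier; the median ignores it.
console.log(pixelMedian([420, 7800, 390, 450, 405])); // 420

// NDWI = (green - NIR) / (green + NIR); land is typically <= 0.
function ndwi(green, nir) {
  return (green - nir) / (green + nir);
}
console.log(ndwi(800, 2400) <= 0); // true: this pixel survives the land mask
```

This is why the median composite works so well for cloudy regions: as long as a pixel is clear in more than half the scenes, the composite value comes from the clear observations.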
Finally, create an RGB composite of the image and export it to Google Drive. It’s going to have a large file size and consequently it’ll take a long time to download, so grab a good book.
Visualization and export
// Apply visualization parameters to imagery
var RGB_masked = landOnly.visualize(rgbViz);
var RGB_roi = medImgClip.visualize(rgbViz);
Map.addLayer(RGB_roi, {}, 'RGB including water', false);
// Export the image, specifying scale and region.
// For a SMALL ROI, you can set the scale as fine as the
// product's native resolution, but DO NOT USE THAT SCALE
// FOR THE WHOLE STATE OF CA - IT WILL BE TOO MUCH, MAN!
// As long as this is for data viz and not extraction:
// a 4k display is 3840x2160 pixels (8,294,400 total), so
// there is no reason to set maxPixels on the export > 1e8.
Export.image.toDrive({
  image: RGB_roi,
  description: 'Yolo_LC08_RGB_2020_SR',
  folder: 'mapTime',
  scale: 90,
  region: ROI,
  maxPixels: 1e8
});
// Export the polygon
Export.table.toDrive({
  collection: ROI,
  description: 'Yolo_ROI',
  folder: 'mapTime',
  fileFormat: "SHP"
});
Confirm the task and create a directory to export the raster image. Depending on the size of your image, Earth Engine may break it up into tiles, but don’t worry: they’re georeferenced, so you can mosaic everything together on your own computer quite easily.
Eventually, the task will be marked completed. If you’ve made it this far, there should be a file (or series of files) in the new subdirectory you just created on Google Drive. Download the entire subdirectory, and proceed with final raster processing. To do this in R, load the relevant libraries, load relevant shapefiles, choose an appropriate coordinate window, and plot the data. For US state borders, I used the USAboundaries dataset. Raster functions in R aren’t super efficient, so one of these days I may try to put a similar approach together for Python, but for the time being the R approach works well enough for my needs.
Plot and save with R and ggplot
## Curious about TOA vs SR? Check out this post for more:
# https://gis.stackexchange.com/questions/304180/what-are-the-min-and-max-values-of-map-addlayer-on-google-earth-engine
#### Import packages ----
library(tidyverse)
library(raster)
library(sf)
library(RStoolbox)
library(cowplot)
#### Import data ----
## Raster
eeImg = raster::stack("data/Yolo_LC08_RGB_2020_SR.tif")
eeImg
## Vector
## Yolo County
ROI = sf::read_sf("data/Yolo_ROI.shp") %>%
st_union()
ROI
## CA State
CA = USAboundaries::us_states() %>%
st_as_sf() %>%
filter(name == "California")
CA
#### Data wrangling ----
## Raster
# Convert to data.frame
# Recall that we set maxPixels in the Earth Engine export to 1e8;
# that's the maximum value you should consider using here as well.
# If your image is only going to fill half the screen, you can use
# half the maxpixels.
imgFortFull = fortify(eeImg, maxpixels = 1e8)
imgFort = imgFortFull %>%
rename("R" = vis.red,
"G" = vis.green,
"B" = vis.blue) %>%
filter(R + G + B != 0) %>%
mutate(Rsc = scales::rescale(R, to = c(0,1), from = c(0,255)),
Gsc = scales::rescale(G, to = c(0,1), from = c(0,255)),
Bsc = scales::rescale(B, to = c(0,1), from = c(0,255)))
#### Plots ----
## Vector context plot
vectorPlot = ggplot() +
geom_sf(data = CA,
fill = "grey90") +
geom_sf(data = ROI,
fill = "grey60") +
theme_void() +
theme(panel.background = element_rect(fill = "transparent",
colour = "transparent"),
plot.background = element_rect(fill = "transparent",
colour = "transparent"))
# vectorPlot
## RGB plot
rgbPlot = ggplot() +
geom_tile(data = imgFort,
aes(x = x, y = y,
fill = rgb(Rsc,Gsc,Bsc))) +
scale_fill_identity() +
geom_sf(data = ROI,
fill = "transparent",
lwd = 0.75,
col = "grey40") +
theme_void() +
theme(panel.background = element_rect(fill = "transparent",
colour = "transparent"),
plot.background = element_rect(fill = "transparent",
colour = "transparent"))
# rgbPlot
#### Combine contextual vector and RGB imagery to a single plot with inset
## Generate a "complete" plot that includes contextual vector data
completePlot = ggdraw() +
draw_plot(rgbPlot) +
draw_plot(vectorPlot,
x = 0.075, y = 0.075,
width = 0.4, height = 0.5)
# completePlot
#### Save plots ----
## Give some thought to the output metadata:
# A 4k display is 3840 x 2160 pixels, so there's really no need for
# presentation images to be more than 3840 pixels wide or 2160 tall.
# Let's do some math: if a figure will take up half of a ppt slide,
# we can happily take it to 1920 horizontal pixels. In that case we
# can make the final image 7x7 inches at 300 pixels per inch ("dpi")
# and lose no more information than we would at 4k viewing anyway
# (final dimensions 2100 x 2100 pixels).
ggsave(filename = "plots/sierraRGB_SR.jpg",
rgbPlot, bg = "transparent",
height = 7, width = 7, units = "in", dpi = 300)
ggsave(filename = "plots/sierraContext_SR.jpg",
completePlot, bg = "transparent",
height = 7, width = 7, units = "in", dpi = 300)
And finally you should end up with an image that looks something like the one below. Recall that a 4k display has 3840x2160 pixels; there’s no reason to shoot for more detail than that if this plot is for a presentation. A 7"x7" image at 300 pixels per inch ends up being 2100x2100 pixels, a nice size for a half-slide image. Saved as a jpg, the final file size comes out to about 816 kB.
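The sizing arithmetic is easy to double-check in plain JavaScript:

```javascript
// inches x pixels-per-inch = pixels
var outWidth = 7 * 300;        // 2100
var outHeight = 7 * 300;       // 2100
var fourKPixels = 3840 * 2160; // 8294400

console.log(outWidth + " x " + outHeight);        // 2100 x 2100
console.log(outWidth * outHeight <= fourKPixels); // true: no wasted detail
```

So a 7x7 inch, 300 dpi figure sits comfortably under the 4k pixel budget, which is exactly the point of choosing those export and save settings up front.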