
Geographic Analysis with R: GeoTIFF Conversion & NDVI CSV Processing
1. Introduction - code overview
This web application provides a comprehensive guide and tools for Geographic Information System (GIS) analysis using R programming. It focuses on processing NDVI (Normalized Difference Vegetation Index) data derived from satellite imagery, specifically for the Luhansk region. Users can utilize the R code snippets provided to convert, process, and visualize the data. The application supports generating graphs in both colorful and black-and-white formats, making it ideal for research and academic publication.
The workflow is divided into sections, allowing users to follow step-by-step instructions for achieving their desired outputs. Depending on the user's needs, they can choose between two main workflows:
Colorful Graphs Workflow: Use code from Chapters 2, 3, and 4 sequentially. This workflow generates colorful graphs suitable for digital presentations and displays. It provides a detailed step-by-step guide, displaying code in parts within Chapters 3 and 4, along with detailed documentation and descriptions for each part. This allows users to better understand and customize their workflow, though it may take more time to traverse multiple sections and combine snippets.
Black-and-White Graphs Workflow: Use code from Chapters 2, 5, and 6. This workflow creates black-and-white graphs optimized for printing in academic articles. Chapters 5 and 6 provide singular, streamlined code snippets without additional explanations, making this workflow faster and more practical for most users. It avoids the need to connect multiple parts and works out-of-the-box for generating monochromatic graphs.
Note that Chapter 2, which converts GeoTIFF files into CSV files, is a necessary step shared by both workflows. By following the appropriate chapters in order, users can process NDVI data and generate visualizations suited to either digital presentation or print publication.
2. Conversion of GeoTIFF Files to CSV
This guide demonstrates the conversion of GeoTIFF satellite imagery files into CSV format using the R programming language. The process is particularly useful for researchers working in GIS or remote sensing who need to preprocess data for statistical analysis or predictive modeling. The example uses NDVI (Normalized Difference Vegetation Index) images from the Luhansk region, extracted from QGIS.
The script uses the raster package in R to read GeoTIFF files and write their pixel values to CSV. CSV files are easier to work with in analysis tools such as R or Python and as input to machine learning models. The raster package must be available; the script installs it if it is missing.
Check for and install the "raster" package: The script ensures that the required package is installed before proceeding.
Specify input files: GeoTIFF files for NDVI images from different dates are listed.
Conversion function: A function reads the raster data, converts it into a matrix, and saves it as a CSV file.
Batch processing: The function is applied to multiple files to automate the conversion process.
The output files will be saved in the same directory as the input GeoTIFF files, with the .csv extension replacing .tif. This workflow simplifies data preprocessing for further spatial-temporal analysis or predictive modeling.
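The four steps above can be sketched roughly as follows. This is a minimal illustration, not the original script: the file names in tif_files and the function name convert_tif_to_csv are assumptions, and only the first band of each GeoTIFF is read.

```r
# Install the raster package only if it is missing, then load it.
if (!requireNamespace("raster", quietly = TRUE)) {
  install.packages("raster")
}
library(raster)

# Hypothetical input files; replace with your own GeoTIFF paths.
tif_files <- c("NDVI_2020-07-15.tif", "NDVI_2021-07-15.tif")

# Read one GeoTIFF, convert it to a matrix, and write the matrix as CSV.
# The CSV is saved next to the input file, with .csv replacing .tif.
convert_tif_to_csv <- function(tif_path) {
  r <- raster(tif_path)        # first band of the GeoTIFF
  m <- as.matrix(r)            # one matrix cell per pixel
  csv_path <- sub("\\.tif$", ".csv", tif_path, ignore.case = TRUE)
  write.csv(m, csv_path, row.names = FALSE)
  csv_path
}

# Batch conversion; files that do not exist are skipped with a message.
for (f in tif_files) {
  if (file.exists(f)) {
    message("Wrote ", convert_tif_to_csv(f))
  } else {
    message("Skipping missing file: ", f)
  }
}
```

Each CSV row corresponds to one raster row, so the spatial layout of the pixels is preserved, but the georeferencing (extent and coordinate system) is not carried over into the CSV.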
3. Using RStudio to generate separate graphs for each year
3.1 NDVI Data Import – Processing of CSV Files
The first step in analyzing NDVI data involves preparing your R environment and loading the necessary libraries. This snippet ensures the required packages, dplyr and ggplot2, are installed and loaded. These libraries are essential for data manipulation and visualization tasks.
The snippet also defines the list of CSV file paths, corresponding to the NDVI data from different dates, and specifies an output directory where processed results will be saved. Each file contains numerical NDVI values for the Luhansk region that are used in subsequent steps for analysis and visualization.
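A setup step of this kind might look like the sketch below. The CSV file names and the output directory name ndvi_output are placeholders chosen for illustration, not the paths used in the original snippet.

```r
# Install any missing packages, then load them.
required_packages <- c("dplyr", "ggplot2")
for (pkg in required_packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

# Hypothetical CSV paths produced by the GeoTIFF conversion step.
csv_files <- c("NDVI_2020-07-15.csv", "NDVI_2021-07-15.csv")

# Hypothetical output directory; it is created if it does not exist yet.
output_dir <- "ndvi_output"
dir.create(output_dir, showWarnings = FALSE, recursive = TRUE)
```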
3.2 Data Transformation – Preparing NDVI Data for Analysis
This snippet defines the process_file function, which processes individual CSV files containing NDVI values. The function performs the following steps:
File name extraction and date formatting: Extracts the date from the file name and formats it for use in visualizations and results.
Data reading and cleaning: Reads NDVI values from the CSV file and removes invalid or missing values.
Grouping and summarization: Groups NDVI values into predefined ranges and calculates summary statistics, including the count of pixels, area in square meters, and scaled value averages.
Output generation: Saves the summarized data to a new CSV file for further analysis or visualization.
The process_file function is applied to all CSV files, automating the data transformation process and ensuring consistency across the dataset.
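One possible shape for such a function is sketched below. Several details are assumptions made for the example: the file name is expected to contain a date in YYYY-MM-DD form, ranges are 0.1 wide, each pixel covers 100 square meters (a 10 m x 10 m pixel), and the "average" is taken to be the mean NDVI per range. Adjust these to match your own data.

```r
library(dplyr)

process_file <- function(csv_path, output_dir, pixel_area_m2 = 100) {
  # Extract a YYYY-MM-DD date from the file name (assumed naming scheme).
  file_name <- basename(csv_path)
  date_str <- regmatches(file_name,
                         regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", file_name))
  if (length(date_str) == 0) stop("no YYYY-MM-DD date in file name: ", file_name)
  date_label <- format(as.Date(date_str), "%d.%m.%Y")

  # Read the pixel matrix and flatten it to a numeric vector.
  values <- as.numeric(unlist(read.csv(csv_path), use.names = FALSE))
  # Keep only finite values inside the valid NDVI interval [-1, 1].
  values <- values[is.finite(values) & values >= -1 & values <= 1]

  # Group into 0.1-wide ranges (labelled by lower bound) and summarise.
  summary_df <- data.frame(ndvi = values) %>%
    mutate(range_start = floor(ndvi * 10) / 10) %>%
    group_by(range_start) %>%
    summarise(pixel_count = n(),
              area_m2 = n() * pixel_area_m2,
              mean_ndvi = mean(ndvi),
              .groups = "drop") %>%
    mutate(date = date_label)

  # Save the summary next to the other outputs.
  out_name <- paste0(tools::file_path_sans_ext(file_name), "_summary.csv")
  write.csv(summary_df, file.path(output_dir, out_name), row.names = FALSE)
  summary_df
}

# Small demonstration with a temporary file containing six pixel values,
# two of which (NA and 2) are invalid and get dropped.
demo_csv <- file.path(tempdir(), "NDVI_2021-07-15.csv")
write.csv(data.frame(V1 = c(0.15, 0.12, NA), V2 = c(0.55, -0.25, 2)),
          demo_csv, row.names = FALSE)
print(process_file(demo_csv, tempdir()))
```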
3.3 Bar Plot Generation – Visualizing NDVI Data
This section demonstrates how to generate a bar plot using the ggplot2 library in R. The bar plot represents the area (in square meters) for various NDVI value ranges. Below are the steps and details of what the code accomplishes:
Data Mapping: Maps NDVI ranges to the x-axis and calculates the corresponding area (in square meters) for the y-axis.
Customization: Adjusts axis labels, tick marks, grid lines, and plot title for clarity and aesthetics.
Saving Output: Saves the bar plot as a PNG file in the specified output directory.
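A bar plot along these lines could be produced as in the following sketch. The data frame summary_df, its column names, the colors, and the output file name are illustrative assumptions; the plot is written to a temporary directory here so the example runs anywhere.

```r
library(ggplot2)

# Hypothetical summary table: one row per NDVI range (lower bound) with its area.
summary_df <- data.frame(
  range_start = c(-0.1, 0.0, 0.1, 0.2, 0.3),
  area_m2     = c(1200, 5400, 9800, 7600, 3100)
)

bar_plot <- ggplot(summary_df, aes(x = range_start, y = area_m2)) +
  geom_col(fill = "forestgreen", width = 0.08) +
  scale_x_continuous(breaks = seq(-1, 1, by = 0.1)) +
  labs(title = "Area by NDVI range",
       x = "NDVI range (lower bound)",
       y = "Area (square meters)") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank(),
        axis.text.x = element_text(angle = 45, hjust = 1))

# Save as PNG; replace tempdir() with your own output directory.
bar_file <- file.path(tempdir(), "ndvi_bar_plot.png")
ggsave(bar_file, bar_plot, width = 8, height = 5, dpi = 300)
```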
3.4 Line Plot Generation – NDVI Trends Over Ranges
This section explains how to create a line plot to visualize trends in NDVI values and their corresponding areas. Using the ggplot2 library, the code achieves the following:
Line and Points: Plots NDVI ranges on the x-axis and their associated area (in square meters) on the y-axis, using a red line and point markers for emphasis.
Enhanced Readability: Includes custom axis labels, grid styles, and plot titles to improve visual interpretation.
Output Handling: Saves the resulting plot as a PNG file in the output directory for further use.
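The line plot can be sketched in the same way. As before, summary_df, its columns, and the output file name are assumptions made for the example.

```r
library(ggplot2)

# Hypothetical summary table: one row per NDVI range (lower bound) with its area.
summary_df <- data.frame(
  range_start = c(-0.1, 0.0, 0.1, 0.2, 0.3),
  area_m2     = c(1200, 5400, 9800, 7600, 3100)
)

line_plot <- ggplot(summary_df, aes(x = range_start, y = area_m2)) +
  geom_line(color = "red") +               # trend across NDVI ranges
  geom_point(color = "red", size = 2) +    # one marker per range
  scale_x_continuous(breaks = seq(-1, 1, by = 0.1)) +
  labs(title = "Area across NDVI ranges",
       x = "NDVI range (lower bound)",
       y = "Area (square meters)") +
  theme_minimal() +
  theme(panel.grid.minor = element_blank())

# Save as PNG; replace tempdir() with your own output directory.
line_file <- file.path(tempdir(), "ndvi_line_plot.png")
ggsave(line_file, line_plot, width = 8, height = 5, dpi = 300)
```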
3.5 Batch Processing Completion – Finalizing Workflow
This section completes the NDVI analysis by processing multiple CSV files in a batch. The loop ensures consistency and automation across all files. Key actions performed by the code include:
Iterative Processing: Iterates through a predefined list of file paths and applies the process_file function to each file.
Error Handling: Includes error-catching mechanisms to handle issues such as missing files or unexpected data formats.
Completion Logging: Logs the processing completion status for each file, providing a clear record of the workflow.
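A loop with this behavior might be written as below. To keep the example self-contained, process_file is replaced here by a simple stand-in that only reads the file; the file names are hypothetical, so running the sketch as-is logs a failure for each of them.

```r
# Stand-in for the real process_file(); it only reads the CSV
# and raises an error when the file is missing.
process_file <- function(csv_path) {
  if (!file.exists(csv_path)) stop("file not found: ", csv_path)
  read.csv(csv_path)
}

# Hypothetical list of input files.
csv_files <- c("NDVI_2020-07-15.csv", "NDVI_2021-07-15.csv")

# Process each file; an error in one file does not stop the loop.
status <- character(length(csv_files))
names(status) <- csv_files
for (f in csv_files) {
  status[f] <- tryCatch({
    process_file(f)
    "ok"
  }, error = function(e) {
    paste("failed:", conditionMessage(e))
  })
  message(f, " -> ", status[f])   # completion log for this file
}
```

Wrapping each call in tryCatch means a single unreadable file is recorded in status rather than aborting the whole batch.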
4. Using RStudio to generate graphs comparing plots over the years
4.1 Setting Up Environment – Checking Packages and Preparing Data
This snippet ensures that the required packages are installed and available for use. It also defines file paths and processes the input data.
Package Management: Checks if the dplyr, ggplot2, and readr libraries are installed. If not, it installs them and loads them into the R environment.
File Paths Definition: Specifies the input CSV files containing NDVI data and sets the output directory for saving processed files and plots.
Data Processing Loop: Iterates through each file path to:
- Extract and format the date from the file name.
- Read the CSV file and handle any errors gracefully.
- Mutate the data to add a formatted date and NDVI range values.
- Combine the processed data into a single data frame.
Filtering Data: Removes rows where the NDVI range is less than -0.1 to focus on valid data.
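The setup, loop, and filter described above could look roughly like this. The sketch assumes the inputs are per-date summary files with columns range_start and area_m2 and a YYYY-MM-DD date in the file name; it writes two tiny demo files to a temporary directory so that it can run on its own.

```r
for (pkg in c("dplyr", "ggplot2", "readr")) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
  library(pkg, character.only = TRUE)
}

# Demo input: two small summary files (hypothetical names and values).
demo_dir <- tempdir()
file_paths <- file.path(demo_dir, c("NDVI_2020-07-15_summary.csv",
                                    "NDVI_2021-07-15_summary.csv"))
write_csv(data.frame(range_start = c(-0.3, 0.1, 0.5),
                     area_m2 = c(100, 200, 300)), file_paths[1])
write_csv(data.frame(range_start = c(-0.2, 0.1, 0.6),
                     area_m2 = c(150, 250, 350)), file_paths[2])

combined <- data.frame()
for (path in file_paths) {
  # Extract and format the date from the file name.
  file_name <- basename(path)
  date_str <- regmatches(file_name,
                         regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", file_name))

  # Read the file; on error, report it and move on to the next file.
  one_file <- tryCatch(
    read_csv(path, show_col_types = FALSE),
    error = function(e) {
      message("Could not read ", path, ": ", conditionMessage(e))
      NULL
    }
  )
  if (is.null(one_file) || length(date_str) == 0) next

  # Tag every row with its date and append to the combined table.
  one_file <- mutate(one_file, date = format(as.Date(date_str), "%d.%m.%Y"))
  combined <- bind_rows(combined, one_file)
}

# Drop ranges below -0.1.
combined <- filter(combined, range_start >= -0.1)
print(combined)
```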
4.2 Visualizing NDVI Data – Bar and Line Plots
This snippet demonstrates how to create custom visualizations for NDVI data using ggplot2. It includes a bar plot and a line plot.
Custom Theme: Defines a consistent theme for plots, setting styles for titles, legends, axis labels, and grid lines.
Bar Plot: Creates a bar plot that visualizes the area (in square meters) for various NDVI ranges across different dates. Key steps include:
- Mapping NDVI ranges to the x-axis and area to the y-axis.
- Using distinct colors for each date to aid in comparison.
- Adding a vertical line at 0 for reference.
- Saving the plot to the output directory as a PNG file.
Line Plot: Generates a line plot to show NDVI trends over time. Key features include:
- Plotting NDVI ranges on the x-axis and area on the y-axis.
- Using a line and points to represent data clearly.
- Saving the plot to the output directory as a PNG file.
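As an illustration of the theme and the two comparison plots, consider the sketch below. The data frame combined, its values, the theme settings, and the output file names are all assumptions for the example rather than the original code.

```r
library(ggplot2)

# Hypothetical combined table: area per NDVI range for two dates.
combined <- data.frame(
  range_start = rep(c(-0.1, 0.0, 0.1, 0.2), times = 2),
  area_m2     = c(1200, 5400, 9800, 7600, 900, 4800, 11000, 8200),
  date        = rep(c("15.07.2020", "15.07.2021"), each = 4)
)

# Shared theme so both plots look consistent.
ndvi_theme <- theme_minimal() +
  theme(plot.title = element_text(face = "bold", hjust = 0.5),
        legend.position = "bottom",
        axis.title = element_text(size = 12),
        panel.grid.minor = element_blank())

# Bar plot: one color per date, bars side by side, reference line at NDVI = 0.
bar_plot <- ggplot(combined, aes(x = range_start, y = area_m2, fill = date)) +
  geom_col(position = "dodge") +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(title = "Area by NDVI range and date",
       x = "NDVI range (lower bound)", y = "Area (square meters)",
       fill = "Date") +
  ndvi_theme

# Line plot: one line per date.
line_plot <- ggplot(combined, aes(x = range_start, y = area_m2,
                                  color = date, group = date)) +
  geom_line() +
  geom_point() +
  labs(title = "Area across NDVI ranges by date",
       x = "NDVI range (lower bound)", y = "Area (square meters)",
       color = "Date") +
  ndvi_theme

# Save both plots; replace tempdir() with your own output directory.
cmp_bar_file  <- file.path(tempdir(), "ndvi_comparison_bar.png")
cmp_line_file <- file.path(tempdir(), "ndvi_comparison_line.png")
ggsave(cmp_bar_file, bar_plot, width = 9, height = 5, dpi = 300)
ggsave(cmp_line_file, line_plot, width = 9, height = 5, dpi = 300)
```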
5. Using RStudio to generate separate graphs for each year
A single complete R snippet - black & white version
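The following is only a hedged sketch of how per-date plots might be rendered in grayscale, not the complete snippet this chapter refers to. It assumes a summary table named summary_df with columns range_start and area_m2, and uses theme_bw() with gray fills in place of color.

```r
library(ggplot2)

# Hypothetical summary table for a single date.
summary_df <- data.frame(
  range_start = c(-0.1, 0.0, 0.1, 0.2, 0.3),
  area_m2     = c(1200, 5400, 9800, 7600, 3100)
)

# Grayscale bar plot: gray fill with black outlines prints cleanly.
bw_bar <- ggplot(summary_df, aes(x = range_start, y = area_m2)) +
  geom_col(fill = "grey40", color = "black", width = 0.08) +
  labs(title = "Area by NDVI range",
       x = "NDVI range (lower bound)", y = "Area (square meters)") +
  theme_bw()

# Grayscale line plot: black line with hollow point markers.
bw_line <- ggplot(summary_df, aes(x = range_start, y = area_m2)) +
  geom_line(color = "black") +
  geom_point(shape = 21, fill = "white", color = "black", size = 2) +
  labs(title = "Area across NDVI ranges",
       x = "NDVI range (lower bound)", y = "Area (square meters)") +
  theme_bw()

# Save both plots; replace tempdir() with your own output directory.
bw_bar_file  <- file.path(tempdir(), "ndvi_bar_bw.png")
bw_line_file <- file.path(tempdir(), "ndvi_line_bw.png")
ggsave(bw_bar_file, bw_bar, width = 8, height = 5, dpi = 300)
ggsave(bw_line_file, bw_line, width = 8, height = 5, dpi = 300)
```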


6. Using RStudio to generate graphs comparing plots over the years
A single complete R snippet - black & white version
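As a hedged sketch only (not the complete snippet this chapter refers to), a grayscale comparison across dates might distinguish dates by gray level in the bar plot and by line type and point shape in the line plot. The data frame combined and its values are assumptions for the example.

```r
library(ggplot2)

# Hypothetical combined table: area per NDVI range for two dates.
combined <- data.frame(
  range_start = rep(c(-0.1, 0.0, 0.1, 0.2), times = 2),
  area_m2     = c(1200, 5400, 9800, 7600, 900, 4800, 11000, 8200),
  date        = rep(c("15.07.2020", "15.07.2021"), each = 4)
)

# Bar plot: dates separated by shades of gray instead of colors.
bw_cmp_bar <- ggplot(combined, aes(x = range_start, y = area_m2, fill = date)) +
  geom_col(position = "dodge", color = "black") +
  scale_fill_grey(start = 0.2, end = 0.8) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(title = "Area by NDVI range and date",
       x = "NDVI range (lower bound)", y = "Area (square meters)",
       fill = "Date") +
  theme_bw() +
  theme(legend.position = "bottom")

# Line plot: dates separated by line type and point shape.
bw_cmp_line <- ggplot(combined, aes(x = range_start, y = area_m2,
                                    linetype = date, shape = date,
                                    group = date)) +
  geom_line(color = "black") +
  geom_point(color = "black", size = 2) +
  labs(title = "Area across NDVI ranges by date",
       x = "NDVI range (lower bound)", y = "Area (square meters)",
       linetype = "Date", shape = "Date") +
  theme_bw() +
  theme(legend.position = "bottom")

# Save both plots; replace tempdir() with your own output directory.
bw_cmp_bar_file  <- file.path(tempdir(), "ndvi_comparison_bar_bw.png")
bw_cmp_line_file <- file.path(tempdir(), "ndvi_comparison_line_bw.png")
ggsave(bw_cmp_bar_file, bw_cmp_bar, width = 9, height = 5, dpi = 300)
ggsave(bw_cmp_line_file, bw_cmp_line, width = 9, height = 5, dpi = 300)
```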

