SCL Data - Data Ecosystem Working Group
This Python script takes geospatial data in the form of shapefiles and tile data to perform connectivity analysis. It performs spatial joins, data conversions, and statistical analysis, providing insights into average download and upload speeds for the Latin America and the Caribbean Region. The output can be saved in CSV format or visualized as plots. This script is useful for studying connectivity trends and patterns in various geographic regions.
The script is designed to be used in an AWS environment and expects certain environment variables and input files to be available in specific locations. The data sources used include Ookla OpenData and local geospatial basemaps.
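As a rough illustration of the environment setup, the sketch below reads input locations from environment variables and fails early if any are missing. The variable names (`BASEMAP_DIR`, `OOKLA_TILE_URL`, `OUTPUT_DIR`) and the example values are hypothetical placeholders, not the names the script actually uses.

```python
import os
from pathlib import Path

# Hypothetical variable names, chosen for illustration only.
REQUIRED = ("BASEMAP_DIR", "OOKLA_TILE_URL")


def load_config(env=None):
    """Collect input/output locations from environment variables."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "basemap_dir": Path(env["BASEMAP_DIR"]),
        "tile_url": env["OOKLA_TILE_URL"],
        # Optional variable with a default output folder.
        "output_dir": Path(env.get("OUTPUT_DIR", "output")),
    }


# Example with a plain dict standing in for the real environment.
example_env = {
    "BASEMAP_DIR": "/data/basemaps",
    "OOKLA_TILE_URL": "https://example.com/tiles.zip",
}
config = load_config(example_env)
```

Passing a dict instead of `os.environ` keeps the function easy to try out locally before running it in AWS.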
The script is divided into several sections:
- Environment setup: Loading of environment variables and import of necessary Python libraries.
- Data loading: Retrieval of shapefile data and tile URLs.
- Data preprocessing: Conversion of data units and spatial joining of tile and shapefile data.
- Statistical analysis: Computation of weighted averages for download and upload speeds.
- Data output: Saving of processed data in CSV format.
- Visualization: Creation of plots for download speeds at different administrative levels.
The script uses the GeoPandas library to perform spatial joins between the tile data and the shapefiles for administrative levels 1 and 2. This allows the connectivity data to be linked with geographic regions.
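A minimal sketch of such a join is shown below, using toy geometries in place of the real shapefiles and tiles. The column names (`ADM1_NAME`, `avg_d_kbps`, `tests`) and the `intersects` predicate are assumptions made for illustration; the script's actual columns and join settings may differ.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy administrative regions: two adjacent squares (hypothetical names).
admin = gpd.GeoDataFrame(
    {"ADM1_NAME": ["West", "East"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# Toy connectivity tiles, each lying fully inside one region.
tiles = gpd.GeoDataFrame(
    {"avg_d_kbps": [10000, 20000, 30000], "tests": [1, 2, 3]},
    geometry=[
        box(0.1, 0.1, 0.2, 0.2),
        box(1.1, 0.1, 1.2, 0.2),
        box(1.5, 0.5, 1.6, 0.6),
    ],
    crs="EPSG:4326",
)

# Both layers must share a coordinate reference system before joining.
tiles = tiles.to_crs(admin.crs)

# Attach the attributes of the intersecting region to every tile.
joined = gpd.sjoin(tiles, admin, how="inner", predicate="intersects")
```

With `intersects`, a tile that straddles a boundary is matched to every region it touches, so it would appear more than once in `joined`. Joining on tile centroids instead is one way to avoid that.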
After joining the data, it computes average download and upload speeds for each region, weighting each tile by the number of tests conducted in it. The results are saved as CSV files.
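The test-weighted average can be sketched with plain pandas as follows. This assumes the joined table carries speeds in kbps in columns named `avg_d_kbps` and `avg_u_kbps` and a `tests` count per tile; the `region` column and the function name `weighted_speeds` are hypothetical.

```python
import pandas as pd


def weighted_speeds(joined: pd.DataFrame, region_col: str) -> pd.DataFrame:
    """Average speeds per region in Mbps, weighted by tests per tile."""
    df = joined.copy()
    # Unit conversion: kbps to Mbps.
    df["avg_d_mbps"] = df["avg_d_kbps"] / 1000
    df["avg_u_mbps"] = df["avg_u_kbps"] / 1000
    # Weighted mean = sum(speed * tests) / sum(tests), per region.
    df["d_x_tests"] = df["avg_d_mbps"] * df["tests"]
    df["u_x_tests"] = df["avg_u_mbps"] * df["tests"]
    sums = df.groupby(region_col)[["d_x_tests", "u_x_tests", "tests"]].sum()
    out = pd.DataFrame(
        {
            "avg_d_mbps": sums["d_x_tests"] / sums["tests"],
            "avg_u_mbps": sums["u_x_tests"] / sums["tests"],
            "tests": sums["tests"],
        }
    )
    return out.reset_index()


# Toy input: region A has two tiles, region B has one.
tiles = pd.DataFrame(
    {
        "region": ["A", "A", "B"],
        "avg_d_kbps": [10000, 40000, 20000],
        "avg_u_kbps": [5000, 20000, 10000],
        "tests": [1, 3, 2],
    }
)
result = weighted_speeds(tiles, "region")
```

For region A the download figure is (10 × 1 + 40 × 3) / 4 = 32.5 Mbps, compared with an unweighted mean of 25 Mbps. The result can then be written out with `result.to_csv(path, index=False)`.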
Finally, it creates plots to visualize download speeds at different administrative levels. The plots are saved as JPEG images.
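One plausible way to produce such a plot is a choropleth map drawn with GeoPandas and Matplotlib, sketched below on toy regions. The column names, colormap, title, and output file name are illustrative assumptions, and the script's actual plots may be styled differently.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # Non-interactive backend, suitable for servers.
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import box

# Toy regions with precomputed average download speeds (hypothetical values).
regions = gpd.GeoDataFrame(
    {"ADM1_NAME": ["West", "East"], "avg_d_mbps": [32.5, 20.0]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

fig, ax = plt.subplots(figsize=(8, 6))
# Color each region by its average download speed.
regions.plot(column="avg_d_mbps", cmap="viridis", legend=True, ax=ax)
ax.set_title("Average download speed (Mbps), administrative level 1")
ax.set_axis_off()

# Matplotlib infers the JPEG format from the file extension.
out_path = os.path.join(tempfile.gettempdir(), "download_speed_adm1.jpg")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)
```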
Name | Task | Status |
---|---|---|
Generalize Code | Make the code applicable for any country, quarter, and type of connectivity. | Pending |
Add Mobile | Incorporate mobile connectivity data in the analysis. | Pending |
Improve Efficiency | Work on optimizing the code to reduce the running time. | Pending |
Generate Table | Implement functionality to generate a distinct table for each type of analysis. | Pending |
We would like to express our gratitude to Ookla for making their Open Data available. This project uses and builds on the Ookla Open Data repository, including tile data that provide valuable insights into global connectivity trends.
The raw data, its structure, and collection methodology can be found in the Ookla Open Data repository on GitHub. All rights concerning the Ookla data belong to Ookla.