VAULT converts AIS ship records into clickable tracks that show satellite coverage for the clicked location. The demo shows both Jupyter maps (via ipyleaflet) and Google Earth-style maps (using OpenSphere).
For demo purposes, we treat all entries in the TLE file as viable satellites. In reality most of these are space junk, but within ten years there could be over 10,000 CubeSats.
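To make the idea concrete, here is a minimal sketch of the kind of coverage check involved, using the `skyfield` package the demo relies on for ephemeris calculations. The TLE pair, position, and time are illustrative, not taken from the VAULT data:

```python
from skyfield.api import EarthSatellite, load, wgs84

ts = load.timescale()

# Illustrative ISS TLE pair; the demo reads TLEs from the daily data files.
line1 = "1 25544U 98067A   14020.93268519  .00009878  00000-0  18200-3 0  5082"
line2 = "2 25544  51.6498 109.4756 0003572  55.9686 274.8005 15.49815350868473"
sat = EarthSatellite(line1, line2, "ISS (ZARYA)", ts)

# A clicked AIS position and a time near the TLE epoch.
ship = wgs84.latlon(36.85, -75.98)
t = ts.utc(2014, 1, 23, 11, 18, 7)

# The satellite "covers" the point when it is above the local horizon.
alt, az, distance = (sat - ship).at(t).altaz()
print(f"altitude {alt.degrees:.1f} deg ->",
      "visible" if alt.degrees > 0 else "below horizon")
```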
The project also contains scripts & notebooks to ingest, characterize, and clean the AIS & TLE data, detect outliers, and cluster tracks.
Further documentation is hosted on GitHub Pages.
These instructions require Docker and a Docker Hub account.
Once Docker is installed and the Docker Hub account is established, run the following in your operating system's command line:
```
docker login
docker pull ke2jacobs/jacobs-vault-nb-with-data:latest
docker run --rm -it -p 2080:2080 ke2jacobs/jacobs-vault-nb-with-data:latest
```
The container will present a message like:

```
To access the notebook, open this file in a browser:
    file:///root/.local/share/jupyter/runtime/nbserver-1-open.html
Or copy and paste one of these URLs:
    http://d20796a21c64:2080/?token=455d272e289a90dbc2533de2cb7ddec9a5574dc3fac5ef66
or
    http://127.0.0.1:2080/?token=455d272e289a90dbc2533de2cb7ddec9a5574dc3fac5ef66
```
Open a browser window and paste one of the URLs into it (the http://127.0.0.1:… form is usually the best choice; your token will differ from the one shown).
The main Jupyter page will be displayed.
- Click on the folder named "demo"
- Click on the file named "VAULT_Demo.ipynb"
- Click on the "Cell" menu item and choose "Run All"
To explore further:
- Clone the repository:

```
git clone https://github.com/cmorris-jacobs/jacobs-vault
cd jacobs-vault
```

- Obtain a copy of the VAULT data and put it in `jacobs-vault/data`, for example with:

```
ln -s path/to/data data
```
- Using either conda or pip, create the `vault` virtual environment so you have the required packages. Using conda, that would be:

```
conda env create -f environment.yml
```

Wait while it installs packages...
- Activate the `vault` virtual environment:

```
conda activate vault
```
- To run the demo from here:

```
cd demo
ln -s ../data ./
. run.sh
```

The `ln` line just makes the ETL'd data visible from `demo/data`; use another technique if you prefer.
- If you plan to push to git, use nbdev to install its git hooks (mostly relevant to notebooks in `nbs/`):

```
nbdev_install_git_hooks
```

If you will be modifying notebooks in that folder, familiarize yourself with fastai's nbdev, and run a `make` before commit/push to update the associated modules and docs.
Jacobs-VAULT is the result of a hackathon challenge, so in addition to a working demo and analysis notebooks, it still has exploratory paths and alternate approaches. Folders are in three rough groups:
- `etl` - Original ETL scripts, mostly Spark SQL and Hive.
- `ais-analytics` - Subproject using Spark to analyze AIS data. Alternate ETL.
- `geotransformer` - Subproject using Spark to analyze TLE data. Alternate ETL. Directly calls `sgp4` and the `astropy` package instead of `skyfield` (see the sketch after this list).
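For orientation, calling `sgp4` directly looks roughly like the sketch below; the TLE pair and time are illustrative, and this is not code from `geotransformer` itself:

```python
from sgp4.api import Satrec, jday

# Same illustrative ISS TLE pair as in the coverage sketch above.
line1 = "1 25544U 98067A   14020.93268519  .00009878  00000-0  18200-3 0  5082"
line2 = "2 25544  51.6498 109.4756 0003572  55.9686 274.8005 15.49815350868473"
sat = Satrec.twoline2rv(line1, line2)

jd, fr = jday(2014, 1, 23, 11, 18, 7)       # Julian date split into day + fraction
err, position, velocity = sat.sgp4(jd, fr)  # TEME position (km) and velocity (km/s)
print(err, position, velocity)              # err == 0 means propagation succeeded
```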
A mix of exploratory notebooks and literate-programming notebooks that generate Python modules and documentation (including this README) via the `nbdev` package. Controlled by the toplevel `Makefile`, using the `vault` virtual environment captured in `environment.yml`.
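As a hedged illustration of those nbdev (v1) conventions (the function is hypothetical, not from the repo), a notebook in `nbs/` marks its cells like this:

```python
# The first notebook cell names the target module
# (its exported cells become jacobs_vault/hit_quality.py):
# default_exp hit_quality

# A cell marked #export is written into that module by nbdev_build_lib / make:
#export
def coverage_score(hits: int, total: int) -> float:
    "Fraction of track points with at least one satellite overhead (hypothetical)."
    return hits / total if total else 0.0
```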
- `nbs` - Toplevel notebooks; generate docs, modules, and README.
- `jacobs_vault` - Python modules generated from `nbs/` by the `nbdev` package.
- `docs` - Documentation generated from `nbs/` by the `nbdev` package.
- `data` - (See "Demo Folders".)
To run the notebooks in the `nbs/` folder, activate the `vault` Python virtual environment and start a new Jupyter kernel:

```
conda env create -f environment.yml
conda activate vault
jupyter notebook
```

That should start a Jupyter session in your browser. You can now explore and run the notebooks in the `nbs/` folder.
The demo provides a notebook with an interactive, map-based walkthrough of retrieving AIS tracks and querying a track for satellite coverage, using the `Skyfield` package for ephemeris calculations.
- `demo` - As much of the demo as possible lives under here, for completeness.
- `data` - Daily satellite files stored (or linked) as `data/VAULT_Data/TLE_daily/year/MM/nn.tab.gz`. Used by the demo and other notebooks & scripts. (A loading sketch follows this list.)
- `autoencoder` - Exploratory work using a PyTorch deep network to discover high-level features and patterns in the AIS data.
- `ais-kml` - Concurrent visualization attempt using OpenSphere.
- `hittestservice` - First attempt to wrap the HitTest code into a web service.
- `scripts` - A collection of scripts, esp. SQL queries.
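As a sketch of how one might load a single day of that TLE data (the column names here are an assumption; the real layout may differ):

```python
import pandas as pd

# Hypothetical: read one day of TLE records from the daily file layout.
path = "data/VAULT_Data/TLE_daily/2015/01/01.tab.gz"
df = pd.read_csv(path, sep="\t", compression="gzip", header=None,
                 names=["satellite", "epoch", "line1", "line2"])  # assumed columns
print(df.head())
```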
Some code expects an Apache Spark setup with Hive and Hadoop available. The `ais-analytics` and `geotransformer` folders contain cookie-cutter setups with scripts that will start Spark-enabled Jupyter notebooks, or launch a Spark job with the required virtual environment.
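A minimal sketch of the kind of Hive-enabled Spark session those scripts assume (the table and column names are hypothetical; the real schemas live in the `etl/` scripts):

```python
from pyspark.sql import SparkSession

# Start a Hive-enabled session of the kind the ETL scripts assume.
spark = (SparkSession.builder
         .appName("vault-etl")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical table and columns, for illustration only.
ais = spark.sql("SELECT mmsi, basedatetime, lat, lon FROM ais_positions LIMIT 10")
ais.show()
```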
The `environment.yml` file invoked above has the full package list, but the key top-level packages fall into several broad categories:
- Core: numpy, pandas
- Astronomy: Skyfield, astropy, GDAL, (pyorbital??)
- Clustering: HDBSCAN (see the sketch after this list)
- Notebooks: IPython, Jupyter notebooks
- Spark including PySpark, Hive, Hadoop
- (Other database as req'd)
- Map support: geopandas, ...
- Visualization: plotly, (matplotlib?), (leaflet?), (opensphere?)
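As a sketch of the clustering step (the feature matrix here is a random placeholder, not real track features):

```python
import numpy as np
import hdbscan

# Placeholder per-track feature matrix (e.g., speed/heading statistics).
rng = np.random.default_rng(0)
X = rng.random((200, 4))

clusterer = hdbscan.HDBSCAN(min_cluster_size=5)
labels = clusterer.fit_predict(X)  # label -1 marks noise points
print(sorted(set(labels)))
```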
Tests are automatically extracted from the notebooks in `nbs/`. To run the tests in parallel, launch `nbdev_test_nbs` or `make test`.
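A "test" in this setup is just an ordinary notebook cell whose assertions must pass when the notebook is executed top to bottom; as a hypothetical illustration:

```python
# An ordinary cell: nbdev_test_nbs executes the notebook and fails on any error.
def plausible_position(lat: float, lon: float) -> bool:  # hypothetical helper
    "Sanity-check that an AIS position is on the globe."
    return -90 <= lat <= 90 and -180 <= lon <= 180

assert plausible_position(36.85, -75.98)
assert not plausible_position(120.0, 0.0)
```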
For all the tests to pass, you'll need to install the following optional dependencies:

```
pip install ...
```
Tests are written using `nbdev`; for example, see the documentation for `hit_quality` or `viz`.