Mission database
The Database, Query and Filter classes in the missions package give
access to flight schedule data for mission planning. Currently this is
primarily intended for use with OAG data converted from CSV files. The
database files use a schema optimized for the query patterns used in AEIC.
This optimization is important because the flight schedule data tends to
contain a large number of records, and we want to be able to filter them in
several different ways.
To understand the way the database is organized, it helps to distinguish between flights and flight instances. A flight instance is a single flight between an origin and a destination airport departing at a given time on a given date. A flight represents a sequence of flight instances by defining a schedule of days of the week and departure time between a pair of effective dates. A flight corresponds to a single row in an input OAG CSV file, and each flight may have multiple flight instances. The flight instances are reified in the SQLite database to make querying the schedule of flights more efficient.
Usage example
Here’s a basic usage example to give an idea of how the API works:
import missions
# Open the database.
db = missions.Database('oag-2019.sqlite')
# Build a scheduled flight query.
query = missions.Query(
# Filter on flight characteristics.
filter=missions.Filter(
# Flight distance between 9500 and 10000 kilometers.
min_distance=9500, max_distance=10000,
# Seat capacity >= 500.
min_seat_capacity=500,
# Flight origin or destination in US or Canada.
country=['US', 'CA']
)
)
# Iterate over the flight results: there are only 46 from this query...
for flight in db(query):
# ... so it's practical just to print some data.
print(flight.departure.isoformat(), flight.carrier + flight.flight_number)
Reference documentation
The main classes of interest in the API are:
Database: the main database class;Query: a query that returns a sequence of flight instances;QueryResult: a single result from anQueryquery;FrequentFlightQuery: a query that returns most frequent origin/destination pairs appearing in flight instances;FrequentFlightQueryResult: a single result from anFrequentFlightQueryquery;CountQuery: a query that counts flight instances matching given conditions;Filter: a filter on flight characteristics usable with all query types.
Database class
The missions.Database class is a wrapper around a connection to an SQLite
database file (using the Python standard library’s
sqlite3 package). The
Database class hides the details of both the database structure and the
underlying SQL interface to SQLite, instead exposing a simple
application-specific query API.
The Database class is oriented towards read-only access to a mission
database. There is a derived class called WritableDatabase for applications
that construct flight databases.
The normal workflow for querying the mission database is to create a
Database instance, passing the path to the SQLite database file to the
constructor:
db = missions.Database('oag-2019.sqlite')
The Database object is callable, and when you call it with query objects
(see below), it returns a Python
generator that
you can iterate over to get individual results:
for flight in db(missions.Query()):
print(flight.carrier + flight.flight_id)
(Don’t try to run this code! It will iterate through every flight
instance in the database in departure time order. An empty Query selects
all flight instances.)
- class missions.Database(db_path: str)
Flight schedule database.
Represents a database of flight schedule entries, stored in an SQLite database file, using a schema optimized for common AEIC query use cases.
- __call__(query: QueryBase[T]) Generator[T] | T
Execute a query against the database.
Results are returned via a generator that yields instances of the result class for the corresponding query type.
Supported query types are subclasses of QueryBase: Query is a “normal” scheduled flight query, FrequentFlightQuery determines the most frequently occurring airport origin/destination pairs, and CountQuery counts the number of scheduled flights matching filter conditions.
- __init__(db_path: str)
Open a flight database file.
- Parameters:
db_path (str) – Path to the SQLite database file.
Queries
Database queries come in three flavors. Flight instance queries,
represented by the missions.Query class, return flight instances in
departure time order, filtered by various criteria. Frequent flight queries,
represented by the missions.FrequentFlightQuery class, return pairs of
origin and destination airports that have the most flights between them, again
filtered by various criteria. Count queries, represented by the
missions.CountQuery class, count the number of flight instances that
match given conditions.
The filtering criteria for the different query types share some features in
common, so the query classes are derived from a QueryBase base class. Each
of the query classes has a RESULT_TYPE member that gives the type of the
results returned when you run one of these queries.
Base query class
The base query class includes filter parameters for the flight instance
start and end dates to consider, as well as an missions.Filter value that
filters on flight characteristics (like origin and destination, distance,
etc.).
- class missions.query.QueryBase(*_args, **_kwargs)
Abstract base class for queries against the mission database.
- PROCESS_RESULT: ClassVar[Callable | None] = None
Special processing function for results, or None if not needed.
- RESULT_TYPE: ClassVar[type]
Result type returned by a given query class.
- end_date: date | None = None
Include only flights on or before this date.
- start_date: date | None = None
Include only flights on or after this date.
Scheduled flight queries
The missions.Query class returns individual flight instances in
departure time order, corresponding to a given set of filter conditions. This
query type supports flight characteristics filtering (using
missions.Filter), start and end date filtering (from missions.QueryBase)
and random and “every nth day” sub-sampling (using the sample and
every_nth parameters).
These queries return results as a generator of missions.query.QueryResult
values, each of which basically contains all of the known information about
the flight instances.
The following examples illustrate some uses of missions.Query.
Return all flight instances for all flights with a distance between 1000 and 5000 kilometers:
q = missions.Query(filter=missions.Filter(min_distance=1000, max_distance=5000))
Return a random 5% sample of flight instances for all flights between France and China:
q = missions.Query(filter=missions.Filter(country=['FR', 'CN']), sample=0.05)
Return all 787 flight instances from France to China departing every 8th day starting on March 1 2019:
q = missions.Query(
filter=missions.Filter(
origin_country='FR',
destination_country='CN',
aircraft_type='787'
),
start_date=date(2019, 3, 1),
every_nth=8
)
- class missions.Query(filter: Filter | None = None, start_date: date | None = None, end_date: date | None = None, every_nth: int | None = None, sample: float | None = None)
Query for scheduled flights.
- RESULT_TYPE
alias of
QueryResult
- every_nth: int | None = None
Include flights only from every nth day.
- sample: float | None = None
Randomly sample a fraction of the results (0.0 < sample <= 1.0).
- class missions.query.QueryResult(departure: Timestamp, arrival: Timestamp, carrier: str, flight_number: str, origin: str, origin_country: str, destination: str, destination_country: str, service_type: str, aircraft_type: str, engine_type: str | None, distance: int, seat_capacity: int, flight_id: int)
A single flight query result.
- aircraft_type: str
Aircraft type (ICAO code).
- arrival: Timestamp
Flight arrival timestamp (UTC).
- carrier: str
Airline (IATA code).
- departure: Timestamp
Flight departure timestamp (UTC).
- destination: str
Destination airport (IATA code).
- destination_country: str
Destination country (ISO 3166-1 alpha-2 code).
- distance: int
Flight distance in kilometers.
- engine_type: str | None
Engine type, or None if not known.
- flight_id: int
Unique flight ID.
- flight_number: str
Flight number.
- origin: str
Origin airport (IATA code).
- origin_country: str
Origin country (ISO 3166-1 alpha-2 code).
- seat_capacity: int
Seat capacity.
Frequent flights queries
The missions.FrequentFlightQuery class returns airport pairs (discounting the
direction, i.e., BOS → LHR is the same as LHR → BOS) and counts of flights
between them matching a given filter condition. The filter conditions
supported are the same as for flight instance queries, i.e. represented by
an missions.Filter instance. Results are returned as a generator of
missions.query.FrequentFlightQueryResult values, which contain the airport codes
and a count of the number of flight instances.
For example, if we want to find the ten most common routes flown by 787s, we can do:
>>> import missions
>>> db = missions.Database('oag-2019.sqlite')
>>> q = missions.FrequentFlightQuery(filter=missions.Filter(aircraft_type='787'), limit=10)
>>> for f in db(q):
>>> print(f.airport1, f.airport2, f.number_of_flights)
with output
HAN SGN 3167
HND MYJ 2663
FUK HND 2200
HND ITM 1857
BKK SIN 1726
HIJ HND 1649
ITM OKA 1382
DPS SIN 1216
KIX SIN 1164
DPS MEL 1082
- class missions.FrequentFlightQuery(filter: Filter | None = None, start_date: date | None = None, end_date: date | None = None, limit: int = 20)
Query for the most frequent flight routes.
- RESULT_TYPE
alias of
FrequentFlightQueryResult
- limit: int = 20
Maximum number of results to return (default 20).
Count queries
Sometimes we just want a count of the number of flight instances matching
a filter. For example, before running some long computation on each flight
instance, it’s useful to know if there are millions of them… Running an
missions.CountQuery query returns a single integer count value, i.e., there is no
generator involved.
For example, if we want to count the total number of 777 flight instances in the database, we can do:
>>> import missions
>>> db = missions.Database('oag-2019.sqlite')
>>> db(missions.CountQuery(filter=missions.Filter(aircraft_type='777')))
108906
Filters
- class missions.Filter(min_distance: float | None = None, max_distance: float | None = None, min_seat_capacity: int | None = None, max_seat_capacity: int | None = None, airport: str | list[str] | None = None, origin_airport: str | list[str] | None = None, destination_airport: str | list[str] | None = None, country: str | list[str] | None = None, origin_country: str | list[str] | None = None, destination_country: str | list[str] | None = None, continent: str | list[str] | None = None, origin_continent: str | list[str] | None = None, destination_continent: str | list[str] | None = None, bounding_box: BoundingBox | None = None, origin_bounding_box: BoundingBox | None = None, destination_bounding_box: BoundingBox | None = None, service_type: str | list[str] | None = None, aircraft_type: str | list[str] | None = None)
A filter for narrowing down mission flight schedule entries.
This supports filtering by various criteria such as distance, seat capacity, aircraft type, and by geographic location using country, continent, or a bounding box.
All conditions are combined with AND logic.
Spatial filters come in two flavors, “combined” and “origin/destination”, and four region types, “airport”, “country”, “continent”, and “bounding box”.
Combined spatial filters apply to both the origin and destination of flights and express conditions like “flight departs or arrives from an airport in Malaysia”. Origin/destination spatial filters apply only to either the origin or destination of flights. Applying both an origin and a destination filter makes it possible to represent conditions like “flight departs from an airport in France AND arrives at an airport in South America”.
A combined spatial filter cannot be used together with either an origin or destination spatial filter. It is valid to specify either:
a single combined spatial filter, or
one (optional) spatial filter for the origin and/or one (optional) one for the destination.
This means that, for example, you can specify either: country or origin_country and/or destination_country for country-based filtering, but you may not specify both country and either of origin_country or destination_country. And similarly for airport, continent and bounding box filtering.
Here are some examples:
country=’US’ means flights originating or terminating in the US.
origin_country=’FR’ means flights originating in France.
destination_country=’AT’ means flights terminating in Austria.
origin_country=[‘US’, ‘CA’], destination_country=’MX’ means flights originating in the US or Canada and terminating in Mexico.
origin_country=[‘DE’, ‘CH’, ‘AT’], destination_continent=’SA’ means flights originating in either Germany, Switzerland, or Austria and terminating at any airport in South America.
- aircraft_type: str | list[str] | None = None
Aircraft type(s) (e.g., ‘737’, ‘320’).
- airport: str | list[str] | None = None
Originating or terminating airport code(s).
- bounding_box: BoundingBox | None = None
Bounding box for originating or terminating airports.
- continent: str | list[str] | None = None
Originating or terminating continent code(s).
- country: str | list[str] | None = None
Originating or terminating country code(s).
- destination_airport: str | list[str] | None = None
Terminating airport code(s).
- destination_bounding_box: BoundingBox | None = None
Bounding box for terminating airports.
- destination_continent: str | list[str] | None = None
Terminating continent code(s).
- destination_country: str | list[str] | None = None
Terminating country code(s).
- max_distance: float | None = None
Maximum flight distance in kilometers.
- max_seat_capacity: int | None = None
Maximum seat capacity.
- min_distance: float | None = None
Minimum flight distance in kilometers.
- min_seat_capacity: int | None = None
Minimum seat capacity.
- origin_airport: str | list[str] | None = None
Originating airport code(s).
- origin_bounding_box: BoundingBox | None = None
Bounding box for originating airports.
- origin_continent: str | list[str] | None = None
Originating continent code(s).
- origin_country: str | list[str] | None = None
Originating country code(s).
Database schema
The database schema for the mission database is described on the GitHub
wiki for AEIC. (The wiki
page is slightly outdated: the definitive documentation for the database
schema is _ensure_schema method of the missions.WritableDatabase class.)