Mission database

The Database, Query and Filter classes in the missions package give access to flight schedule data for mission planning. Currently this is primarily intended for use with OAG data converted from CSV files. The database files use a schema optimized for the query patterns used in AEIC. This optimization is important because the flight schedule data tends to contain a large number of records, and we want to be able to filter them in several different ways.

To understand the way the database is organized, it helps to distinguish between flights and flight instances. A flight instance is a single flight between an origin and a destination airport departing at a given time on a given date. A flight represents a sequence of flight instances by defining a schedule of days of the week and departure time between a pair of effective dates. A flight corresponds to a single row in an input OAG CSV file, and each flight may have multiple flight instances. The flight instances are reified in the SQLite database to make querying the schedule of flights more efficient.

Usage example

Here’s a basic usage example to give an idea of how the API works:

import missions

# Open the database.
db = missions.Database('oag-2019.sqlite')

# Build a scheduled flight query.
query = missions.Query(
    # Filter on flight characteristics.
    filter=missions.Filter(
        # Flight distance between 9500 and 10000 kilometers.
        min_distance=9500, max_distance=10000,
        # Seat capacity >= 500.
        min_seat_capacity=500,
        # Flight origin or destination in US or Canada.
        country=['US', 'CA']
    )
)

# Iterate over the flight results: there are only 46 from this query...
for flight in db(query):
    # ... so it's practical just to print some data.
    print(flight.departure.isoformat(), flight.carrier + flight.flight_number)

Reference documentation

The main classes of interest in the API are:

  • Database: the main database class;

  • Query: a query that returns a sequence of flight instances;

  • QueryResult: a single result from an Query query;

  • FrequentFlightQuery: a query that returns most frequent origin/destination pairs appearing in flight instances;

  • FrequentFlightQueryResult: a single result from an FrequentFlightQuery query;

  • CountQuery: a query that counts flight instances matching given conditions;

  • Filter: a filter on flight characteristics usable with all query types.

Database class

The missions.Database class is a wrapper around a connection to an SQLite database file (using the Python standard library’s sqlite3 package). The Database class hides the details of both the database structure and the underlying SQL interface to SQLite, instead exposing a simple application-specific query API.

The Database class is oriented towards read-only access to a mission database. There is a derived class called WritableDatabase for applications that construct flight databases.

The normal workflow for querying the mission database is to create a Database instance, passing the path to the SQLite database file to the constructor:

db = missions.Database('oag-2019.sqlite')

The Database object is callable, and when you call it with query objects (see below), it returns a Python generator that you can iterate over to get individual results:

for flight in db(missions.Query()):
    print(flight.carrier + flight.flight_id)

(Don’t try to run this code! It will iterate through every flight instance in the database in departure time order. An empty Query selects all flight instances.)

class missions.Database(db_path: str)

Flight schedule database.

Represents a database of flight schedule entries, stored in an SQLite database file, using a schema optimized for common AEIC query use cases.

__call__(query: QueryBase[T]) Generator[T] | T

Execute a query against the database.

Results are returned via a generator that yields instances of the result class for the corresponding query type.

Supported query types are subclasses of QueryBase: Query is a “normal” scheduled flight query, FrequentFlightQuery determines the most frequently occurring airport origin/destination pairs, and CountQuery counts the number of scheduled flights matching filter conditions.

__init__(db_path: str)

Open a flight database file.

Parameters:

db_path (str) – Path to the SQLite database file.

Queries

Database queries come in three flavors. Flight instance queries, represented by the missions.Query class, return flight instances in departure time order, filtered by various criteria. Frequent flight queries, represented by the missions.FrequentFlightQuery class, return pairs of origin and destination airports that have the most flights between them, again filtered by various criteria. Count queries, represented by the missions.CountQuery class, count the number of flight instances that match given conditions.

The filtering criteria for the different query types share some features in common, so the query classes are derived from a QueryBase base class. Each of the query classes has a RESULT_TYPE member that gives the type of the results returned when you run one of these queries.

Base query class

The base query class includes filter parameters for the flight instance start and end dates to consider, as well as an missions.Filter value that filters on flight characteristics (like origin and destination, distance, etc.).

class missions.query.QueryBase(*_args, **_kwargs)

Abstract base class for queries against the mission database.

PROCESS_RESULT: ClassVar[Callable | None] = None

Special processing function for results, or None if not needed.

RESULT_TYPE: ClassVar[type]

Result type returned by a given query class.

end_date: date | None = None

Include only flights on or before this date.

filter: Filter | None = None

Flight filter to apply, or None for no filtering.

start_date: date | None = None

Include only flights on or after this date.

Scheduled flight queries

The missions.Query class returns individual flight instances in departure time order, corresponding to a given set of filter conditions. This query type supports flight characteristics filtering (using missions.Filter), start and end date filtering (from missions.QueryBase) and random and “every nth day” sub-sampling (using the sample and every_nth parameters).

These queries return results as a generator of missions.query.QueryResult values, each of which basically contains all of the known information about the flight instances.

The following examples illustrate some uses of missions.Query.

Return all flight instances for all flights with a distance between 1000 and 5000 kilometers:

q = missions.Query(filter=missions.Filter(min_distance=1000, max_distance=5000))

Return a random 5% sample of flight instances for all flights between France and China:

q = missions.Query(filter=missions.Filter(country=['FR', 'CN']), sample=0.05)

Return all 787 flight instances from France to China departing every 8th day starting on March 1 2019:

q = missions.Query(
    filter=missions.Filter(
        origin_country='FR',
        destination_country='CN',
        aircraft_type='787'
    ),
    start_date=date(2019, 3, 1),
    every_nth=8
)
class missions.Query(filter: Filter | None = None, start_date: date | None = None, end_date: date | None = None, every_nth: int | None = None, sample: float | None = None)

Query for scheduled flights.

RESULT_TYPE

alias of QueryResult

every_nth: int | None = None

Include flights only from every nth day.

sample: float | None = None

Randomly sample a fraction of the results (0.0 < sample <= 1.0).

class missions.query.QueryResult(departure: Timestamp, arrival: Timestamp, carrier: str, flight_number: str, origin: str, origin_country: str, destination: str, destination_country: str, service_type: str, aircraft_type: str, engine_type: str | None, distance: int, seat_capacity: int, flight_id: int)

A single flight query result.

aircraft_type: str

Aircraft type (ICAO code).

arrival: Timestamp

Flight arrival timestamp (UTC).

carrier: str

Airline (IATA code).

departure: Timestamp

Flight departure timestamp (UTC).

destination: str

Destination airport (IATA code).

destination_country: str

Destination country (ISO 3166-1 alpha-2 code).

distance: int

Flight distance in kilometers.

engine_type: str | None

Engine type, or None if not known.

flight_id: int

Unique flight ID.

flight_number: str

Flight number.

origin: str

Origin airport (IATA code).

origin_country: str

Origin country (ISO 3166-1 alpha-2 code).

seat_capacity: int

Seat capacity.

service_type: str

Service type (IATA single-letter code, documented here).

Frequent flights queries

The missions.FrequentFlightQuery class returns airport pairs (discounting the direction, i.e., BOS → LHR is the same as LHR → BOS) and counts of flights between them matching a given filter condition. The filter conditions supported are the same as for flight instance queries, i.e. represented by an missions.Filter instance. Results are returned as a generator of missions.query.FrequentFlightQueryResult values, which contain the airport codes and a count of the number of flight instances.

For example, if we want to find the ten most common routes flown by 787s, we can do:

>>> import missions
>>> db = missions.Database('oag-2019.sqlite')
>>> q = missions.FrequentFlightQuery(filter=missions.Filter(aircraft_type='787'), limit=10)
>>> for f in db(q):
>>>     print(f.airport1, f.airport2, f.number_of_flights)

with output

HAN SGN 3167
HND MYJ 2663
FUK HND 2200
HND ITM 1857
BKK SIN 1726
HIJ HND 1649
ITM OKA 1382
DPS SIN 1216
KIX SIN 1164
DPS MEL 1082
class missions.FrequentFlightQuery(filter: Filter | None = None, start_date: date | None = None, end_date: date | None = None, limit: int = 20)

Query for the most frequent flight routes.

RESULT_TYPE

alias of FrequentFlightQueryResult

limit: int = 20

Maximum number of results to return (default 20).

class missions.query.FrequentFlightQueryResult(airport1: str, airport2: str, number_of_flights: int)

A single frequent flight query result.

airport1: str

First airport (IATA code).

airport2: str

Second airport (IATA code).

number_of_flights: int

Number of flights between the two airports.

Count queries

Sometimes we just want a count of the number of flight instances matching a filter. For example, before running some long computation on each flight instance, it’s useful to know if there are millions of them… Running an missions.CountQuery query returns a single integer count value, i.e., there is no generator involved.

For example, if we want to count the total number of 777 flight instances in the database, we can do:

>>> import missions
>>> db = missions.Database('oag-2019.sqlite')
>>> db(missions.CountQuery(filter=missions.Filter(aircraft_type='777')))
108906
class missions.CountQuery(filter: Filter | None = None, start_date: date | None = None, end_date: date | None = None)

Count scheduled flights.

RESULT_TYPE

alias of int

Filters

class missions.Filter(min_distance: float | None = None, max_distance: float | None = None, min_seat_capacity: int | None = None, max_seat_capacity: int | None = None, airport: str | list[str] | None = None, origin_airport: str | list[str] | None = None, destination_airport: str | list[str] | None = None, country: str | list[str] | None = None, origin_country: str | list[str] | None = None, destination_country: str | list[str] | None = None, continent: str | list[str] | None = None, origin_continent: str | list[str] | None = None, destination_continent: str | list[str] | None = None, bounding_box: BoundingBox | None = None, origin_bounding_box: BoundingBox | None = None, destination_bounding_box: BoundingBox | None = None, service_type: str | list[str] | None = None, aircraft_type: str | list[str] | None = None)

A filter for narrowing down mission flight schedule entries.

This supports filtering by various criteria such as distance, seat capacity, aircraft type, and by geographic location using country, continent, or a bounding box.

All conditions are combined with AND logic.

Spatial filters come in two flavors, “combined” and “origin/destination”, and four region types, “airport”, “country”, “continent”, and “bounding box”.

Combined spatial filters apply to both the origin and destination of flights and express conditions like “flight departs or arrives from an airport in Malaysia”. Origin/destination spatial filters apply only to either the origin or destination of flights. Applying both an origin and a destination filter makes it possible to represent conditions like “flight departs from an airport in France AND arrives at an airport in South America”.

A combined spatial filter cannot be used together with either an origin or destination spatial filter. It is valid to specify either:

  • a single combined spatial filter, or

  • one (optional) spatial filter for the origin and/or one (optional) one for the destination.

This means that, for example, you can specify either: country or origin_country and/or destination_country for country-based filtering, but you may not specify both country and either of origin_country or destination_country. And similarly for airport, continent and bounding box filtering.

Here are some examples:

  • country=’US’ means flights originating or terminating in the US.

  • origin_country=’FR’ means flights originating in France.

  • destination_country=’AT’ means flights terminating in Austria.

  • origin_country=[‘US’, ‘CA’], destination_country=’MX’ means flights originating in the US or Canada and terminating in Mexico.

  • origin_country=[‘DE’, ‘CH’, ‘AT’], destination_continent=’SA’ means flights originating in either Germany, Switzerland, or Austria and terminating at any airport in South America.

aircraft_type: str | list[str] | None = None

Aircraft type(s) (e.g., ‘737’, ‘320’).

airport: str | list[str] | None = None

Originating or terminating airport code(s).

bounding_box: BoundingBox | None = None

Bounding box for originating or terminating airports.

continent: str | list[str] | None = None

Originating or terminating continent code(s).

country: str | list[str] | None = None

Originating or terminating country code(s).

destination_airport: str | list[str] | None = None

Terminating airport code(s).

destination_bounding_box: BoundingBox | None = None

Bounding box for terminating airports.

destination_continent: str | list[str] | None = None

Terminating continent code(s).

destination_country: str | list[str] | None = None

Terminating country code(s).

max_distance: float | None = None

Maximum flight distance in kilometers.

max_seat_capacity: int | None = None

Maximum seat capacity.

min_distance: float | None = None

Minimum flight distance in kilometers.

min_seat_capacity: int | None = None

Minimum seat capacity.

origin_airport: str | list[str] | None = None

Originating airport code(s).

origin_bounding_box: BoundingBox | None = None

Bounding box for originating airports.

origin_continent: str | list[str] | None = None

Originating continent code(s).

origin_country: str | list[str] | None = None

Originating country code(s).

service_type: str | list[str] | None = None

Service type(s) (using IATA single-letter codes, documented here).

Database schema

The database schema for the mission database is described on the GitHub wiki for AEIC. (The wiki page is slightly outdated: the definitive documentation for the database schema is _ensure_schema method of the missions.WritableDatabase class.)