Mission database
The Database, Query and Filter classes
in the AEIC.missions package give access to flight schedule data for mission
planning. Currently this is primarily intended for use with OAG data converted
from CSV files. The database files use a schema optimized for the query
patterns used in AEIC. This optimization is important because the flight
schedule data tends to contain a large number of records, and we want to be
able to filter them in several different ways.
To understand the way the database is organized, it helps to distinguish between flights and flight instances. A flight instance is a single flight between an origin and a destination airport departing at a given time on a given date. A flight represents a sequence of flight instances by defining a schedule of days of the week and departure time between a pair of effective dates. A flight corresponds to a single row in an input OAG CSV file, and each flight may have multiple flight instances. The flight instances are reified in the SQLite database to make querying the schedule of flights more efficient.
Usage example
Here’s a basic usage example to give an idea of how the API works:
from AEIC import missions
# Open the database.
db = missions.Database('oag-2019.sqlite')
# Build a scheduled flight query.
query = missions.Query(
# Filter on flight characteristics.
filter=missions.Filter(
# Flight distance between 9500 and 10000 kilometers.
min_distance=9500, max_distance=10000,
# Seat capacity >= 500.
min_seat_capacity=500,
# Flight origin or destination in US or Canada.
country=['US', 'CA']
)
)
# Iterate over the flight results: there are only 46 from this query...
for flight in db(query):
# ... so it's practical just to print some data.
print(flight.departure.isoformat(), flight.carrier + flight.flight_number)
Reference documentation
The main classes of interest in the API are:
Database: the main database class;Query: a query that returns a sequence of flight instances;QueryResult: a single result from anQueryquery;FrequentFlightQuery: a query that returns most frequent origin/destination pairs appearing in flight instances;FrequentFlightQueryResult: a single result from anFrequentFlightQueryquery;CountQuery: a query that counts flight instances matching given conditions;Filter: a filter on flight characteristics usable with all query types.
Database class
The AEIC.missions.Database class is a wrapper around a connection
to an SQLite database file (using the Python standard library’s
sqlite3 package). The
Database class hides the details of both
the database structure and the underlying SQL interface to SQLite, instead
exposing a simple application-specific query API.
The Database class is oriented towards
read-only access to a mission database. There is a derived class called
WritableDatabase for applications that construct flight databases.
The normal workflow for querying the mission database is to create a
Database instance, passing the path to
the SQLite database file to the constructor:
db = AEIC.missions.Database('oag-2019.sqlite')
The Database object is callable, and when
you call it with query objects (see below), it returns a Python
generator that
you can iterate over to get individual results:
for flight in db(AEIC.missions.Query()):
print(flight.carrier + flight.flight_id)
(Don’t try to run this code! It will iterate through every flight
instance in the database in departure time order. An empty Query selects all flight instances.)
- class AEIC.missions.Database(db_path)
Flight schedule database.
Represents a database of flight schedule entries, stored in an SQLite database file, using a schema optimized for common AEIC query use cases.
- Parameters:
db_path (str)
- __call__(query)
Execute a query against the database.
Results are returned via a generator that yields instances of the result class for the corresponding query type.
Supported query types are subclasses of QueryBase: Query is a “normal” scheduled flight query, FrequentFlightQuery determines the most frequently occurring airport origin/destination pairs, and CountQuery counts the number of scheduled flights matching filter conditions.
- Parameters:
query (QueryBase[T])
- Return type:
Generator[T] | T
- __init__(db_path)
Open a flight database file.
- Parameters:
db_path (str) – Path to the SQLite database file.
Queries
Database queries come in three flavors. Flight instance queries,
represented by the AEIC.missions.Query class, return flight
instances in departure time order, filtered by various criteria. Frequent
flight queries, represented by the
AEIC.missions.FrequentFlightQuery class, return pairs of origin
and destination airports that have the most flights between them, again
filtered by various criteria. Count queries, represented by the
AEIC.missions.CountQuery class, count the number of flight
instances that match given conditions.
The filtering criteria for the different query types share some features in
common, so the query classes are derived from a QueryBase base class. Each of the query classes has a
RESULT_TYPE member that
gives the type of the results returned when you run one of these queries.
Base query class
The base query class includes filter parameters for the flight instance
start and end dates to consider, as well as an
AEIC.missions.Filter value that filters on flight
characteristics (like origin and destination, distance, etc.).
- class AEIC.missions.query.QueryBase(*_args, **_kwargs)
Abstract base class for queries against the mission database.
- PROCESS_RESULT: ClassVar[Callable | None] = None
Special processing function for results, or None if not needed.
- RESULT_TYPE: ClassVar[type]
Result type returned by a given query class.
- end_date: date | None = None
Include only flights on or before this date.
- start_date: date | None = None
Include only flights on or after this date.
Scheduled flight queries
The AEIC.missions.Query class returns individual flight
instances in departure time order, corresponding to a given set of filter
conditions. This query type supports flight characteristics filtering
(using AEIC.missions.Filter), start and end date filtering (from
AEIC.missions.QueryBase) and random and “every nth day”
sub-sampling (using the sample and every_nth parameters).
These queries return results as a generator of
AEIC.missions.query.QueryResult values, each of which basically
contains all of the known information about the flight instances.
The following examples illustrate some uses of
AEIC.missions.Query.
Return all flight instances for all flights with a distance between 1000 and 5000 kilometers:
q = AEIC.missions.Query(filter=AEIC.missions.Filter(min_distance=1000, max_distance=5000))
Return a random 5% sample of flight instances for all flights between France and China:
q = AEIC.missions.Query(filter=AEIC.missions.Filter(country=['FR', 'CN']), sample=0.05)
Return all 787 flight instances from France to China departing every 8th day starting on March 1 2019:
q = AEIC.missions.Query(
filter=AEIC.missions.Filter(
origin_country='FR',
destination_country='CN',
aircraft_type='787'
),
start_date=date(2019, 3, 1),
every_nth=8
)
- class AEIC.missions.Query(filter=None, start_date=None, end_date=None, every_nth=None, sample=None, limit=None, offset=None)
Query for scheduled flights.
- Parameters:
filter (Filter | None)
start_date (date | None)
end_date (date | None)
every_nth (int | None)
sample (float | None)
limit (int | None)
offset (int | None)
- RESULT_TYPE
alias of
QueryResult
- every_nth: int | None = None
Include flights only from every nth day.
- limit: int | None = None
Maximum number of results to return.
- offset: int | None = None
Number of results to skip before returning results.
- sample: float | None = None
Randomly sample a fraction of the results (0.0 < sample <= 1.0).
- class AEIC.missions.query.QueryResult(departure, arrival, carrier, flight_number, origin, origin_country, destination, destination_country, service_type, aircraft_type, engine_type, distance, seat_capacity, id, flight_id)
A single flight query result.
- Parameters:
departure (Timestamp)
arrival (Timestamp)
carrier (str)
flight_number (str)
origin (str)
origin_country (str)
destination (str)
destination_country (str)
service_type (str)
aircraft_type (str)
engine_type (str | None)
distance (int)
seat_capacity (int)
id (int)
flight_id (int)
- aircraft_type: str
Aircraft type (ICAO code).
- arrival: Timestamp
Flight arrival timestamp (UTC).
- carrier: str
Airline (IATA code).
- departure: Timestamp
Flight departure timestamp (UTC).
- destination: str
Destination airport (IATA code).
- destination_country: str
Destination country (ISO 3166-1 alpha-2 code).
- distance: int
Flight distance in kilometers.
- engine_type: str | None
Engine type, or None if not known.
- flight_id: int
Unique flight ID.
- flight_number: str
Flight number.
- id: int
Unique flight instance ID.
- origin: str
Origin airport (IATA code).
- origin_country: str
Origin country (ISO 3166-1 alpha-2 code).
- seat_capacity: int
Seat capacity.
Frequent flights queries
The AEIC.missions.FrequentFlightQuery class returns airport pairs
(discounting the direction, i.e., BOS → LHR is the same as LHR → BOS) and
counts of flights between them matching a given filter condition. The filter
conditions supported are the same as for flight instance queries, i.e.
represented by an AEIC.missions.Filter instance. Results are
returned as a generator of
AEIC.missions.query.FrequentFlightQueryResult values, which
contain the airport codes and a count of the number of flight instances.
For example, if we want to find the ten most common routes flown by 787s, we can do:
>>> from AEIC import missions
>>> db = AEIC.missions.Database('oag-2019.sqlite')
>>> q = AEIC.missions.FrequentFlightQuery(filter=AEIC.missions.Filter(aircraft_type='787'), limit=10)
>>> for f in db(q):
>>> print(f.airport1, f.airport2, f.number_of_flights)
with output
HAN SGN 3167
HND MYJ 2663
FUK HND 2200
HND ITM 1857
BKK SIN 1726
HIJ HND 1649
ITM OKA 1382
DPS SIN 1216
KIX SIN 1164
DPS MEL 1082
- class AEIC.missions.FrequentFlightQuery(filter=None, start_date=None, end_date=None, limit=20)
Query for the most frequent flight routes.
- Parameters:
filter (Filter | None)
start_date (date | None)
end_date (date | None)
limit (int)
- RESULT_TYPE
alias of
FrequentFlightQueryResult
- limit: int = 20
Maximum number of results to return (default 20).
- class AEIC.missions.query.FrequentFlightQueryResult(airport1, airport2, number_of_flights)
A single frequent flight query result.
- Parameters:
airport1 (str)
airport2 (str)
number_of_flights (int)
- airport1: str
First airport (IATA code).
- airport2: str
Second airport (IATA code).
- number_of_flights: int
Number of flights between the two airports.
Count queries
Sometimes we just want a count of the number of flight instances matching
a filter. For example, before running some long computation on each flight
instance, it’s useful to know if there are millions of them… Running an
AEIC.missions.CountQuery query returns a single integer count
value, i.e., there is no generator involved.
For example, if we want to count the total number of 777 flight instances in the database, we can do:
>>> from AEIC import missions
>>> db = AEIC.missions.Database('oag-2019.sqlite')
>>> db(AEIC.missions.CountQuery(filter=AEIC.missions.Filter(aircraft_type='777')))
108906
Filters
- class AEIC.missions.Filter(min_distance=None, max_distance=None, min_seat_capacity=None, max_seat_capacity=None, airport=None, origin_airport=None, destination_airport=None, country=None, origin_country=None, destination_country=None, continent=None, origin_continent=None, destination_continent=None, bounding_box=None, origin_bounding_box=None, destination_bounding_box=None, service_type=None, aircraft_type=None)
A filter for narrowing down mission flight schedule entries.
This supports filtering by various criteria such as distance, seat capacity, aircraft type, and by geographic location using country, continent, or a bounding box.
All conditions are combined with AND logic.
Spatial filters come in two flavors, “combined” and “origin/destination”, and four region types, “airport”, “country”, “continent”, and “bounding box”.
Combined spatial filters apply to both the origin and destination of flights and express conditions like “flight departs or arrives from an airport in Malaysia”. Origin/destination spatial filters apply only to either the origin or destination of flights. Applying both an origin and a destination filter makes it possible to represent conditions like “flight departs from an airport in France AND arrives at an airport in South America”.
A combined spatial filter cannot be used together with either an origin or destination spatial filter. It is valid to specify either:
a single combined spatial filter, or
one (optional) spatial filter for the origin and/or one (optional) one for the destination.
This means that, for example, you can specify either: country or origin_country and/or destination_country for country-based filtering, but you may not specify both country and either of origin_country or destination_country. And similarly for airport, continent and bounding box filtering.
Here are some examples:
country=’US’ means flights originating or terminating in the US.
origin_country=’FR’ means flights originating in France.
destination_country=’AT’ means flights terminating in Austria.
origin_country=[‘US’, ‘CA’], destination_country=’MX’ means flights originating in the US or Canada and terminating in Mexico.
origin_country=[‘DE’, ‘CH’, ‘AT’], destination_continent=’SA’ means flights originating in either Germany, Switzerland, or Austria and terminating at any airport in South America.
- Parameters:
min_distance (float | None)
max_distance (float | None)
min_seat_capacity (int | None)
max_seat_capacity (int | None)
airport (str | list[str] | None)
origin_airport (str | list[str] | None)
destination_airport (str | list[str] | None)
country (str | list[str] | None)
origin_country (str | list[str] | None)
destination_country (str | list[str] | None)
continent (str | list[str] | None)
origin_continent (str | list[str] | None)
destination_continent (str | list[str] | None)
bounding_box (BoundingBox | None)
origin_bounding_box (BoundingBox | None)
destination_bounding_box (BoundingBox | None)
service_type (str | list[str] | None)
aircraft_type (str | list[str] | None)
- aircraft_type: str | list[str] | None = None
Aircraft type(s) (e.g., ‘737’, ‘320’).
- airport: str | list[str] | None = None
Originating or terminating airport code(s).
- bounding_box: BoundingBox | None = None
Bounding box for originating or terminating airports.
- continent: str | list[str] | None = None
Originating or terminating continent code(s).
- country: str | list[str] | None = None
Originating or terminating country code(s).
- destination_airport: str | list[str] | None = None
Terminating airport code(s).
- destination_bounding_box: BoundingBox | None = None
Bounding box for terminating airports.
- destination_continent: str | list[str] | None = None
Terminating continent code(s).
- destination_country: str | list[str] | None = None
Terminating country code(s).
- max_distance: float | None = None
Maximum flight distance in kilometers.
- max_seat_capacity: int | None = None
Maximum seat capacity.
- min_distance: float | None = None
Minimum flight distance in kilometers.
- min_seat_capacity: int | None = None
Minimum seat capacity.
- origin_airport: str | list[str] | None = None
Originating airport code(s).
- origin_bounding_box: BoundingBox | None = None
Bounding box for originating airports.
- origin_continent: str | list[str] | None = None
Originating continent code(s).
- origin_country: str | list[str] | None = None
Originating country code(s).
Database schema
The database schema for the mission database is described on the GitHub
wiki for AEIC. (The wiki
page is slightly outdated: the definitive documentation for the database
schema is _ensure_schema method of the
AEIC.missions.WritableDatabase class.)