Skip to content
← Back to blog

What does it cost to see the world? Optimizing a flexible multi-stop flight search

·10 min read

Every few years, I get that travel itch — after all, I am also a German citizen, a country that is well known for its high per-capita international tourism expenditure. But living and working in the US, one is constantly reminded of the "pick two" framing that applies to travel:

  • When you were young, you had the time and health to travel, but not the resources,
  • Now that you are a working adult, you have the health and resources, but not the time,
  • As a retiree, you may have the time and resources, but no longer the health.

This brings about a somewhat universal truth: there is never a good time to do just about anything. Or its Buddhist version: the past is already gone, the future is not yet here, there's only the present.

With this, I set out to search for flights. However, a key limitation in searching for complicated (that is, multi-stop) travel is that it is inflexible. This is due to the very nature of how flight price APIs are priced. The screenshots below show the user-facing state-of-the-art on kayak.com:

  • Left: Round-trip searches allow for flexible departure and return dates (typically, up to ± 3 days), and a search like "LON" includes the many airports London has to offer: Heathrow, Gatwick, City, Luton, and Stansted.
  • Right: Multi-city searches allow for more complicated travel (multiple stops) but are more inflexible — the dates and the order in which the stops are visited must be specified by the user.
Round-trip: flexible ± 3-day datesRound-trip: flexible ± 3-day datesMulti-city: more stops, but fixed dates and visiting orderMulti-city: more stops, but fixed dates and visiting order

In this post, we'll try to solve that: we'll build a truly flexible multi-stop flight search, only fixing the home airport. The destinations can be visited in any order (but must all be visited), and the dates are flexible. The user only specifies a minimum and maximum stay for each destination.

We'll fetch flight prices from the Travelpayouts API. While these flight prices are cached (they are not live), they give us a good indication of the best order and dates to visit the stops. We'll feed these prices into a mathematical optimization model which is an extension of the famous Traveling Salesman Problem (TSP). Finally, we solve the problem using the HiGHS solver.

To make this work very concrete, we solve two hopefully appealing "bucket list" problem instances:

  1. The top-10 most-visited cities in 2025, according to Euromonitor: Bangkok, Hong Kong, London, Macau, Istanbul, Dubai, Mecca (served by Jeddah, JED), Antalya, Paris, and Kuala Lumpur.
  2. The New 7 Wonders of the World: Petra, the Colosseum, Chichén Itzá, the Great Wall, Machu Picchu, the Taj Mahal, and Christ the Redeemer — with the Pyramids in honorary status.

To my surprise, even with higher-than-ever kerosene prices, the first can be booked (from/to San Francisco, SFO) for a total of $2,048, and the second for $3,479.

How the data was fetched

The source

Prices come from the Travelpayouts Aviasales Data API (free, token-based). The workhorse endpoint is v2/prices/month-matrix: given an origin, destination, and a target month, it returns the cheapest fare for each day of that month, grouped by the number of stops.

The route matrix

For a list of nn stops we fetch every directional pair (that's n(n1)n(n-1) of them) across the next 12 travel-months. For 10 stops that's 90 pairs × 12 months = 1,080 API calls per sweep. A small SQLite-backed client throttles requests to a polite ~3 requests per second, retries transient failures with exponential backoff, and logs and continues past any single bad call so one thin route can't abort the run.

Stop lists live in a config file as IATA codes. Where a city has a metropolitan code that aggregates all its airports (London LON, Paris PAR, Rome ROM, Beijing BJS, Rio RIO), we use it so the data covers the whole metro area rather than a single terminal.

Fixing the thin-route problem

The wonders list exposed the sparsity issue immediately: Agra (Taj Mahal) had essentially no departing fares and Mérida (Chichén Itzá) no arriving fares — you could not form a loop through them. So for that itinerary we swapped the four weakest-connected airports for their nearest major hubs:

WonderNearest airportSubstituted hub
Taj MahalAgra AGRDelhi DEL
Chichén ItzáMérida MIDCancún CUN
PetraAqaba AQJAmman AMM
Machu PicchuCusco CUZLima LIM

After re-collecting, every stop had inbound and outbound coverage and a complete tour became possible. We call this sweep wonders_hubs.

The mathematical program

The question — visit every stop once, stay a few nights at each, minimize total airfare, return home — is a TSP with time windows, and the date-dependent prices make it naturally time-indexed.

Sets and data

  • VV — the stops; hVh \in V is home (start and end); N=V{h}N = V \setminus \{h\}.
  • T={0,1,,H}T = \{0, 1, \dots, H\} — day offsets from the trip's start date; the calendar date of slot tt is start + t days.
  • cijtc_{ijt} — cheapest economy fare to fly iji \to j departing on day tt. If no fare exists for that pair/day, it is set to a large constant MM (big-M), so the model stays feasible and any leg forced over a gap is flagged afterward.
  • [aj,bj][a_j, b_j] — minimum / maximum nights of stay at city jNj \in N (we used 2–5).

Decision variables

xijt={1if we fly ij departing on day t,0otherwise,i,jV, ij, tT.x_{ijt} = \begin{cases} 1 & \text{if we fly } i \to j \text{ departing on day } t,\\ 0 & \text{otherwise,} \end{cases} \qquad i,j \in V,\ i \neq j,\ t \in T.

Objective

minijtTcijtxijt.\min \sum_{i \neq j}\sum_{t \in T} c_{ijt}\, x_{ijt}.

Constraints

Degree constraints. Each stop is left exactly once and entered exactly once:

jitxijt=1iV,ijtxijt=1jV.\sum_{j \neq i}\sum_{t} x_{ijt} = 1 \quad \forall i \in V, \qquad \sum_{i \neq j}\sum_{t} x_{ijt} = 1 \quad \forall j \in V.

Stay windows. Let the arrival- and departure-day of stop jj be

rj=ijttxijt,dj=kjttxjkt.r_j = \sum_{i \neq j}\sum_t t\, x_{ijt}, \qquad d_j = \sum_{k \neq j}\sum_t t\, x_{jkt}.

Then for every visited (non-home) stop:

ajdjrjbjjN.a_j \le d_j - r_j \le b_j \quad \forall j \in N.

Home carries no stay constraint — it's simply the trip's start and end.

The elegant part: subtours vanish for free

Classic TSP formulations need explicit (and numerous) subtour-elimination constraints to forbid disconnected sub-loops. Here, we get them for free from the stay windows.

Suppose a subtour existed among the non-home stops. Sum the lower stay bound djrjajd_j - r_j \ge a_j around that cycle. Because each flight is the departure of one city and the arrival of the next, the day-offsets telescope and (djrj)=0\sum (d_j - r_j) = 0 around any closed cycle. But the right-hand side is aj>0\sum a_j > 0 (every minimum stay is positive). That gives 0aj>00 \ge \sum a_j > 0 — a contradiction. So no proper subtour can exist, and since home is the one node without a stay constraint, the single feasible structure is one grand tour that starts and ends at home. No Miller–Tucker–Zemlin (MTZ)/Dantzig–Fulkerson–Johnson (DFJ) subtour elimination constraints required.

Solving it

The model is a binary program with O(V2T)\mathcal{O}(|V|^2 \cdot |T|) variables. A few practical lessons:

  • Solver. We modeled the problem using PuLP. For the solver, we started with CBC, then moved to HiGHS, which is markedly stronger on this structure — with symmetry detection and a higher heuristic effort it proves a bounded-window instance optimal in ~30 seconds.
  • Horizon scaling. Anchoring the window to the trip's own length (≈ the sum of the maximum stays, ~50 days) keeps the model small and solves to proven optimality fast. Letting the trip start anywhere in a full 365-day horizon blows the model up to ~40,000 binaries; the LP relaxation becomes loose.
  • The pragmatic answer: a per-month sweep. Instead of one giant full-year model, solve the small, fast, provably optimal bounded window once per candidate start month, then take the cheapest.

Solutions and interpretation

Both itineraries start and end at SFO, with 2–5 nights per city.

Top-10 most-visited cities — $2,048, departing June 8

The top-10 tour, in optimal visiting order. Click a city for details.

Tour: SFO → LON → AYT → IST → PAR → DXB → JED → KUL → MFM → BKK → HKG → SFO
Cost: $2,048

LegDepartFare
SFO → LondonJun 8$432
London → AntalyaJun 10$59
Antalya → IstanbulJun 12$30
Istanbul → ParisJun 14$110
Paris → DubaiJun 16$189
Dubai → JeddahJun 18$119
Jeddah → Kuala LumpurJun 21$213
Kuala Lumpur → MacauJun 25$147
Macau → BangkokJun 27$158
Bangkok → Hong KongJul 1$97
Hong Kong → SFOJul 5$494

New 7 Wonders + Giza — $3,479, departing June 10

The wonders tour, in optimal visiting order. Each city maps to a wonder — click for details.

Tour: SFO → CUN → AMM → CAI → DEL → BJS → ROM → LIM → RIO → SFO
Cost: $3,479

LegDepartFare
SFO → CancúnJun 10$191
Cancún → AmmanJun 12$775
Amman → CairoJun 14$171
Cairo → DelhiJun 16$227
Delhi → BeijingJun 18$269
Beijing → RomeJun 20$398
Rome → LimaJun 23$789
Lima → RioJun 25$268
Rio → SFOJun 30$391

Takeaways

  • A bucket-list trip is a TSP with time windows, and positive minimum stays make subtour elimination free — a nice modeling result.
  • For exact optimization, keep the time horizon tight and sweep over start dates rather than indexing a whole year at once — small proven solves beat one giant unprovable one.
  • The answers to this post's title question: $2,048 to see the world's 10 most-visited cities and $3,479 to see all the wonders, both leaving San Francisco in early June.

The full code is available in the GitHub repository fabianrigterink/bye-bye. A 2020 version of the same idea is in fabianrigterink/optimizing-multi-city-flights. Built with Python, SQLite, PuLP/HiGHS, and Plotly; data from the Travelpayouts Aviasales Data API.