In our previous article, we established a strong case for interregional trade in EvE Online after conducting a study on macroeconomic trade dynamics and reviewing relevant game mechanics.
In this article, we will incorporate everything we’ve learned so far in order to:
- Set clear reporting goals for interregional trade
- Identify important reporting strategies
- Write a comprehensive script, borrowing code developed in past exercises
Setting Goals
The main takeaway from our last article is that while prices outside of ‘The Forge’ are only nominally more expensive due to freight costs, they are still markedly more volatile. Greater volatility brings opportunities: freight-adjusted prices between ‘The Forge’ and secondary trade regions may differ substantially, providing potential for profit. Our goal, therefore, will be identifying items which can be profitably exported from ‘The Forge’ and listed for sale in other regions, accounting for freight costs and taxes/fees.
Exporting goods from ‘The Forge’ will be the focus of our reporting because it is the approach best supported by the research conducted in the previous article. While it is definitely possible to report against the inverse, there is little point doing so; acquiring and exporting goods from secondary regions to be imported and sold at ‘The Forge’ is a losing strategy both because ‘The Forge’ features lower nominal prices and because freight costs are incurred regardless of route direction.
Defining Strategies
Before we dive into code examples, let’s consider this goal in the context of past reports we’ve developed.
At face-value, this goal isn’t very different from station trading on the bid-ask spread. Instead of buying low and selling high within a single region, we now want to buy low and sell high between regions.
While it might be tempting to directly convert the entirety of our station trading script into an interregional trading script, the introduction of geography as a variable forces us to account for additional considerations that will change our approach.
Freight Deltas
Unlike station trading or reprocessing, interregional trade involves the movement of goods between regions. While we discussed hauling and its associated costs at length in the previous article, we did not discuss the time it takes to get goods from point A to point B, which we’ll refer to as the freight delta.
Since price and volume change over time, and since we intend to predict these changes using linear regression, we need to somehow incorporate freight deltas into our reporting schema.
So first, how do we obtain freight deltas? The first place to look is the game mechanics. Real freight times are a function of many esoteric variables, including ship attributes and route composition. With an extremely fast hauling vessel, it takes approximately 10 minutes to travel between Jita and a far-away system like Amarr. Much larger freighters might require about an hour to travel the same distance. Shorter routes take proportionally less time to complete, and while it is possible to shave off more time by taking risky shortcuts through lower-security space, the saved time is seldom worth the additional risk.
While this is great information to know if we are hauling goods ourselves, many traders will forego the burden of hauling and opt instead to contract this work out to third-parties. Since we will be using Red Frog Freight quotes to serve as our baseline for freight costs, we should also consider their freight deltas. In a 2023 reddit post, a representative from Red Frog Freight shared hauling statistics, which included an average issued-to-completion duration of 19.03 hours for all accepted private contracts.
This figure not only serves as an excellent apples-to-apples baseline for determining our freight delta; when juxtaposed with the reality of in-game hauling mechanics, it also tells us that the act of hauling itself contributes negligibly toward total freight deltas, with goods under contract on even the longest routes spending at least 95% of their time sitting idle at stations. As a result, we can use this 19-hour figure as a static freight delta for all routes without introducing significant error. With more detailed information about contract routes and/or collateral values we could estimate freight deltas more precisely, but for now this static value is the best data point at our disposal.
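As a quick back-of-the-envelope check (not part of the report script itself), the figures above translate into the static delta as follows:
# back-of-the-envelope check of the freight delta figures discussed above
contract_hours = 19.03   # average issued-to-completion time reported by Red Frog Freight
max_haul_hours = 1.0     # rough upper bound for a slow freighter on a long route
idle_fraction = 1 - max_haul_hours / contract_hours
print(f"time spent idle at stations: ~{idle_fraction:.0%}")   # ~95%
FreightDelta = 19 * 3600  # the static 19-hour delta, in seconds, used later in the script
print(FreightDelta)       # 68400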
It is important to keep in mind that traders who haul their own goods benefit from greater control over their freight deltas. For instance, if a trader identifies a commodity which is quickly selling out in a specific market, they can restock it more rapidly than traders who are otherwise dependent on contractors for freight.
Originating vs. Clearing Orders
Since station trading is entirely dependent on the bid-ask spread, it is crucial for traders engaged in that strategy to originate all of their own orders and patiently wait for other traders to clear them. Failure to do this will almost always result in a loss, since clearing orders sacrifices the spread and any derived profit in-exchange for convenience.
In our last article, we made the important discovery that the region ‘The Forge’ comprises 85% of all transactions by ISK (Dollar) volume across the entire game. Since the majority of trade activity occurs within this region, competition is high, resulting in low volatility and the extreme tightening of real price spreads, which we calculated to be a tiny (~0.05%) fraction of total ISK volume.
In a vacuum, the efficiency presented by ‘The Forge’ is a pain point for station traders as it represents an erosion of potential profit. However, by introducing the concept of geography, this market efficiency becomes an important asset with cascading benefits:
- Traders can acquire or liquidate goods at ‘The Forge’ at relatively consistent prices
- Clearing existing orders at ‘The Forge’ may actually be less expensive than originating a new order if the bid-ask spread is smaller than associated brokerage fees
- Since clearing orders has a transaction delta of zero, goods can be acquired, moved, and liquidated more rapidly within an interregional trade cycle, potentially improving profit as a function of time.
- Since cleared orders do not need to sit open on the market like originated orders do, there is less administrative overhead and characters are free to allocate their open order slots toward other trade opportunities.
Despite being completely incompatible with station trading, instantly acquiring assets at ‘The Forge’ by clearing existing orders is the optimal strategy when engaging in interregional trade. However, it is prudent to check the prevailing bid-ask spread on a per-commodity basis before clearing orders to validate that we aren’t being ripped off, as spreads for individual commodities can still vary widely even in ‘The Forge’, where they are otherwise tight in aggregate.
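To make the clearing-versus-originating comparison concrete, here is a minimal sketch (not part of the report script) of the cost of clearing an existing Sell order versus originating a Buy order when acquiring goods at ‘The Forge’; the 1% brokerage rate is a placeholder, as actual fees depend on skills and standings:
## minimal sketch: is clearing an existing Sell order cheaper than originating a Buy order?
def cheaper_to_clear(max_buy, min_sell, brokerage_rate=0.01):
    # clearing costs the ask price outright; originating costs roughly the bid
    # price plus the brokerage fee charged when the order is placed
    cost_clearing = min_sell
    cost_originating = max_buy * (1 + brokerage_rate)
    return cost_clearing <= cost_originating
# a 0.5% spread against a 1% brokerage fee: clearing is the cheaper route
print(cheaper_to_clear(1_000_000, 1_005_000))   # True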
Trade Volume
When generating a report, it is important to identify potential trade bottlenecks. In the case of our report strategy involving reprocessing, we specifically sought open market orders which could saturate both the supply and demand sides of at least one full reprocessing cycle. This logic extends to interregional trade, where it is important to ensure that goods are flowing in-proportion to what a source region can supply and what a destination region demands. For example, it is pointless acquiring a lot of goods in one region at a relative discount if that volume of goods cannot be liquidated elsewhere. It is likewise pointless attempting to meet demand in circumstances where supply cannot be sourced.
The market qualities of ‘The Forge’ somewhat simplify our consideration of trade bottlenecks. Since this region totally eclipses all other regions in terms of trade volume while also maintaining a healthy trade balance, it represents a market where it is practically impossible to fully saturate demand or completely drain supply through interregional trade alone. We can thus safely rule out ‘The Forge’ as a potential bottleneck when determining shipment quantities, focusing in our case solely on what is demanded by secondary regions.
Strategy Summary
Despite these new considerations, with some modification we should be able to re-use the majority of our station trading script for the purpose of interregional trade. Some key changes will include:
- Adding a freight delta of 19 hours to all data modeling functions, accounting for the average amount of time it takes goods to arrive at their destination. This will allow the values we pull from our linear regressions to account for anticipated changes in the market over time while goods are under courier contracts.
- Simplifying our utilization of any data involving ‘The Forge’. Since we will be purchasing commodities directly from the market at currently-posted prices, all we need to do is identify prices for each type of commodity at the top and bottom of their respective spreads. There is otherwise no need to perform regression analysis or anything remotely complex with that region’s data.
- Weighing our report results entirely against the demand presented by the importing region. Because ‘The Forge’ is an extremely large, competitive, and efficient market, we can safely assume for the most part that it can supply any desired quantity of export goods demanded by any other region. As a result, we can interpret trade values such as historical price and volume trends for our importing regions without fear of being constrained by supply-side bottlenecks.
Great; let’s get started with some code!
Writing our Script
Since we’ll be primarily focusing on code modifications rather than the original code itself, I would strongly recommend reviewing the relevant section from the past article on station trading before proceeding.
Let’s start with our modules and global variables:
### modules
import sqlite3
import pandas as pd
import os
from datetime import date, datetime, timedelta, timezone
import numpy as np
import statistics
import time
import pytz
from sklearn.linear_model import LinearRegression
### define regions to report against and their associated freight costs as a percentage of 'The Forge' pricing
regionlist = [['Domain', .0393], ['Essence', .0129], ['Metropolis', .0185], ['SinqLaison', .0153], ['Heimatar', .0233], ['TashMurkon', .0409]]
### define database location where source data is stored
databaselocation = 'marketdata.sqlite'
### change working directory to script location
pathtorunfile = os.path.dirname(__file__)
os.chdir(pathtorunfile)
### specify in days the span of time used to build our linear regressions
HistoryRegressionSpan = 30
OrderbookRegressionSpan = 1
### specify maximum transaction delta (days). Type_ids will be filtered from our results if the time required to clear a Sell order exceeds this value
MaxTransactionDays = 7
### specify minimum ROI (%). Type_ids will be filtered from our results if ROI falls below this percentage
MinROI = 10
### specify minimum ISK profit per day. Type_ids will be filtered from our results if daily potential profit falls below this quantity
MinProfitPerDay = 1000000
### specify amount of time freight/shipments take to execute. by-default this is 19 hours (68400 seconds) per Red Frog Freight's documented statistics.
FreightDelta = 68400
This is largely identical to our station trading script, with some key changes including the addition of the freight cost ratios discovered in our last article, as well as the introduction of our new freight delta.
Next, all of the functions we’ve used to facilitate the calculation of linear regressions will remain unchanged, including ‘lrmodel’, ‘lrstaging’ and ‘lrstagingIncludeNull’. The same goes for our ‘miscstaging’ function, which handles feeding data into our modeling functions. In the future, it might be a fun exercise to transpose this functionality into R, but for now we’ll stick to what we know works.
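For readers who don’t want to dig back through the station trading article, the sketch below shows roughly what these helpers do, using the modules imported above. The names and call signatures mirror how they are used later in this script, but the bodies are simplified stand-ins (the real versions, among other things, distribute work across processes), so treat this as an illustration rather than the actual implementation:
### simplified stand-ins for the regression helpers reused from the station trading script
def lrmodel(timestamps, values):
    # fit a simple linear regression of a market metric against unix time
    X = np.array(timestamps).reshape(-1, 1)
    y = np.array(values).reshape(-1, 1)
    return LinearRegression().fit(X, y)
def lrstagingIncludeNull(HistoryData, field):
    # build one fitted model per type_id for the requested field
    rows = []
    for type_id, group in HistoryData.groupby('type_id'):
        rows.append([type_id, lrmodel(group['timestamp'], group[field])])
    return pd.DataFrame(rows, columns=['type_id', field])
def miscstaging(TypeID_list, regressions, modelfunc):
    # feed each type_id through a modeling function and collect any results it returns
    results = []
    for TypeID in TypeID_list:
        result = modelfunc(TypeID, regressions)
        if result != 'none':
            results.extend(result)
    return results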
Let’s jump now to the start of our runtime:
### runtime
if __name__ == '__main__':
### database initialization
## create in-memory database to temporarily store tables used for report generation
tempdb = sqlite3.connect(":memory:")
cur = tempdb.cursor()
## attach primary database (READ-ONLY)
cur.execute(f"ATTACH 'file:{databaselocation}' AS marketdatadb;")
### generate table of timestamps and corresponding dates indicated by HistoryRegressionSpan variable, which will be used later for regression analysis
## extract unix time from historical data for 'The Forge'
HistoryUnixTime = pd.read_sql(f"SELECT max(timestamp) as timestamp from TheForgeHistory;", tempdb)
HistoryUnixTime = int((HistoryUnixTime['timestamp'].to_list())[0])
## populate HistorySpanDates with dates indicated by HistoryRegressionSpan
cur.execute(f"CREATE TABLE TheForgeHistorySpanDates AS SELECT DISTINCT date FROM marketdatadb.TheForgeHistory ORDER BY date DESC LIMIT {HistoryRegressionSpan};")
## add unix timestamps to HistorySpanDates, which will later serve as the X axis for our linear regressions
HistorySpanDates = pd.read_sql(f"SELECT date FROM TheForgeHistorySpanDates", tempdb)
HistorySpanDates = list(HistorySpanDates.date)
DateTimestampList = []
DateTimestamp = HistoryUnixTime
for HistoryDate in HistorySpanDates:
DateTimestampList.append([HistoryDate, DateTimestamp])
DateTimestamp = DateTimestamp - 86400
DateTimestampList = pd.DataFrame(DateTimestampList, columns = ['date', 'timestamp'])
DateTimestampList.to_sql(f"HistorySpanDates", tempdb, if_exists="replace", index=True)
### prepare orderbook data for the region 'The Forge', as well as some static data
## extract unix time from order book data for 'The Forge'
OrderbookUnixTime = pd.read_sql(f"SELECT DISTINCT max(timestamp) AS maxtimestamp FROM TheForge;", tempdb)
OrderbookUnixTime = int((OrderbookUnixTime['maxtimestamp'].to_list())[0])
## extract current lowest sell orders from 'The Forge' order book
cur.execute(f"CREATE TABLE JitaMinSell AS SELECT type_id, min(price) AS price FROM TheForge WHERE is_buy_order = 0 AND timestamp = {OrderbookUnixTime} GROUP BY type_id;")
JitaMinSell = pd.read_sql("SELECT * FROM JitaMinSell", tempdb)
## extract current highest buy orders from 'The Forge' order book
cur.execute(f"CREATE TABLE JitaMaxBuy AS SELECT type_id, max(price) AS price FROM TheForge WHERE is_buy_order = 1 AND timestamp = {OrderbookUnixTime} GROUP BY type_id;")
JitaMaxBuy = pd.read_sql("SELECT * FROM JitaMaxBuy", tempdb)
## extract static data from InvTypes table, including English names and volume attribute for each in-game object.
TypeID_names = pd.read_sql("SELECT typeID as type_id, typeName AS name, volume as m3 FROM InvTypes;", tempdb)
In summary, these blocks handle database initialization, date/timestamp normalization, preparation of ‘The Forge’ orderbook data, and preparation of static data. While this is mostly adapted straight from our station trading script, there are some important differences:
- Since we are comparing all regions against ‘The Forge’, we only need to load data for ‘The Forge’ once. As a result, these blocks are now executed before our per-region for loop.
- The scope of data being loaded from ‘The Forge’ is now limited to the most recent date indicated by the historical data endpoint, as well as the maximum Buy and minimum Sell orders presented by the current order book.
- In addition to English type names, we are now including the m3 volume attribute from our InvTypes static data. This will be used later when generating our report output to provide the physical size of trade goods.
The values generated by these first blocks will be referenced in various ways within our per-region for loop, which we’ll dive into next:
### iterate over each region and generate a report
for region in regionlist:
### this block serves as a quick first pass filter, using orderbook data to eliminate type_ids with little/no direct profit potential before we move-on to regression analysis.
## extract unix time from order book data
OrderbookUnixTime = pd.read_sql(f"SELECT DISTINCT max(timestamp) AS maxtimestamp FROM {region[0]};", tempdb)
OrderbookUnixTime = int((OrderbookUnixTime['maxtimestamp'].to_list())[0])
## generate a table of all unique type_ids encountered at the current region across its full trade history, along with each type_id's earliest traded date (MinDate)
cur.execute(f"CREATE TABLE TypeIDMinDates AS SELECT type_id, min(date) AS MinDate FROM {region[0]}History GROUP BY type_id;")
## extract lowest sell orders from current region order book
cur.execute(f"CREATE TABLE MinSell AS SELECT type_id, min(price) AS price FROM {region[0]} WHERE is_buy_order = 0 AND timestamp = {OrderbookUnixTime} GROUP BY type_id;")
## left join the current lowest Sell prices onto TypeIDMinDates; type_ids without a current Sell order will carry NULL prices
cur.execute(f"CREATE TABLE TypeIDs AS SELECT * FROM TypeIDMinDates LEFT JOIN MinSell ON MinSell.type_id = TypeIDMinDates.type_id;")
## create a list of type_ids to filter. This includes all items which cannot presently be profitably traded between 'The Forge' and the indicated region given current order book data. This accounts for taxes, fees, freight costs, and minimum ROI. Type_ids without existing orders will be spared for now.
cur.execute(f"CREATE TABLE TypeIDsFiltered AS SELECT TypeIDs.type_id AS type_id FROM TypeIDs, JitaMinSell WHERE TypeIDs.type_id = JitaMinSell.type_id AND JitaMinSell.price * (1 + {float(MinROI/100) + region[1]} + 0.051) > TypeIDs.price;")
## update TypeIDs table, purging unprofitable items
cur.execute(f"DELETE FROM TypeIDs WHERE type_id IN (SELECT type_id FROM TypeIDsFiltered);")
This block is a new addition and serves as our first filtration stage, eliminating commodities featuring obviously nonexistent profit potential. For instance, if prices at ‘Domain’ for a specific commodity are currently lower than prices at ‘The Forge’, the item can be safely eliminated from all subsequent analysis. Filtering these items early in our process is ideal, since arithmetic operations using SQLite or Pandas are far less computationally expensive than performing linear regression. During my testing with MinROI set to 10(%), this cuts our list of commodities in half.
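As a worked example of the threshold used by the filter above, assuming the ‘Domain’ freight ratio from our region list, our 10% MinROI, and the 5.1% taxes/fees constant baked into the query:
# worked example of the first-pass profitability threshold for 'Domain'
jita_min_sell = 100.0      # lowest Sell price at 'The Forge' for some commodity (ISK)
freight_ratio = 0.0393     # Domain freight cost as a fraction of 'The Forge' pricing
taxes_and_fees = 0.051     # taxes/fees constant used by the query
min_roi = 0.10             # MinROI of 10%
threshold = jita_min_sell * (1 + min_roi + freight_ratio + taxes_and_fees)
print(threshold)           # ~119.03 -> Domain's lowest Sell price must exceed this to survive the filter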
Next, we move into our data normalization block:
### continue data normalization procedure before getting into regression analysis
## generate a cartesian product of all unique type_ids and their span of dates greater than MinDate. Dates less than MinDate are omitted to remove bias against items which have been very recently introduced into the game. Likewise, since we captured MinDate for all history, we're preventing bias favoring items which have been in the game for a long time and were traded recently, but which are otherwise seldomly traded.
cur.execute(f"CREATE TABLE TypeIDAllDates AS SELECT type_id, MinDate, date, timestamp FROM TypeIDs CROSS JOIN HistorySpanDates WHERE HistorySpanDates.date >= TypeIDs.MinDate;")
## populate HistorySpan with data corresponding with the dates indicated by HistorySpanDates
cur.execute(f"CREATE TABLE HistorySpan AS SELECT average, {region[0]}History.date AS date, highest, lowest, volume, order_count, type_id FROM {region[0]}History, HistorySpanDates WHERE {region[0]}History.date = HistorySpanDates.date;")
## left join our HistorySpan table to the TypeIDdates cartesian product. Dates greater than MinDate containing no data will carry null values for now.
cur.execute(f"CREATE TABLE HistorySpanAllDates AS SELECT TypeIDAllDates.type_id AS type_id, TypeIDAllDates.date AS date, TypeIDAllDates.timestamp AS timestamp, average, highest, lowest, volume, order_count, MinDate FROM TypeIDAllDates LEFT JOIN HistorySpan ON TypeIDAllDates.type_id = HistorySpan.type_id AND TypeIDAllDates.date = HistorySpan.date ORDER BY TypeIDAllDates.type_id;")
## extrapolate BuyVolume, BuyOrders, SellVolume and SellOrders fields by checking the relative distance between our highest/average/lowest prices. We cannot infer the direction of bias in circumstances where highest = lowest, so in these circumstances volume is counted as BiasedVolume
cur.executescript(f"DROP TABLE IF EXISTS HistorySpan; ALTER TABLE HistorySpanAllDates RENAME TO HistorySpan; ALTER TABLE HistorySpan ADD COLUMN BiasedVolume; ALTER TABLE HistorySpan ADD COLUMN BuyVolume; ALTER TABLE HistorySpan ADD COLUMN SellVolume; ALTER TABLE HistorySpan ADD COLUMN BuyOrders; ALTER TABLE HistorySpan ADD COLUMN SellOrders;")
cur.execute(f"UPDATE HistorySpan SET BuyVolume = volume * (highest - average) / (highest - lowest), BuyOrders = order_count * (highest - average) / (highest- lowest), SellVolume = volume * (average - lowest) / (highest - lowest), SellOrders = order_count * (average - lowest) / (highest - lowest) WHERE (highest > average OR average > lowest);")
cur.execute(f"UPDATE HistorySpan SET BiasedVolume = volume WHERE highest = lowest;")
## infill any volume-related NULL values with 0
cur.executescript(f"UPDATE HistorySpan SET BiasedVolume = 0 WHERE BiasedVolume IS NULL; UPDATE HistorySpan SET BuyVolume = 0 WHERE BuyVolume IS NULL; UPDATE HistorySpan SET SellVolume = 0 WHERE SellVolume IS NULL; UPDATE HistorySpan SET BuyOrders = 0 WHERE BuyOrders IS NULL; UPDATE HistorySpan SET SellOrders = 0 WHERE SellOrders IS NULL; UPDATE HistorySpan SET order_count = 0 WHERE order_count IS NULL;")
## create HistorySpanSum table containing sums for all of our volume types
cur.execute(f"CREATE TABLE HistorySpanSum AS SELECT type_id, sum(BiasedVolume) AS BiasedVolumeSum, sum(BuyVolume) AS BuyVolumeSum, sum(SellVolume) AS SellVolumeSum FROM HistorySpan GROUP BY type_id;")
## purge any type_ids which are 100% biased toward Buy orders (this trade strategy only involves interacting with Sell orders)
cur.execute(f"DELETE FROM HistorySpanSum WHERE SellVolumeSum = 0;")
## calculate the ratio between buying and selling activity for all days where bias can be inferred.
cur.executescript(f"ALTER TABLE HistorySpanSum ADD COLUMN BuySellRatio; UPDATE HistorySpanSum SET BuySellRatio = BuyVolumeSum / (BuyVolumeSum + SellVolumeSum);")
## use BuySellRatio to extrapolate volume and order count for all days where bias could not be inferred. The result is our normalized set of historical market data
cur.execute(f"CREATE TABLE HistorySpanNrml AS SELECT BuySellRatio, HistorySpanSum.type_id AS type_id, date, timestamp, average, highest, lowest, volume, order_count, Mindate, BiasedVolume, BuyVolume, SellVolume, BuyOrders, SellOrders FROM HistorySpanSum LEFT JOIN HistorySpan ON HistorySpanSum.type_id = HistorySpan.type_id;")
cur.executescript(f"DROP TABLE HistorySpan; DROP TABLE HistorySpanSum;")
cur.executescript(f"UPDATE HistorySpanNrml SET BuyVolume = BuySellRatio * BiasedVolume WHERE BiasedVolume > 0; UPDATE HistorySpanNrml SET BuyOrders = BuySellRatio * (BiasedVolume / order_count) WHERE BiasedVolume > 0; UPDATE HistorySpanNrml SET SellVolume = (1 - BuySellRatio) * BiasedVolume WHERE BiasedVolume > 0; UPDATE HistorySpanNrml SET SellOrders = (1 - BuySellRatio) * (BiasedVolume / order_count) WHERE BiasedVolume > 0;")
This block is largely identical to our station trading data normalization routine, with a key difference involving the handling of Buy vs Sell order bias. Since interregional trade strictly involves clearing and placing Sell orders, we only want to eliminate commodities from our analysis which appear to feature zero Sell order activity. This block otherwise provides a set of normalized historical endpoint data for the indicated region which will be used for subsequent data modeling.
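As a quick numeric illustration of the bias-splitting arithmetic performed by the UPDATE statements above (the values below are made up):
# made-up daily history values for a single type_id
highest, lowest, average, volume = 110.0, 90.0, 105.0, 1000
# mirrors the UPDATE statement above: volume is apportioned according to where
# the daily average sits between the daily low and high
BuyVolume = volume * (highest - average) / (highest - lowest)    # 250.0
SellVolume = volume * (average - lowest) / (highest - lowest)    # 750.0
print(BuyVolume, SellVolume)
# when highest == lowest the direction cannot be inferred, so the day's volume is
# parked in BiasedVolume and later redistributed using BuySellRatio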
The next runtime block will be presented along with its associated data modeling function:
...
## calculate and return Sell order deltas
def sdeltamodel(TypeID, InputData):
# calculating the expected frequency of Sell orders, including freight time per RFF's posted statistics
SellOrderFreq = InputData.loc[InputData["type_id"] == TypeID, ['SellOrders']].values[0][0].predict([[OrderbookUnixTime + FreightDelta]])[0][0]
# If the frequency is greater than zero, dividing 1 by the frequency will give us the current number of days it will take for a Sell order to be executed, converted into a unixtime delta by multiplying by the number of seconds in a day.
if SellOrderFreq > 0:
SellOrderDelta = (1 / (SellOrderFreq)) * 86400
# Since order frequency changes over-time according to the slope of the regression, we also need to take this change into account when calculating the SellOrder delta. We can do this by taking the average between our frequency now vs the expected frequency in the future. (This isn't a perfect mechanism, but provides a better approximation of our anticipated sell order delta than simply using the result from the current timestamp and freight delta)
FutureSellOrderFreq = InputData.loc[InputData["type_id"] == TypeID, ['SellOrders']].values[0][0].predict([[OrderbookUnixTime + SellOrderDelta + FreightDelta]])[0][0]
else:
return('none')
if FutureSellOrderFreq > 0:
SellOrderDelta = (2 / (SellOrderFreq + FutureSellOrderFreq)) * 86400
# If SellOrderDelta is within our specified MaxTransactionDays, it will be returned to the parent thread.
if SellOrderDelta < MaxTransactionDays * 86400:
return[[TypeID, SellOrderDelta]]
return('none')
...
## runtime
...
### second pass: run regression analysis to eliminate type_ids whose projected Sell order clearing time (evaluated after the freight delta) exceeds MaxTransactionDays.
## save current set of unique type_ids as a dataframe and list from HistorySpanNrml
TypeID_df = pd.read_sql("SELECT DISTINCT type_id FROM HistorySpanNrml", tempdb)
TypeID_list = list(TypeID_df.type_id)
## save all of our normalized data as a HistoryData dataframe from HistorySpanNrml
HistoryData = pd.read_sql("SELECT * FROM HistorySpanNrml;", tempdb)
## calculate regression slopes for SellOrders, save results to regressions dataframe
regressions = lrstagingIncludeNull(HistoryData, 'SellOrders')
## feed TypeID_list and regressions data into the Sell order delta model for data processing
ResultData = pd.DataFrame((miscstaging(TypeID_list, regressions, sdeltamodel)), columns=['type_id', 'SellOrderDelta'])
# skip to the next region and export a null result if there is no data left to process
if ResultData.empty:
sqlitecleanup()
ResultData = nullresult()
resultexport(ResultData)
continue
## refresh TypeID list and purge unwanted TypeIDs from HistoryData and regressions
TypeID_df = pd.DataFrame(ResultData.type_id.unique(), columns=['type_id'])
TypeID_list = list(TypeID_df.type_id)
HistoryData = pd.merge(HistoryData, TypeID_df, how="right", on=["type_id", "type_id"])
regressions = pd.merge(regressions, TypeID_df, how="right", on=["type_id", "type_id"])
This block uses the sdeltamodel function to identify and filter items which do not trade frequently enough to satisfy our defined “MaxTransactionDays” value, which is currently set to 7 days.
Our runtime block is largely identical to the same section featured in the station trading script, with a key difference being that we are now only calculating regression slopes for SellOrders, with BuyOrders being omitted entirely.
These regression slopes are then fed into sdeltamodel, which is directly adapted from the bsdeltamodel function featured in the station trading script. This modified function now omits any reference to Buy order frequency, which has effectively been replaced by our static FreightDelta figure. Since our interregional trading strategy involves purchasing commodities directly from existing Sell orders, there is no need to calculate Buy order deltas at all.
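To make the arithmetic inside sdeltamodel concrete, here is a short worked example using made-up regression outputs:
# made-up regression outputs illustrating the delta arithmetic in sdeltamodel
FreightDelta = 68400        # 19 hours, in seconds
SellOrderFreq = 0.5         # predicted Sell orders per day at (now + FreightDelta)
SellOrderDelta = (1 / SellOrderFreq) * 86400                      # first pass: ~2 days, in seconds
FutureSellOrderFreq = 0.4   # predicted frequency at (now + SellOrderDelta + FreightDelta)
SellOrderDelta = (2 / (SellOrderFreq + FutureSellOrderFreq)) * 86400
print(SellOrderDelta / 86400)   # ~2.22 days, comfortably inside MaxTransactionDays = 7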
This pattern of removing BuyOrders variables and instead inserting our freight delta continues as we work through the following four runtime blocks and their associated data modeling functions, which include the modeling of historical price, potential profit, ‘issued’ time (competition), and orderbook price. For those interested in reviewing these in more detail, a link to the full script will be provided towards the end of the article.
Once all of these blocks are finished with their regression analysis, modeling, and filtering, we are left with a ResultData pandas dataframe containing the subset of commodities possessing the greatest profit potential when imported at the indicated region. At this stage, we now need to appropriately format the data and prepare it for export:
...
### perform some data formatting, injection, and arithmetic before export
# attach English item names and m3 from InvTypes
ResultData = pd.merge(ResultData, TypeID_names, how="inner", on=["type_id", "type_id"])
ResultData = pd.merge(ResultData, JitaMaxBuy, how="inner", on=["type_id", "type_id"])
ResultData = ResultData.rename(columns={'price': 'MaxBuyPrice', 'PotentialDailyProfit': 'PDP', 'SellRelistCount': 'RelistCt'})
# misc arithmetic for end-user consumption.
ResultData['TotalDelta'] = (FreightDelta + ResultData['SellOrderDelta']) / 60
ResultData['m3PerOrder'] = ResultData['m3'] * ResultData['VolPerOrder']
ResultData['pctBuyback'] = ResultData['MaxBuyPrice'] * 100 / ResultData['BuyPrice']
# reorder columns and sort rows by potential daily profit
ResultData = ResultData[['name', 'type_id', 'PDP', 'pctROI', 'pctBuyback', 'BuyPrice', 'SellPrice', 'VolPerOrder', 'm3PerOrder', 'TotalDelta', 'RelistCt']]
ResultData = ResultData.sort_values(by='PDP', ascending=False)
# convert all types to integer
ResultData['PDP'] = np.floor(pd.to_numeric(ResultData['PDP'], errors='coerce')).astype('Int64')
ResultData['pctROI'] = np.floor(pd.to_numeric(ResultData['pctROI'], errors='coerce')).astype('Int64')
ResultData['pctBuyback'] = np.floor(pd.to_numeric(ResultData['pctBuyback'], errors='coerce')).astype('Int64')
ResultData['BuyPrice'] = np.floor(pd.to_numeric(ResultData['BuyPrice'], errors='coerce')).astype('Int64')
ResultData['SellPrice'] = np.floor(pd.to_numeric(ResultData['SellPrice'], errors='coerce')).astype('Int64')
ResultData['VolPerOrder'] = np.floor(pd.to_numeric(ResultData['VolPerOrder'], errors='coerce')).astype('Int64')
ResultData['m3PerOrder'] = np.floor(pd.to_numeric(ResultData['m3PerOrder'], errors='coerce')).astype('Int64')
ResultData['TotalDelta'] = np.floor(pd.to_numeric(ResultData['TotalDelta'], errors='coerce')).astype('Int64')
ResultData['RelistCt'] = np.floor(pd.to_numeric(ResultData['RelistCt'], errors='coerce')).astype('Int64')
resultexport(ResultData)
### cleanup temp sqlite db before moving-on to next region
sqlitecleanup()
Some changes and additions from our station trading script include:
- PDP, which is simply an initialism for Potential Daily Profit
- pctBuyback, which is the percentage of the lowest Sell order price represented by the highest Buy order price at ‘The Forge’. This provides traders with a gauge of market risk, as it represents the percentage of the original purchase price a player can expect to receive when selling the item back to buyers at ‘The Forge’ if the item somehow fails to sell abroad. Percentages closer to 100 indicate fairer deals, while percentages much lower than this indicate that caution should be exercised before buying, especially for expensive commodities with poor ROI.
- *This can occasionally report higher than 100% in circumstances where orders posted in different systems within the same region cross in their listed prices (a Buy order in one system priced above a Sell order in another)
- m3PerOrder is the total volume (in cubic meters) taken up by the proposed quantity of goods. This is useful for traders who haul their own goods and face cargo bay constraints limiting how much they can move at once.
The resulting dataframe is then sent to our resultexport() function, allowing the output for each region to either be exported to .csv, or to another database for subsequent publishing.
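The helper functions referenced throughout the runtime (resultexport, nullresult and sqlitecleanup) were omitted from the excerpts above for brevity. The sketch below shows plausible minimal stand-ins for readers following along; the published script’s implementations may differ, for example by writing to a publishing database rather than a .csv:
### plausible minimal stand-ins for the omitted helper functions
def resultexport(ResultData):
    # write the current region's report to a .csv named after it (relies on the
    # loop's 'region' variable); the published script may instead push results
    # to a database for hourly publishing
    ResultData.to_csv(f"{region[0]}_imports.csv", index=False)
def nullresult():
    # empty report with the expected column layout, used when nothing survives filtering
    return pd.DataFrame(columns=['name', 'type_id', 'PDP', 'pctROI', 'pctBuyback', 'BuyPrice', 'SellPrice', 'VolPerOrder', 'm3PerOrder', 'TotalDelta', 'RelistCt'])
def sqlitecleanup():
    # drop the per-region temporary tables so the next loop iteration starts clean
    cur.executescript("DROP TABLE IF EXISTS TypeIDMinDates; DROP TABLE IF EXISTS MinSell; DROP TABLE IF EXISTS TypeIDs; DROP TABLE IF EXISTS TypeIDsFiltered; DROP TABLE IF EXISTS TypeIDAllDates; DROP TABLE IF EXISTS HistorySpanNrml;")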
The complete script is available for download here, with ‘Imports’ reports for each region now published and refreshed hourly on our Projects page.
Conclusion
By combining the macroeconomic lessons learned from our past article with the advanced techniques developed for our station trading script, we were able to rapidly deploy a new report through the reuse of existing code. While there are certainly many refactoring avenues that could improve readability and execution speed, for now we have a fully functional product which achieves our goal.
Our next logical step would be in-game testing. Fortunately, much of this legwork was already performed when writing the station trading report. The linear regression and data modeling strategies which resulted from that testing were easily translated into our new report after accounting for the nuances of interregional trade. This again demonstrates the value of reusing known and tested code; while there are circumstances where rewriting a solution can be justified, it is often better to reuse something we know works.
While I have ideas for subsequent articles which I may revisit in the future, for now this marks the tentative end of my series on Virtual Markets. If you’ve stuck around this far, I would like to thank you for your perseverance!
I’m always open to suggestions or ideas for future projects; feel free to reach out through the Contact page.
Thank you for reading!