Pyth Liquidity Oracles V1: Goodbye Uncertainty, Hello Precision

Pyth Network
Feb 21, 2023

New Year, New Oracle

We’ve blogged in the past about how lending protocols could benefit from a liquidity oracle to build countermeasures against illiquidity risks. We’re excited to springboard those ideas into V1 of the Pyth Liquidity Oracle through a collaboration with one of our data providers, Kaiko!

This product should help consumer protocols mitigate the risk of excessively large positions being opened in illiquid tokens, which should reduce the likelihood of an illiquidity-centered exploit and bolster confidence in permissionless DeFi protocols more broadly.

V1 of the Liquidity Oracle will consist of:

  • Kaiko providing market impact estimates for tokens off-chain. See here for details on how to source Kaiko’s estimates.
  • Consumer protocols incorporating that data into on-chain risk parameters in their protocol logic.
  • Pyth providing consumer protocols an easy way to translate market impact-based risk parameters into adjusted valuation prices for large positions. See here for Pyth’s Rust SDK that introduces methodology to adjust prices based on liquidity in the market.

Liquidity information is a measure of the market impact of buying or selling a token. The market impact of a buy/sell is the difference between the original price before the buy/sell and the price after. If the liquidity is low relative to the amount bought/sold, the market impact will be high, since any significant market buy/sell will sweep the orderbook and cause liquidity providers to adjust their prices. Conversely, if the liquidity is high, then the market impact of the buy/sell will be low.

Protocols such as lending and money market protocols should use liquidity information to mitigate the risk of excessively large positions being opened in illiquid tokens. We describe some of the relevant illiquidity risks in more detail below. Currently, most protocols lack safeguards against these types of risks, so any solution that addresses them is an improvement over the status quo.

Despite its importance, a liquidity oracle is currently more difficult to engineer than a traditional price oracle. Notably, liquidity oracles face challenges ranging from massive data volumes to complex aggregation problems. Thus, we decided to develop the simplest version of a liquidity oracle — one that provides off-chain liquidity estimates for protocols to incorporate on-chain — to enable conservative and safe risk measures against illiquidity scenarios.

In this post, we describe the motivation and design of the Liquidity Oracle, provide detail on how Kaiko sources the data, and walk through some ways consumers could use these estimates to create risk parameters.

Motivation: an autumn of liquidity-based attacks

Recently, multiple protocols have suffered from liquidity-based attacks.

Mango Markets

In October, a group led by Avraham Eisenberg exploited the Mango Markets perpetual futures protocol. The group first took on a large position in the MNGO perpetual futures contract. Next, it bought up millions of dollars’ worth of MNGO on FTX and AscendEX, the two main sources of liquidity for the MNGO token, because it inferred that most data providers for oracles reporting the price of MNGO sourced their information from those two exchanges. By buying up a tremendous amount of MNGO on these exchanges, the group caused oracles to temporarily report higher prices for the token. As a result, the reported value of their MNGO perp position on the Mango protocol grew by multiples, and they were then able to use it as collateral to borrow and withdraw a large amount of blue-chip tokens. The true value of these borrowed tokens greatly outweighed the cost of putting on the perp position and of buying up MNGO on the centralized exchanges, and so the team managed to extract around $100 million notional.

Notably, though the oracles pricing MNGO saw their quotes shoot up over a short period of time, this was not an oracle exploit. The price oracles did their job: reporting the live and accurate price of MNGO on its most liquid exchanges. What was missing was appropriate risk measures taken by the protocol to prevent its users from being able to deposit too much of an illiquid token that could see wild price swings — deliberate or unintentional — on trading venues.

What was missing: a limit on collateral deposits or collateral price movement at the protocol level for illiquid tokens.

Aave Curve Attack

Eisenberg followed the Mango attack with an attempted exploit of Aave. He targeted weaknesses in the Aave protocol by depositing $63 million worth of USDC and borrowing $43 million worth of CRV. The borrowed CRV was reportedly sent to OKX, ostensibly to short the token and drive down its price. However, the price of CRV actually spiked during this time, which led to Eisenberg’s vault being liquidated.

The liquidation process, which required bots to repay Eisenberg’s CRV loan by selling USDC in exchange for CRV, took over 45 minutes to complete due to a lack of CRV liquidity on Ethereum DEXs. With so little liquidity available, liquidators were unable to liquidate the vault in full as the CRV price went up. This led to the accumulation of bad debt and the vault becoming insolvent. At the end of the incident, Aave was short 2.64 million CRV tokens, worth over $1.5 million, in this vault. If there had been more CRV liquidity on Ethereum DEXs, it is possible that liquidations could have been processed more efficiently and Aave would not have suffered any loss. Given the actual lack of CRV liquidity, however, allowing such a large borrow position to be taken out put the protocol at risk of bad debt.

What was missing: a way to limit short-squeeze potential, accounting for token liquidity.

The two examples outlined here are by no means the only liquidity-related attacks and failures. However, they do a good job of showcasing what is missing in risk mitigation tools. Both forms of risk pose a threat to the protocol and its users. Both have the potential to leave the protocol holding a bag of bad debt. Both can be realized through malicious intent or unintentional mismanagement.

Protocols need to manage illiquidity risk, but…

… currently they’re not well-equipped to do so. Failing to manage illiquidity risk can cause a protocol to become insolvent, since an inability to handle a large illiquid token position can result in bad debt. Very little information exists on-chain about the liquidity of various tokens, and for protocols already focusing on other challenging problems, having to source and process liquidity data is another hurdle to overcome.

With some difficulty, protocols could source liquidity information from DEX state on-chain. Besides this involving bespoke parsing logic per protocol, for many tokens, DEX liquidity and volume are far lower than on CEXs. Bridging this off-chain information onto the blockchain is a natural fit for an existing oracle solution.

Design of Liquidity Oracle v1

Just as a price oracle abstracts away the price reporting problem from end consumers like lending and trading protocols, a liquidity oracle should do the same for the illiquidity risk problem. What is needed is a high-quality source of liquidity data and a pipeline to help consumers use that data to appropriately value positions.

In that spirit, Pyth is launching an initial version of its Liquidity Oracle in partnership with Kaiko, one of the network’s data providers.

Kaiko is providing market depth information via their Web2 API. Protocols can analyze this information to compute appropriate liquidity estimates, which they can set as risk parameters in their protocol via governance.

Pyth is providing on-chain code in its SDK that uses these liquidity estimates along with Pyth prices to properly value positions.

In addition to using these out-of-the-box valuation techniques, protocols can explore their own methods best suited to their use cases.

Importantly, we would like liquidity oracle data to meet three criteria:

  1. The liquidity estimates provided are treated as advisory, not necessarily as ground truth.
  2. Manual updating on the protocol side doesn’t need to happen too often.
  3. Data provided is used conservatively to stop potentially bad actions from taking place but doesn’t lead the protocol to make catastrophic decisions/contribute to an exploit.

Why off-chain?

There are two primary reasons our v1 Liquidity Oracle consists of off-chain provision of estimates as opposed to an on-chain stream:

  1. Liquidity is a more difficult concept to define than price and requires a degree of subjective judgment. This is because liquidity estimates attempt to predict how much market impact will occur in the future, even though that information is not empirically determinable in the present as resting orders can be removed by the time execution takes place. Moreover, there is greater uncertainty around how liquidity information should be used, in contrast to how price data is relatively well-understood in DeFi. Thus, we wanted to make room for human judgment. Off-chain data provision gives protocols more license to determine how to use the liquidity information.
  2. Because market depth and liquidity estimates generally do not change in terms of order of magnitude very quickly, and because protection against illiquidity risk relies more on the general level of liquidity rather than exact estimates, consumer protocols will likely not have to fiddle with this number too often. This means protocols will not need to update their risk parameters on chain at a frequency that would constitute poor UX.

Later versions may involve making the liquidity oracle available fully on-chain through a version of Pyth’s on-demand “pull” model.

What do the liquidity estimates look like?

Liquidity is a distribution, in the sense that a token’s liquidity is technically defined by the quantity available to be bought/sold at every possible price. Since the goal of a liquidity oracle is to compactly express this distribution to consumers, for our purposes we can approximately convey it using a few key summary statistics. Liquidity Oracle V1 does so via Kaiko publishing summary statistics of the following form:
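Schematically, the statistics are a family of depth estimates indexed by a signed price offset x (the exact grid of offsets X is determined by Kaiko’s published ranges; the bounds shown here reflect the 10% depth windows described in the methodology section):

```latex
\{\ell_x\}_{x \in X}, \qquad X \subset [-10\%,\; +10\%]
```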

where ℓₓ is the quantity of tokens available to be bought/sold (note this is denominated in number of tokens, not dollar denominated) between the current price p and (1 + x) ⋅ p. As will be seen below, downward liquidity numbers (i.e. bid-side liquidity) express how much of the token could conceivably be sold in the open markets without making excessive market impact. The same is true with upward liquidity numbers (i.e. ask-side liquidity) with respect to how much of the token could conceivably be bought in the open markets without making excessive market impact.

Kaiko provides averages of {ℓₓ} over different time periods, e.g. 1 hr, 1 day, 1 week. This allows protocols to decide how regularly they want to monitor liquidity information and potentially update their risk measures. A very active protocol for example may wish to keep track of hourly averages and make more updates to their risk measures. Meanwhile, a less sensitive protocol may be happy to track longer averages on the order of days or weeks and update much more infrequently, in order to cut down on the logistical and gas costs of updates. Protocols could also choose to incorporate averages over different timeframes into their risk measures for robustness or other reasons.

Note that the liquidity estimates are denominated in terms of number of tokens, not in dollars. From a worst-case perspective, representing the liquidity estimates in terms of number of tokens is preferable to denominating in dollars: in the case of a crash, both the token’s price and its liquidity in terms of number of tokens available could fall. If the oracle were to denominate the liquidity estimates in terms of dollars, it would be vulnerable to a double whammy decrease in liquidity from the posted estimates. By denominating in tokens and allowing the user to incorporate the updated price, the oracle at least addresses one of the sources of potentially decreasing liquidity.
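A toy numerical illustration of this “double whammy” (all figures are hypothetical):

```python
# Hypothetical crash scenario: the token price halves AND 40% of the
# token-denominated depth is pulled from the books at the same time.
price_before, price_after = 10.0, 5.0
depth_tokens_before, depth_tokens_after = 1_000_000, 600_000

# A dollar-denominated estimate suffers both declines at once:
usd_depth_before = depth_tokens_before * price_before  # $10,000,000
usd_depth_after = depth_tokens_after * price_after     # $3,000,000, i.e. -70%

# A token-denominated estimate multiplied by the live oracle price picks up
# the price leg automatically; only the -40% change in token depth remains
# unaccounted for between estimate updates.
stale_estimate_usd = depth_tokens_before * price_after  # $5,000,000
```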

How Kaiko sources the data

Kaiko sources order book data from every active instrument listed on nearly 100 centralized exchanges. This encompasses the vast majority of crypto market activity, which is overwhelmingly concentrated on a handful of exchanges such as Binance, Coinbase, and OKX.

Kaiko collects two order book snapshots per minute for each instrument. For example, Binance lists 1,350 active markets; Kaiko takes two order book snapshots per minute for each one and automatically starts collection for newly added instruments.

All order book snapshots include bids and asks placed within 10% of the mid price. Thus, 10% market depth comprises the total sum of bids and asks placed within 10% of the mid price. The same methodology applies to price ranges between 0.01% and 10%. All bids and asks are denominated in the base unit of the trading pair.
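A minimal sketch of the depth computation described above (the data structures and names are our own, not Kaiko’s):

```python
def market_depth(bids, asks, mid_price, pct=0.10):
    """Sum base-unit quantities of bids and asks within +/- pct of the mid.

    bids/asks are lists of (price, quantity) tuples; quantities are in the
    base unit of the trading pair, matching the convention described above.
    """
    lo, hi = mid_price * (1 - pct), mid_price * (1 + pct)
    bid_depth = sum(qty for price, qty in bids if price >= lo)
    ask_depth = sum(qty for price, qty in asks if price <= hi)
    return bid_depth + ask_depth

# Example: with a mid price of 100, only orders priced in [90, 110] count.
bids = [(99.0, 5.0), (95.0, 3.0), (85.0, 2.0)]   # the (85, 2) bid is outside
asks = [(101.0, 4.0), (109.0, 1.0), (120.0, 7.0)]  # the (120, 7) ask is outside
depth = market_depth(bids, asks, mid_price=100.0)  # 8 + 5 = 13 tokens

# Aggregating across markets (e.g. ETH-USD, ETH-USDT, ...) is then just a
# sum of per-market depths, as described below.
```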

Kaiko’s order book data is then averaged across time intervals ranging from 1 minute to 1 day. For example, daily 10% depth is the average market depth across all order book snapshots taken during that day.

Kaiko’s instrument-level order book data can also be aggregated across multiple markets. For example, ETH market depth could take the sum of all bids/asks on ETH-USD, ETH-USDT, ETH-USDC, and ETH-BUSD order books. This would give a market-wide measure for ETH liquidity.

One thing to note is that not every exchange enables full order book collection, so for some exchanges the depth collected will not encompass the full 10% range. In such cases, the measure of market depth is the maximum amount available within the 10% range.

To interact with Kaiko’s API to source these estimates, we refer you to this demo repo that contains Python code to get liquidity estimates and create helpful visualizations.

How consumers could build risk measures

Consumer protocols could use the provided estimates to create risk parameters, take emergency action, etc. We consider the case of a lending protocol that wishes to protect against issues with illiquid tokens.

There are three immediate ways that lending protocols can limit their risk surface:

1. Limit the number of tokens that can be deposited as collateral in the protocol — the limit can be set to the number of tokens that relate to a certain market depth or that, if sold, would induce a certain slippage.

This could protect a protocol from being stuck with bad debt in the case that the amount to be liquidated exceeded the buy-side liquidity on open markets. For limiting the number of tokens that can be deposited as collateral, the limit could be set in coordination with the liquidation penalty of the protocol. If the liquidation penalty were 5%, for instance, then the protocol might want to limit the number of tokens that can be deposited to be no more than the 5% market depth amount. Different protocols could adopt different market depth limits in line with their desired level of conservativeness.
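As a sketch (the function and parameter names here are ours, not part of the oracle), a protocol could set its deposit cap to the bid-side depth at the offset matching its liquidation penalty:

```python
def deposit_cap(depth_by_offset, liquidation_penalty):
    """Pick the bid-side depth whose price offset matches the penalty.

    depth_by_offset maps a downward price offset (e.g. 0.05 for 5%) to the
    token-denominated bid-side depth within that range. We conservatively
    use the largest offset not exceeding the liquidation penalty.
    """
    eligible = [x for x in depth_by_offset if x <= liquidation_penalty]
    if not eligible:
        raise ValueError("no depth estimate at or below the penalty offset")
    return depth_by_offset[max(eligible)]

# With a 5% liquidation penalty, the cap is the 5% bid-side depth
# (hypothetical numbers):
depths = {0.01: 40_000, 0.02: 90_000, 0.05: 250_000, 0.10: 600_000}
cap = deposit_cap(depths, 0.05)  # 250_000 tokens
```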

2. Set the price at which the collateral is valued in line with slippage of the total deposited amount — instead of being valued at the oracle price, collateral tokens are valued at a discounted price. The discount increases as more of the collateral token is deposited or borrowed.

We propose the valuation price of the collateral to be a function v dependent on pᵢ, the initial discounted collateral valuation price (when there are 0 deposits), pᵥ, the final discounted collateral valuation price (possibly set in line with the liquidation penalty), D, the maximum quantity of tokens allowed to be deposited as collateral in the protocol, and d, the current quantity of tokens deposited as collateral in the protocol. One possible v is given below:
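In the notation just defined, one possible v is the linear interpolation:

```latex
v(d) = p_i + (p_v - p_i)\,\frac{d}{D}
```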

This is just a linear interpolation between the points (d = 0, pᵢ) and (d = D, pᵥ), i.e. between the initial collateral valuation price at 0 deposits and the collateral valuation price at maximum deposits. D, the maximum number of tokens that can be deposited into the protocol, is itself determined from the liquidity information. pᵢ and pᵥ are functions of the current oracle price pₜ and define the initial and final discounts: pᵢ = pₜ ⋅ xᵢ and pᵥ = pₜ ⋅ xᵥ, with xᵢ, xᵥ ≤ 1. To stay in line with the linear interpolation model, we would advise setting the final discount rate xᵥ to the ratio of the price point that has liquidity for D tokens to the current oracle price, and setting the initial discount rate xᵢ ∈ [xᵥ, 1].

This approach allows the collateral to be appropriately valued as it grows in size across the protocol, rather than the protocol incorrectly assuming the collateral could be liquidated entirely at the oracle price. However, this design could create poor UX if borrowers were not aware that the valuation price of their collateral varied with supply and unwittingly got liquidated. That said, keeping xᵥ relatively close to 1 keeps the maximal drift small, so a user would only get liquidated if they were already dangerously close to the liquidation point. Alternatively, for a simpler and more interpretable formula, a protocol could always value collateral at pᵥ, which would not vary with the amount of collateral deposited and would constitute a conservative maximum discounted price. This would sacrifice some capital efficiency but possibly improve the UX.
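A minimal Python sketch of this valuation rule (the function and variable names are ours, not the SDK’s):

```python
def valuation_price(oracle_price, d, D, x_i, x_v):
    """Discounted collateral valuation, linear in current deposits d.

    x_v <= x_i <= 1 are the final and initial discount rates; the valuation
    interpolates from p_i = oracle_price * x_i at d = 0 down to
    p_v = oracle_price * x_v at the deposit cap d = D.
    """
    assert 0 <= d <= D and x_v <= x_i <= 1
    p_i, p_v = oracle_price * x_i, oracle_price * x_v
    return p_i + (p_v - p_i) * d / D

# Hypothetical example: oracle price $2.00, no initial discount, 5% final
# discount, deposit cap of 1M tokens. At half the cap, collateral is valued
# at about $1.95 per token.
v_half = valuation_price(2.00, 500_000, 1_000_000, 1.0, 0.95)
```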

The implementation of this method for valuing collateral is found here, and an analogous method for valuing borrow positions is found here.

3. Limit the number of tokens that can be borrowed from the protocol — the limit can be set to the number of tokens that relate to a certain market depth on the ask side.

This risk measure would effectively prevent a short squeeze à la Eisenberg’s attempt on Aave, by preventing a user from taking out a huge loan in an illiquid token. In practice, the limit on collateral deposits should likely prevent any short squeeze in the first place; in theory, however, if the bid-side liquidity is much larger than the ask-side liquidity, a short squeeze attack could still be possible. This is because the bid-side liquidity determines how much collateral a liquidator can profitably take on and liquidate, while the ask-side liquidity determines how much of a borrowed token a liquidator can profitably purchase to make the repayment. The limit here could be set in line with the risk tolerance and conservativeness of the protocol.
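A sketch of the borrow-side measure and the bid/ask asymmetry check (again with illustrative names and thresholds of our own choosing):

```python
def borrow_cap(ask_depth, risk_fraction=1.0):
    """Cap borrows at (a fraction of) the ask-side depth at the chosen
    offset, per the protocol's risk tolerance."""
    return ask_depth * risk_fraction

def squeeze_risk(bid_depth, ask_depth, max_ratio=3.0):
    """Flag markets where bid-side depth dwarfs ask-side depth: the deposit
    cap (set from bid-side depth) could then still admit borrow positions
    too large for liquidators to buy back on the ask side."""
    return bid_depth > max_ratio * ask_depth

# Hypothetical numbers: cap borrows at half the ask-side depth, and flag
# a heavily bid-skewed book for tighter limits.
cap = borrow_cap(120_000, risk_fraction=0.5)   # 60_000 tokens
flag = squeeze_risk(1_000_000, 120_000)        # True: bid side 8x larger
```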

Consumers could store either the liquidity estimates or their derived risk parameters in their contracts and alter them whenever they became stale. This could be done via governance or other mechanisms that are up to the discretion of the protocol.

Aave V3 has already implemented supply and borrow caps in line with illiquidity-centered thinking. For each pool, those caps are configurable via governance. However, Aave’s limits are mostly informed by extended risk modelling discussions; in contrast, the Pyth Liquidity Oracle provides consumers with a sensible set of values they can use to inform their supply and borrow caps.

Potential improvements

We have thought about additional improvements that could be made to these first-order safety rails. For instance, instead of naively constraining the protocol’s collateral to a target market depth/slippage level, a sophisticated protocol could try to ensure that for any given price range, no more than that target market depth/slippage amount can be liquidated. However, this is extremely difficult to ensure with today’s lending protocol tech stack. We leave this potential enhancement to future design iterations.

Another idea is to eliminate the deposit limit and value all collateral accounting for the slippage of the overall collateral position in the protocol. Unfortunately, this is subject to a malicious attack whereby an attacker deposits a large amount of a token, pushing its valuation price down and thereby liquidating other vaults that use that token as collateral. Any attempt to generalize the valuation price formula beyond a relatively tight deposit limit would need to address this attack vector.

One other improvement could involve limiting the use of prices that are substantially greater than other recently used prices. There are some substantial issues with limiting price movements in general; one can imagine that, for a lending protocol, limiting the downward movement of prices could enable severe consequences, such as in the case of a token’s legitimate crash. However, preventing a user from using rapidly inflated prices could safeguard the protocol against price manipulation attacks (à la Mango). Of course, there are many complicating factors in the design. Importantly, the protocol would want to prevent using rapidly upward moving prices for collateral assets in particular and not impose any such limitations on debt assets, since that could pose a risk to the protocol’s health. Moreover, one can envision more complex multicollateral situations where such limitations could cause unfair liquidations. The implementation details around how to achieve this are also nontrivial; we leave the design and implementation of any such feature to future efforts.

What’s next?

As noted above, in the long run we may create an aggregation mechanism for liquidity information and move the module on-chain onto Pythnet. From there, information could be streamed onto target chains via the on-demand model — similar to price information, although likely at lower frequencies.

Designing an aggregation mechanism and handling the huge volume of liquidity distribution data for many symbols are nontrivial; we have discussed these previously. We also hope to work with the community to develop more sophisticated risk parameters for generalized use cases. But these are challenges we hope to tackle and resolve, for DeFi certainly needs accurate liquidity information to see real growth.

In the immediate future, we encourage consumer protocols to leverage v1 of the Liquidity Oracle to protect against illiquidity risks, and we welcome any thoughts from community members as we continue to iterate.

We can’t wait to hear what you think! You can join the Pyth Discord and Telegram, follow us on Twitter, and be the first to hear about what’s new in the Pyth ecosystem through our newsletter. You can also learn more about Pyth here.


