What are you missing? Using basic machine learning to predict and recommend NFTs with OpenSea data

by Takens Theorem

Recommender systems are everywhere: data-driven, algorithmic product suggestions. They are often a chapter or module in intro machine-learning classes. They are in the engines of modern commercial products and platforms. You know it: You have a chat with a friend about Halloween plans, and next time you check your email an ad insists you need a Harry Potter outfit.

Crypto has a founding tradition of emphasizing freedom and privacy. Maybe because of this prevailing cultural trend, the NFT space does not have many recommender systems.

I offer a gentle NFT recommender here. I combine a large set of OpenSea data and some basic machine learning. The result is a standalone module that forgets specific wallets and stores only ownership patterns. These patterns can then be used to let you navigate recommendations voluntarily: recommendations by choice using statistics from the public ledger.

In this blog post, I start with a demonstration that we can predict NFT ownership among wallets that have many asset types. I then present a new interface called Spar. Named after an ancient navigational tool, Spar lets you choose a subset of NFTs that you own or like, then recommends some additional NFT asset types. These recommendations are based on statistical patterns among the many thousands of wallets on OpenSea that own multiple NFTs. Let’s start with an observation that may surprise you.

Most wallets own a single type of NFT asset

In a recent analysis of OpenSea data, I built the following network diagram of ownership. The outside ring of nodes reflects unique ownership: a very large majority (over 90%) of wallets on OpenSea own only one kind of asset. The nodes in gray represent wallets that own two or more kinds of asset. These “multiple owners” interconnect the ecosystem.

Why so much single-project ownership? More than half of this single ownership is driven by three large projects: CryptoKitties (20,000), MLB Champions (40,000) and CryptoStamp (150,000!). These may have been automatically generated for specific wallets, or represent one-time buyers.

Another reason may be security. Some investors may maintain wallets for particular projects. That way if keys are lost you only lose one type of asset.

Despite this trend of singular ownership, we still have lots of data from wallets that own more than one asset type. In that large data snapshot from OpenSea, 5% or so of over 400,000 wallets own multiple NFT types. That represents 20,000 wallets as a basis for a fun predictive model. Can we predict what NFT someone has from their other NFTs? If you know a wallet has a Cryptovoxels parcel and some SuperRare art, can we predict that it also contains a coveted josie artwork? 

Basic ML for predicting NFT ownership

Let’s take those 20,000 wallets that own multiple types of NFT. For each project (Cryptovoxels, Gods Unchained, etc.) let’s build a simple predictive model. To do this, I separated out a balanced sample of wallets that owned or did not own that project. I then used the pattern of ownership across all other projects to predict which of these wallets indeed own that project. Put another way, I randomly selected data that contained 50% owners of, say, CryptoKitties, and 50% that did not. I then used information about what other NFTs those wallets owned (Gods Unchained, Cryptovoxels, etc.) to see if I could tell which of those wallets own CryptoKitties.
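Here is a minimal sketch of that balanced-sampling step in Python. The wallet names and holdings are invented for illustration, and the helper names (balanced_sample, feature_row) are hypothetical; this is not the code behind the analysis, just one way the setup could look.

```python
import random

def balanced_sample(holdings, target, seed=0):
    """Return (wallet, label) pairs with equal numbers of owners and non-owners of target."""
    rng = random.Random(seed)
    owners = sorted(w for w, owned in holdings.items() if target in owned)
    others = sorted(w for w, owned in holdings.items() if target not in owned)
    n = min(len(owners), len(others))  # the smaller group sets the sample size
    chosen = rng.sample(owners, n) + rng.sample(others, n)
    rng.shuffle(chosen)
    return [(w, 1 if target in holdings[w] else 0) for w in chosen]

def feature_row(owned, all_projects, target):
    """One 0/1 ownership indicator per project, leaving out the target itself."""
    return [1 if p in owned else 0 for p in all_projects if p != target]

# Invented toy data: wallet -> set of projects it holds.
holdings = {
    "wallet_a": {"CryptoKitties", "Gods Unchained"},
    "wallet_b": {"CryptoKitties", "Cryptovoxels"},
    "wallet_c": {"Cryptovoxels", "SuperRare"},
    "wallet_d": {"Gods Unchained", "SuperRare"},
    "wallet_e": {"Cryptovoxels", "Gods Unchained"},
}
all_projects = sorted({p for owned in holdings.values() for p in owned})

sample = balanced_sample(holdings, "CryptoKitties")
X = [feature_row(holdings[w], all_projects, "CryptoKitties") for w, _ in sample]
y = [label for _, label in sample]
```

Because the sample is half owners and half non-owners, a model that guesses blindly would be right about half the time, which is what makes 50% the chance level.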

Chance is 50% here, by design. I wanted to see how far we can get with one of the simplest out-of-the-box machine learning tools: multiple regression. I simplified the model by predicting only ownership, not extent of ownership, because there’s a wide distribution (e.g., whales; see my post about ENS distribution as an example). 
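To make the regression step concrete, here is a small pure-Python sketch. It fits a least-squares multiple regression on 0/1 ownership labels and calls a wallet an owner when the fitted value reaches 0.5. The gradient-descent fitting, the 0.5 cutoff, and the toy data are all my assumptions for illustration; any standard regression routine would do the same job.

```python
def fit_linear(X, y, epochs=2000, lr=0.1):
    """Least-squares fit by batch gradient descent; returns (intercept, weights)."""
    n, k = len(X), len(X[0])
    b, w = 0.0, [0.0] * k
    for _ in range(epochs):
        # Residuals under the current parameters.
        errors = [b + sum(wj * xj for wj, xj in zip(w, row)) - yi
                  for row, yi in zip(X, y)]
        b -= lr * sum(errors) / n
        for j in range(k):
            w[j] -= lr * sum(e * row[j] for e, row in zip(errors, X)) / n
    return b, w

def predict_owner(b, w, row):
    """Predict ownership (1) when the fitted value reaches the assumed 0.5 cutoff."""
    return 1 if b + sum(wj * xj for wj, xj in zip(w, row)) >= 0.5 else 0

def accuracy(b, w, X, y):
    """Share of wallets whose predicted ownership matches the label."""
    return sum(predict_owner(b, w, row) == yi for row, yi in zip(X, y)) / len(y)

# Invented toy data. Columns: [owns Known Origin, owns Gods Unchained].
# Label: owns SuperRare. Half the wallets are owners, so chance is 50%.
X_train = [[1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1]]
y_train = [1, 1, 1, 0, 0, 0]
X_test = [[1, 1], [0, 0]]
y_test = [1, 0]

b, w = fit_linear(X_train, y_train)
held_out_accuracy = accuracy(b, w, X_test, y_test)
```

Scoring on wallets held out of the fit, as in the last line, is what keeps the comparison against the 50% chance level honest.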

In order for this to work, I need a good amount of data. For this first demonstration, I extracted projects that had 180 or more owners (wallets). Can we use ownership patterns to predict whether a wallet contains a given NFT? Here are the results.

Most projects can be predicted well above the 50% chance level. In fact, with this approach, we can predict SuperRare ownership with over 80% accuracy. It is not necessarily “good” or “bad” that a project’s ownership can be predicted. It could mean a project is making fun cross-project connections; it could mean the project is being “typecast.” And these results may change under a new sample or over time. The general message though is that elementary machine learning is enough to predict ownership.

But how does the model work? The chart below shows the interconnections among projects based on which projects predict ownership of other projects. This visualizes the regression model “under the hood.” When a wallet owns one project, that project activates (or deactivates) the projects it is connected to, raising (or lowering) the model’s expectation that the wallet also owns them. The regression model is a kind of trained “circuit” that interconnects projects. The arrows pinpoint some big relationships. For example, Known Origin owners also tend to be SuperRare owners.
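One way to picture that “circuit” in code: treat each regression weight as a directed edge from a predictor project to the project being predicted. The sketch below does this in Python. The weights are invented for illustration and are not fitted results; only the positive link between Known Origin and SuperRare echoes the relationship described above. The function name strongest_edges is hypothetical.

```python
# Invented coefficients for illustration only; these are not fitted results.
# coefs[target][predictor] = weight of predictor when predicting target.
coefs = {
    "SuperRare": {"Known Origin": 0.45, "Gods Unchained": -0.10, "Cryptovoxels": 0.20},
    "Known Origin": {"SuperRare": 0.40, "Gods Unchained": -0.05, "Cryptovoxels": 0.15},
    "Gods Unchained": {"SuperRare": -0.08, "Known Origin": -0.04, "Cryptovoxels": 0.02},
}

def strongest_edges(coefs, top=3):
    """Flatten per-project weights into (predictor, target, weight) edges,
    strongest first by absolute weight."""
    edges = [(src, tgt, w)
             for tgt, weights in coefs.items()
             for src, w in weights.items()]
    return sorted(edges, key=lambda e: abs(e[2]), reverse=True)[:top]

for src, tgt, w in strongest_edges(coefs):
    kind = "activates" if w > 0 else "deactivates"
    print(f"{src} {kind} {tgt} ({w:+.2f})")
```

Positive weights play the role of activating connections and negative weights the role of deactivating ones; sorting by absolute weight surfaces the “big relationships” first.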

The NFT ecosystem has statistical structure—participants have distinct tastes. This structure is the basis for recommender systems.

Using the ML model as a recommender system

So what are you missing? One way to answer: take the projects you do not own, and use your pattern of ownership to rank them as recommendations. I took the trained ML model and embedded it in a little website I called “Spar.”

spar.takenstheorem.now.sh

Spar is the term used for an ancient navigational tool (a kind of crystal) that was used to get solar position in the open sea on a cloudy day. I chose this name not only for the thematic flair, but also because my tool is itself just a simple, early version of an NFT recommender.

Spar just needs some data. To give it some, I don’t ask for your wallet. In the spirit of crypto privacy, I do not track wallets, IPs, or anything else. Spar presents an array of NFT projects, and you click on those you own or like. As you click, the system makes recommendations. Recommended NFTs rise above the crowd. Click on the project to jump to the project’s page on OpenSea.
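For the curious, the click-to-recommend step can be sketched in a few lines of Python. This is a guess at the general shape, not Spar’s actual code: the models dictionary, its numbers, and the recommend function are all invented for illustration. Each project gets a score from its own regression (intercept plus the weights of the projects you selected), and projects you already selected are skipped.

```python
# Invented models for illustration: project -> (intercept, {other project: weight}).
models = {
    "SuperRare": (0.30, {"Known Origin": 0.45, "Cryptovoxels": 0.20}),
    "Known Origin": (0.25, {"SuperRare": 0.40, "Cryptovoxels": 0.15}),
    "Gods Unchained": (0.35, {"CryptoKitties": 0.30}),
    "Cryptovoxels": (0.30, {"SuperRare": 0.20, "Known Origin": 0.15}),
}

def recommend(models, selected, top=2):
    """Rank projects the user has not selected by their regression score."""
    scores = {}
    for project, (intercept, weights) in models.items():
        if project in selected:
            continue  # only recommend what is not already owned or liked
        scores[project] = intercept + sum(
            w for other, w in weights.items() if other in selected)
    return sorted(scores, key=scores.get, reverse=True)[:top]

# The only input is the set of clicked projects; no wallet address is needed.
picks = recommend(models, {"Known Origin", "Cryptovoxels"})
print(picks)
```

Since the input is just a set of project names, nothing about a wallet has to be sent or stored, which fits the privacy goal described above.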

I hope you find it fun! Thanks to the folks at OpenSea for all the fun data and advice. DM me if you have feature requests or find issues with Spar.

Disclosure: Yeah, I own a few NFTs.
