<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://fabianhertwig.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://fabianhertwig.com/" rel="alternate" type="text/html" /><updated>2026-03-03T13:40:05+01:00</updated><id>https://fabianhertwig.com/feed.xml</id><title type="html">Fabian Hertwig’s Blog</title><subtitle>An AI engineer&apos;s notes on technology, cognition, and how systems—software and human—really work.</subtitle><author><name>Fabian Hertwig</name></author><entry><title type="html">How Photons Clone Themselves: From Laser Pointers to Shooting Down Drones</title><link href="https://fabianhertwig.com/blog/laser-deep-dive/" rel="alternate" type="text/html" title="How Photons Clone Themselves: From Laser Pointers to Shooting Down Drones" /><published>2026-02-17T11:00:00+01:00</published><updated>2026-02-17T11:00:00+01:00</updated><id>https://fabianhertwig.com/blog/laser-deep-dive</id><content type="html" xml:base="https://fabianhertwig.com/blog/laser-deep-dive/"><![CDATA[<p>A while ago I watched a video of a US Navy ship shooting down a drone with a laser. No missile, no bullet. A beam of light tracked the drone, held steady for a few seconds, and the thing caught fire mid-air and dropped into the ocean.</p>

<p><img src="/assets/images/laser-deep-dive/navy-laws.webp" alt="The Laser Weapon System (LaWS) aboard USS Ponce. This is what started the rabbit hole." />
<em>The Laser Weapon System (LaWS) aboard USS Ponce. This is what started the rabbit hole. Photo: U.S. Navy.</em></p>

<p>My first reaction: <em>how does light do that?</em></p>

<p>Light is the thing that comes out of a lamp. It’s what reads barcodes at the grocery store. My cat chases a laser pointer dot around the living room. How do you get from <em>that</em> to burning a military drone out of the sky?</p>

<p>And then the obvious follow-up: is it even the same thing? Is the physics behind a three-dollar laser pointer and a naval weapons system actually the same physics? Or are they just both called “laser” the way a toy car and a Formula 1 car are both called “car”?</p>

<p>Turns out: it really is the same physics. Every laser ever built (the pointer, the barcode scanner, the eye surgery machine, the drone-killer) works because of one quantum mechanical trick: <strong>a photon hits an excited atom, and two identical photons come out.</strong> Same wavelength, same direction, perfectly synchronized. The photon clones itself.</p>

<p>Einstein predicted this would happen in 1917 while he was trying to make some equations balance. He called it “stimulated emission,” thought it was a curious theoretical detail, and moved on. Nobody figured out what to do with it for 43 years.</p>

<p>I went down the rabbit hole on this, and it goes <em>deep</em>. The materials that make lasers possible (ruby crystals, rare earth elements, certain gas mixtures) have these properties basically by accident. People discovered them because they noticed strange glows under certain conditions and got curious enough to investigate. The whole field exists because some physicists looked at a faintly glowing crystal and thought <em>huh, that’s weird</em> instead of walking past it.</p>

<p>This post starts with a light bulb and ends with lasers that briefly outpower every power plant on Earth combined. It’s a long one. Let’s go.</p>

<h2 id="part-1-lets-start-with-a-light-bulb">Part 1: Let’s Start With a Light Bulb</h2>

<p><img src="/assets/images/laser-deep-dive/light-bulb.webp" alt="Incandescent light bulbs with their tungsten filaments glowing." />
<em>Photo: Joy Singh / <a href="https://www.pexels.com/photo/turned-on-pendant-lamps-2764942/">Pexels</a></em></p>

<p>You flip a switch. A light bulb turns on. Simple, right? But what’s actually happening inside that bulb is your first step into understanding why lasers are so weirdly special.</p>

<p>Inside an old-school incandescent light bulb, there’s a tungsten wire. When you flip the switch, electricity flows through that wire, and because tungsten has electrical resistance, the wire heats up. A lot. We’re talking about 2,500°C (4,500°F) kind of hot. And when things get that hot, something interesting happens at the atomic level.</p>

<h3 id="the-electron-shuffle-a-chemistry-refresher">The Electron Shuffle: A Chemistry Refresher</h3>

<p>Remember from chemistry class that electrons orbit atoms in specific energy levels? Think of these like the floors of a hotel. An electron can be on the ground floor, the second floor, the third floor, etc., but it can’t just hang out in the stairwell between floors. Those in-between spaces? Not allowed. Quantum mechanics says no.</p>

<p>Now, when you heat up tungsten to ridiculous temperatures, you’re basically pumping energy into the atoms. This energy causes electrons to get excited, and yes, that’s literally the physics term. An excited electron absorbs energy and jumps from a lower energy level to a higher one. It’s like someone giving our electron enough energy to take the elevator up several floors.</p>

<p>But here’s the thing: electrons are homebodies. They don’t want to be on the higher floors. The ground state, the lowest energy level, is where they’re most stable and most comfortable. So after a tiny, tiny fraction of a second (sometimes nanoseconds, sometimes microseconds) they fall back down to a lower energy level.</p>

<p>And here’s the crucial part: when an electron drops from a higher energy level to a lower one, it has to release the energy it gained. Energy can’t just disappear. And it releases that energy as a photon, a particle of light.</p>

<p>The energy difference between the two levels determines the photon’s wavelength, which we perceive as its color. Big energy difference means a high energy photon, which corresponds to blue or ultraviolet light. Small energy difference means a low energy photon, which gives you red or infrared light. The relationship is exact and mathematical: the photon’s energy equals Planck’s constant times its frequency, and frequency is related to wavelength by the speed of light. This is why each type of atom has its own characteristic spectrum, its own fingerprint of colors it can emit.</p>
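<p>The relationship is simple enough to compute directly. A quick sketch (the function name is mine; the constants are the exact SI values):</p>

```javascript
// Photon energy from wavelength: E = h * c / lambda.
const h = 6.62607015e-34; // Planck's constant, J·s
const c = 299792458;      // speed of light, m/s

function photonEnergyJoules(wavelengthMeters) {
  return (h * c) / wavelengthMeters;
}

// Ruby-laser red light at 694.3 nm vs. blue light at 450 nm:
const red = photonEnergyJoules(694.3e-9);
const blue = photonEnergyJoules(450e-9);
console.log(red.toExponential(3)); // ≈ 2.861e-19 J
console.log(blue / red);           // blue photons carry ~1.54x more energy
```

<p>Notice that shorter wavelength means higher energy, exactly as the big-gap/small-gap picture predicts.</p>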

<h3 id="why-regular-light-is-kind-of-messy">Why Regular Light is Kind of Messy</h3>

<p>So in your light bulb, you’ve got trillions upon trillions of tungsten atoms, all at different temperatures, with electrons constantly jumping up and falling back down. But here’s the critical insight: this process is completely chaotic and random.</p>

<p>Electrons are jumping to different energy levels randomly because the thermal energy doesn’t care which level an electron ends up in. It’s just chaos, heat being absorbed however it can be. They’re falling back down at random times because spontaneous emission is, well, spontaneous. There’s no trigger, no cause. An excited electron just has some probability of falling down each moment, and eventually it does. The photons are shooting out in completely random directions because there’s nothing to favor one direction over another. They go up, down, left, right, forward, backward: every direction in three-dimensional space is equally likely. And the photons are all different wavelengths because electrons are dropping between all sorts of different energy levels, not just one specific transition.</p>

<p>The result? Light that spreads out in all directions, with a broad spectrum of colors (which is why incandescent light looks yellowish-white rather than a pure color), and with all the light waves completely out of sync with each other. The peaks of one wave don’t line up with the peaks of another. Each photon is doing its own thing, like a crowd of people all talking at once instead of singing in unison.</p>

<p>This type of light is called incoherent light. It’s messy, it’s chaotic, it’s random. And it’s perfectly fine for lighting up your room. The incoherence doesn’t matter when you just want to see where you’re walking.</p>

<p>But what if we wanted something different? What if we wanted all the photons to be the same wavelength, to travel in the exact same direction, and to be synchronized with all their waves lined up perfectly? That’s where lasers come in.</p>

<h2 id="part-2-what-makes-lasers-special-and-why-should-we-care">Part 2: What Makes Lasers Special? (And Why Should We Care?)</h2>

<p>Before we dive into how lasers work, let’s understand why anyone wanted to build one in the first place. What properties of laser light make it so useful that people spent years trying to figure out how to create it?</p>

<h3 id="the-four-key-properties">The Four Key Properties</h3>

<p>Laser light has four special characteristics that regular light doesn’t have, and each one enables different applications.</p>

<p>First, there’s monochromaticity, which means the laser is essentially one pure color. Not “reddish” or “bluish” but a single, precise wavelength. A sodium vapor street lamp might look yellow and seem like one color, but it’s actually emitting several different wavelengths in the yellow region. A laser, on the other hand, might emit light at exactly 632.8 nanometers, and nothing else.</p>

<p>This precision is useful in surprising ways. If you’re measuring distances with light (which is what surveyors do, what satellites use to map the Earth’s surface, what scientists use in countless experiments), having exactly one wavelength means your measurements can be incredibly precise. You can calculate a distance by counting the number of wavelengths that fit into it, and if your wavelength is fuzzy, your measurement is fuzzy. In fiber optic communication, different wavelengths can carry different signals down the same fiber, but only if each laser is exactly one wavelength with no spread. And in medical applications, different tissues absorb different wavelengths differently, so you can use a laser to target specific structures. A laser tuned to be absorbed by blood vessels but not surrounding tissue can treat vascular problems without damaging healthy tissue.</p>
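<p>The wavelength-counting idea is just division. A back-of-the-envelope sketch (the 632.8 nm figure is the helium-neon line mentioned above; the 1 m distance is illustrative):</p>

```javascript
// Distance measurement by counting wavelengths: a path of length d
// contains d / lambda full waves. If lambda is fuzzy, so is the count,
// and with it the measurement.
const wavelength = 632.8e-9; // helium-neon laser line, meters
const distance = 1.0;        // meters

console.log(distance / wavelength); // ~1.58 million wavelengths per meter
```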

<p>Second is coherence, which means all the light waves are synchronized. Their peaks and troughs line up. This is like the difference between a stadium of people clapping randomly versus everyone clapping in perfect unison. The unified clapping is way more powerful and you can hear it much farther away. When light waves are coherent, they reinforce each other constructively. Where two peaks meet, they add up to an even bigger peak. This means the light doesn’t spread out and dissipate as quickly. Coherent light can travel much farther while maintaining its intensity. This coherence also enables interference patterns. When you combine two coherent light beams, they create interference fringes, areas where they add up constructively and areas where they cancel out destructively. This is the basis for holography, where you can record three-dimensional images, and interferometry, where you can measure tiny distances or detect gravitational waves by looking at how interference patterns shift.</p>
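<p>The “peaks line up” claim has a precise form. A tiny sketch (the function name is mine) of what happens when two equal-amplitude waves combine in phase versus out of phase:</p>

```javascript
// Intensity of the superposition of two unit-amplitude waves that
// differ only by a phase offset delta:
//   I = |e^{i*0} + e^{i*delta}|^2 = 2 + 2*cos(delta)
function combinedIntensity(phaseDifference) {
  return 2 + 2 * Math.cos(phaseDifference);
}

console.log(combinedIntensity(0));       // 4: constructive, 4x one wave's intensity
console.log(combinedIntensity(Math.PI)); // 0: destructive, complete cancellation
```

<p>Everything between 0 and 4 is the fringe pattern that holography and interferometry read out.</p>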

<p>Third is directionality. A laser beam stays tightly focused and travels in essentially one direction. A regular light bulb radiates light in all directions equally; it’s isotropic. Even if you put a reflector behind it to create a spotlight, most of the light still spreads out in a cone. A laser pointer, on the other hand, creates a beam that barely spreads at all. You can see the tiny dot on a wall across a room, and the beam is nearly the same width at the wall as it was when it left the pointer. This happens because all the photons are traveling in the same direction, parallel to each other. Actually, they’re not perfectly parallel; there’s always some small divergence, but it’s remarkably small compared to any other light source. This directionality is what makes laser pointers work. It’s why you can aim lasers at things and hit exactly what you’re aiming at. It’s why laser cutting works: you can focus all that energy onto a tiny spot. And it’s why you can send laser signals long distances through fiber optics or even through space without losing the signal.</p>

<p>Fourth is high intensity. Because lasers concentrate all their energy into one wavelength, traveling in one direction, in sync, you can achieve incredibly high power densities. Intensity isn’t just about total power, it’s about power per unit area. A 1-watt laser can cut through materials that a 100-watt light bulb couldn’t touch, because all that laser energy is concentrated into a spot that might be a fraction of a millimeter across, while the light bulb’s energy is spread over an entire room. If you focus a laser beam through a lens, you can achieve power densities high enough to vaporize almost any material. This is why laser cutting works. This is why you can use lasers for welding, for drilling tiny holes in materials, for surgery where you need to precisely remove tissue.</p>
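<p>The power-density comparison is easy to check with a few lines. The spot radius and viewing distance here are illustrative assumptions, not measured values:</p>

```javascript
// Power density (irradiance) = power / area. Same watts, wildly
// different intensity depending on how small the area is.
function powerDensityWPerM2(powerWatts, spotRadiusMeters) {
  return powerWatts / (Math.PI * spotRadiusMeters ** 2);
}

// 1 W laser focused to a 0.1 mm radius spot:
const laser = powerDensityWPerM2(1, 0.1e-3);
// 100 W bulb spread over a sphere 2 m away (surface area 4*pi*r^2):
const bulb = 100 / (4 * Math.PI * 2 ** 2);

console.log(laser.toExponential(2)); // ~3.18e7 W/m^2
console.log(bulb.toFixed(2));        // ~1.99 W/m^2
console.log(laser / bulb);           // the laser wins by a factor of ~16 million
```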

<div style="background: white; border-radius: 8px; padding: 24px; margin: 32px 0; box-shadow: 0 2px 8px rgba(0,0,0,0.1);">
    <h3 style="margin-top: 0; font-size: 20px; font-weight: 600; color: #1a1a1a;">Coherent vs Incoherent Light</h3>
    <canvas id="coherence" width="800" height="300" style="display: block; width: 100%; border-radius: 4px; background: #ffffff;"></canvas>
    <p style="margin-top: 12px; font-size: 14px; color: #666; line-height: 1.5;">
        Left: Incoherent light from a bulb — waves out of phase, spreading in all directions. Right: Coherent laser light — waves perfectly synchronized, traveling together.
    </p>
</div>
<script>
(function() {
    const canvas = document.getElementById('coherence');
    const ctx = canvas.getContext('2d');
    let phase = 0;

    function draw() {
        ctx.clearRect(0, 0, 800, 300);
        phase += 0.05;

        ctx.fillStyle = '#1a1a1a';
        ctx.font = '14px sans-serif';
        ctx.fillText('Incoherent (Light Bulb)', 50, 30);

        const colors = ['#ff3b30', '#ff9500', '#ffcc00', '#34c759', '#007aff'];
        for (let i = 0; i < 5; i++) {
            ctx.strokeStyle = colors[i];
            ctx.lineWidth = 2;
            ctx.globalAlpha = 0.6;
            ctx.beginPath();
            const randomPhase = i * 1.3;
            const yOffset = 120 + i * 10;
            for (let x = 50; x < 350; x++) {
                const y = yOffset + Math.sin((x / 30) + phase + randomPhase) * 20;
                if (x === 50) ctx.moveTo(x, y);
                else ctx.lineTo(x, y);
            }
            ctx.stroke();
        }
        ctx.globalAlpha = 1;

        ctx.strokeStyle = '#8e8e93';
        ctx.lineWidth = 1;
        for (let angle = -30; angle <= 30; angle += 15) {
            const rad = (angle * Math.PI) / 180;
            ctx.beginPath();
            ctx.moveTo(350, 150);
            ctx.lineTo(350 + Math.cos(rad) * 30, 150 + Math.sin(rad) * 30);
            ctx.stroke();
        }

        ctx.fillStyle = '#1a1a1a';
        ctx.fillText('Coherent (Laser)', 500, 30);

        ctx.strokeStyle = '#ff3b30';
        ctx.lineWidth = 3;
        ctx.beginPath();
        for (let x = 450; x < 750; x++) {
            const y = 150 + Math.sin((x / 30) + phase) * 25;
            if (x === 450) ctx.moveTo(x, y);
            else ctx.lineTo(x, y);
        }
        ctx.stroke();

        ctx.strokeStyle = '#ff3b30';
        ctx.lineWidth = 2;
        ctx.beginPath();
        ctx.moveTo(750, 150);
        ctx.lineTo(780, 150);
        ctx.stroke();

        requestAnimationFrame(draw);
    }

    draw();
})();
</script>

<p>So now that we know what we’re trying to achieve (monochromatic, coherent, directional, intense light) the question becomes: how the heck do we make light do these weird things?</p>

<h2 id="part-3-the-quantum-mechanical-magic-the-physics-foundation">Part 3: The Quantum Mechanical Magic (The Physics Foundation)</h2>

<p>To understand how lasers work, we need to go deeper into what happens when light interacts with atoms. There are three key processes, and understanding them is the secret to understanding everything about lasers. This is the stuff Einstein figured out in 1917.</p>

<h3 id="the-three-sacred-processes">The Three Sacred Processes</h3>

<p>The first process is absorption. An atom is sitting there with its electron in a low energy state, just minding its own business. A photon comes along, traveling through space. Now, here’s the thing: the photon can’t just randomly interact with the atom. The photon has to have exactly the right amount of energy. If the photon’s energy matches the energy difference between the electron’s current level and some higher level, then the electron can absorb that photon and jump up to that higher energy level. The photon disappears (its energy is now stored in the excited electron) and the electron is now in a higher energy state. It’s like the electron used the photon as currency to buy a ticket to a higher floor.</p>

<p>If the photon’s energy doesn’t match any allowed transition, it just passes by without interacting. This is why glass is transparent to visible light: the energy levels in silicon dioxide are spaced such that visible photons don’t have the right energy to cause transitions. The photons just pass through.</p>

<p>The second process is spontaneous emission. The excited electron is unstable up there on that higher energy level. It wants to fall back down. After some random amount of time (could be nanoseconds, could be microseconds, could be milliseconds depending on the specific transition and atom) it spontaneously falls back down to a lower energy level. When it does, it has to release the energy it’s been storing, and it releases it as a photon. That photon has energy equal to the difference between the two levels.</p>

<p>But here’s the key thing: this happens randomly. There’s no trigger, no external cause. It’s like radioactive decay: you can’t predict exactly when a specific excited electron will emit, only the probability. And when it does emit, the direction is random. Up, down, sideways, any direction is equally likely. This is what’s happening in your light bulb. This is what’s happening in a neon sign, in a fluorescent tube, in any normal light source. Spontaneous emission gives you that chaotic, incoherent light.</p>

<p>The third process is stimulated emission, and this is where Einstein’s genius comes in. Imagine an electron is sitting in an excited state, just hanging out, waiting to eventually fall back down via spontaneous emission. Now imagine another photon comes along, a photon that has exactly the same energy as the photon that would be released if the electron dropped down.</p>

<p>Einstein’s mathematics showed that this incoming photon can actually trigger the electron to drop down immediately, before it would have spontaneously emitted. And here’s the absolutely crucial part: when the electron drops down via stimulated emission, it releases a new photon that is identical to the triggering photon in every way. Same wavelength? Yes. Same direction? Yes. Same phase, meaning its wave peaks and troughs line up with the original photon? Yes.</p>

<p>So one photon goes in, and two identical photons come out, both traveling in the same direction, perfectly synchronized. The photon cloned itself.</p>

<p>This is the entire secret of how lasers work. This is the process Einstein predicted. This is what we’re going to exploit.</p>

<h3 id="the-problem-population-inversion">The Problem: Population Inversion</h3>

<p>But here’s the thing: under normal circumstances, stimulated emission basically never happens. Why not?</p>

<p>Because most atoms, most of the time, have their electrons in the ground state. They’re not excited. So when a photon comes along, it’s much more likely to be absorbed by an electron jumping up than to cause stimulated emission from an already-excited electron.</p>

<p>Let’s say you have a million atoms, and you send a beam of light through them. If all the electrons are in the ground state, photons will be absorbed, electrons will be excited, and then those electrons will spontaneously emit in random directions. You don’t get amplification. You get absorption and then random re-emission.</p>

<p>For stimulated emission to dominate, you need more atoms with electrons in the excited state than in the ground state. This weird, unnatural condition is called population inversion. It’s called that because normally the lower state is more populated, but we need to invert that.</p>

<p>Population inversion doesn’t occur naturally. Left to themselves, atoms always have more electrons in lower states than higher states. That’s just thermodynamics: systems naturally settle into lower energy states.</p>
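<p>You can put a number on how lopsided thermal equilibrium is with the Boltzmann distribution. A sketch (function name is mine; the energy gap is roughly that of the ruby laser’s 694.3 nm transition):</p>

```javascript
// At thermal equilibrium, the ratio of excited- to ground-state
// populations is N2/N1 = exp(-dE / (kB * T)).
const kB = 1.380649e-23; // Boltzmann constant, J/K

function populationRatio(energyGapJoules, temperatureKelvin) {
  return Math.exp(-energyGapJoules / (kB * temperatureKelvin));
}

// Ruby's red transition corresponds to a gap of ~2.86e-19 J.
// Even at roughly filament temperature (~2800 K), fewer than one atom
// in a thousand is excited:
console.log(populationRatio(2.86e-19, 2800)); // ~6e-4
// At room temperature (300 K) the ratio is astronomically small:
console.log(populationRatio(2.86e-19, 300));
```

<p>No temperature gets that ratio above 1, which is why inversion has to be forced by pumping rather than by heating.</p>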

<p>To make a laser, you have to force population inversion to happen. You have to pump energy into the system faster than the excited electrons can fall back down. You have to fight against the natural tendency toward equilibrium.</p>

<p>This is the fundamental challenge of building a laser: creating and maintaining population inversion.</p>

<h2 id="part-4-the-first-laser-the-ruby-laser-story-high-level">Part 4: The First Laser (The Ruby Laser Story: High Level)</h2>

<p>Now that we understand the physics principles, let’s see how Theodore Maiman actually built the first laser in 1960. We’ll start with the high-level view, and then dive deeper into why his approach worked.</p>

<p>Maiman was working at Hughes Research Laboratories in California. Other researchers were trying to build lasers using gases, because the theory suggested gases might work better. But Maiman had a hunch about ruby. He knew ruby had the right kind of energy levels, and ruby crystals could be grown large and of high quality.</p>

<h3 id="the-apparatus">The Apparatus</h3>

<p><img src="/assets/images/laser-deep-dive/ruby-laser.webp" alt="Diagram of a ruby laser: the ruby rod sits inside a coiled flash lamp, with mirrors at both ends. The red beam emerges from the partially reflective mirror on the left." />
<em>A ruby laser. The pink rod is the ruby crystal, surrounded by a coiled xenon flash lamp. Image: <a href="https://commons.wikimedia.org/wiki/File:Ruby_laser.webp">Wikimedia Commons</a>, Public Domain.</em></p>

<p>Maiman’s laser was elegantly simple, at least in concept. He had a ruby rod about the size of a finger, roughly 1 cm in diameter and a few centimeters long. Ruby is aluminum oxide (Al₂O₃) with a small amount of chromium atoms mixed in, which gives it that characteristic red color. Both ends of the ruby rod were polished flat and parallel, then coated with silver to make them reflective. One end was coated to be as close to 100% reflective as possible, the other about 95% reflective, letting 5% of the light through.</p>

<p>Wrapped around the ruby rod was a helical xenon flash lamp, essentially a very bright, intense camera flash that could produce a brilliant burst of white light for a few milliseconds. The flash lamp was coiled in a spiral around the ruby, so it could shine light into the ruby from all sides.</p>

<p>The whole thing was small enough to hold in your hands.</p>

<h3 id="how-it-actually-works-the-step-by-step-dance">How It Actually Works: The Step-by-Step Dance</h3>

<p>When Maiman fired the flash lamp, here’s what happened:</p>

<p>The flash lamp produces a brilliant burst of white light, containing many wavelengths. This light shines into the ruby from all sides. The chromium atoms in the ruby absorb certain wavelengths, specifically the blue and green parts of the spectrum. Electrons in the chromium atoms jump from low energy states up to high energy states. This happens for billions and billions of chromium atoms throughout the ruby.</p>

<p>These excited electrons then quickly drop down to a special intermediate energy level and stay there for a relatively long time, milliseconds, which in atomic terms is an eternity. This accumulation of excited atoms is what creates population inversion.</p>

<p>Now the cascade begins. Eventually, one of those excited electrons spontaneously emits. It drops down to a lower energy state, releasing a red photon at 694.3 nanometers. This photon travels through the ruby in some random direction.</p>

<p>Most of these spontaneous photons are traveling at angles and just exit through the sides of the ruby rod. They’re lost, contributing nothing to the laser beam. But occasionally, just by chance, a spontaneous photon happens to be traveling along the axis of the ruby rod, parallel to the length.</p>

<p>This photon travels through the ruby, and as it does, it passes by other chromium atoms that have electrons in excited states. Each time it encounters such an atom, there’s a chance it triggers stimulated emission. When it does, now there are two photons, both traveling in the same direction, both perfectly in phase.</p>

<p>These two photons continue traveling through the ruby, potentially triggering more stimulated emission. Two becomes four, four becomes eight, eight becomes sixteen. The number of photons grows exponentially.</p>

<p>Eventually, these photons reach the mirror at the end of the ruby rod. The mirror reflects them back through the ruby in the opposite direction. As they travel back, they trigger even more stimulated emission. The number of photons builds up even more.</p>

<p>The photons bounce back and forth between the mirrors, and with each pass through the ruby, they trigger more and more stimulated emission. The light intensity builds up inside the optical cavity formed by the two mirrors.</p>

<p>But remember, one mirror is only 95% reflective. Each time the light hits this partially reflective mirror, 5% of it leaks through. At first, this leakage is negligible. But as the intensity builds up inside the cavity, that 5% leakage becomes substantial. That’s your laser beam.</p>
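<p>The whole buildup-and-leakage dance fits in a toy model. The per-round-trip gain and starting intensity below are made-up illustrative numbers, not measured ruby values:</p>

```javascript
// Toy model of light intensity in the cavity: each round trip the gain
// medium amplifies the light, then 5% leaks out at the output coupler.
function simulateCavity(roundTrips, gainPerRoundTrip, outputCouplerR) {
  let inside = 1; // arbitrary starting intensity (one lucky photon)
  let emitted = 0;
  for (let i = 0; i < roundTrips; i++) {
    inside *= gainPerRoundTrip;              // stimulated emission amplifies
    emitted = inside * (1 - outputCouplerR); // the 5% leakage: your beam
    inside *= outputCouplerR;                // the rest keeps building
  }
  return { inside, emitted };
}

const result = simulateCavity(50, 1.5, 0.95);
console.log(result.emitted); // the leakage grows huge as the cavity fills
```

<p>As long as gain per trip beats the mirror loss (here 1.5 × 0.95 &gt; 1), the intensity, and with it the 5% leakage, grows exponentially.</p>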

<div style="background: white; border-radius: 8px; padding: 24px; margin: 32px 0; box-shadow: 0 2px 8px rgba(0,0,0,0.1);">
    <h3 style="margin-top: 0; font-size: 20px; font-weight: 600; color: #1a1a1a;">Ruby Laser: The Optical Cavity</h3>
    <canvas id="rubyLaser" width="800" height="400" style="display: block; width: 100%; border-radius: 4px; background: #ffffff;"></canvas>
    <p style="margin-top: 12px; font-size: 14px; color: #666; line-height: 1.5;">
        Watch how photons bounce between mirrors. Most spontaneous emissions escape randomly (gray), but the few traveling along the axis (red) bounce back and forth, triggering more emissions. Notice the exponential growth.
    </p>
</div>
<script>
(function() {
    const canvas = document.getElementById('rubyLaser');
    const ctx = canvas.getContext('2d');

    class Photon {
        constructor(x, y, vx, vy, isLasing) {
            this.x = x; this.y = y; this.vx = vx; this.vy = vy;
            this.isLasing = isLasing; this.alpha = 1;
        }
        update() {
            this.x += this.vx; this.y += this.vy;
            if (!this.isLasing && (this.x < 50 || this.x > 750 || this.y < 50 || this.y > 350)) {
                this.alpha -= 0.02;
            }
        }
        draw(ctx) {
            if (this.alpha <= 0) return;
            ctx.save();
            ctx.globalAlpha = this.alpha;
            ctx.beginPath();
            ctx.arc(this.x, this.y, 3, 0, Math.PI * 2);
            ctx.fillStyle = this.isLasing ? '#ff3b30' : '#8e8e93';
            ctx.fill();
            ctx.restore();
        }
    }

    let photons = [];

    function drawSetup() {
        ctx.clearRect(0, 0, 800, 400);
        ctx.fillStyle = '#ffe5e5';
        ctx.fillRect(200, 150, 400, 100);
        ctx.strokeStyle = '#ff3b30';
        ctx.lineWidth = 2;
        ctx.strokeRect(200, 150, 400, 100);
        ctx.fillStyle = '#d1d1d6';
        ctx.fillRect(195, 140, 5, 120);
        ctx.fillRect(600, 140, 5, 120);
        ctx.strokeStyle = '#ffd700';
        ctx.lineWidth = 2;
        ctx.setLineDash([5, 5]);
        ctx.beginPath();
        ctx.arc(400, 200, 130, 0, Math.PI * 2);
        ctx.stroke();
        ctx.setLineDash([]);
        ctx.fillStyle = '#1a1a1a';
        ctx.font = '12px sans-serif';
        ctx.fillText('Mirror (100%)', 140, 205);
        ctx.fillText('Mirror (95%)', 610, 205);
        ctx.fillText('Ruby Rod', 350, 135);
    }

    function animate() {
        drawSetup();

        if (Math.random() < 0.15) {
            const x = 200 + Math.random() * 400;
            const y = 150 + Math.random() * 100;
            const angle = Math.random() * Math.PI * 2;
            const speed = 2;
            if (Math.random() < 0.95) {
                photons.push(new Photon(x, y, Math.cos(angle) * speed, Math.sin(angle) * speed, false));
            } else {
                photons.push(new Photon(x, 200, speed, 0, true));
            }
        }

        photons = photons.filter(p => {
            p.update();
            if (p.isLasing && p.y > 150 && p.y < 250) {
                if (p.x <= 200 && p.vx < 0) {
                    p.vx = -p.vx;
                    if (Math.random() < 0.3) photons.push(new Photon(p.x + 10, 200, 2, 0, true));
                }
                if (p.x >= 600 && p.vx > 0) {
                    if (Math.random() < 0.95) p.vx = -p.vx;
                    else p.isLasing = false;
                    if (Math.random() < 0.3) photons.push(new Photon(p.x - 10, 200, -2, 0, true));
                }
            }
            p.draw(ctx);
            return p.alpha > 0 && p.x > -50 && p.x < 850;
        });

        requestAnimationFrame(animate);
    }

    animate();
})();
</script>

<p>The laser beam that emerges is monochromatic (694.3 nm red light), directional (traveling along the axis of the rod), coherent (all triggered by stimulated emission from photons that were already in phase), and intense (amplified by many passes through the ruby).</p>

<h3 id="the-numbers-are-insane">The Numbers Are Insane</h3>

<p>The statistics behind what’s happening are genuinely absurd. When an electron spontaneously emits a photon, that photon can go in any direction in three-dimensional space; technically, the full 4π steradians of solid angle. For a photon to travel along the axis of the ruby rod and enter the optical cavity mode, it needs to fall within a very small solid angle, determined by the diameter and length of the rod.</p>

<p>For a typical ruby laser, maybe only 1 in 10,000 photons that spontaneously emit happen to be traveling in the right direction to contribute to the laser beam. The other 9,999 out of 10,000 just fly out the sides of the ruby and are lost.</p>
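<p>That fraction is just geometry. A rough sketch (function name and rod dimensions are mine; the post’s 1-in-10,000 figure corresponds to a somewhat narrower or longer rod):</p>

```javascript
// Fraction of spontaneous photons emitted close enough to the rod axis
// to stay in the cavity: roughly the solid angle subtended by one end
// of the rod, seen from the other end, divided by the full 4*pi sphere.
function axialFraction(rodRadiusM, rodLengthM) {
  // Small-angle approximation: solid angle ≈ area / distance^2.
  const solidAngle = (Math.PI * rodRadiusM ** 2) / rodLengthM ** 2;
  return solidAngle / (4 * Math.PI);
}

// A 0.5 cm radius rod, 10 cm long:
console.log(axialFraction(0.005, 0.1)); // ~6e-4, well under 1 in 1,000
```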

<p>Think about this: it’s like trying to fill a swimming pool by throwing water balloons in random directions and hoping one lands in the pool. It seems impossibly wasteful.</p>

<p>But here’s why it works anyway: once you have even a few photons traveling in the right direction and bouncing between the mirrors, stimulated emission takes over. Stimulated emission, unlike spontaneous emission, is directional. The new photon travels in the same direction as the triggering photon. This is the magic trick.</p>

<p>So you start with maybe just 1 photon going in the right direction. Through stimulated emission, that becomes 2, then 4, then 8, then 16. The growth is exponential. Even though you start with very few photons going the right way, they multiply so quickly that soon they completely dominate over the spontaneous emission happening in random directions.</p>

<p>The gain per pass through the ruby is important. With strong population inversion, one photon might trigger enough stimulated emission to multiply the photon count by roughly 1.5 for every centimeter it travels. After a couple of centimeters of ruby, you’ve doubled your number of photons. After the photons bounce back and forth ten times, you’ve multiplied your initial photon count by an enormous factor.</p>
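<p>The multiplication compounds fast. A sketch using the illustrative 1.5× per centimeter figure from above (the function name and rod length are mine):</p>

```javascript
// Exponential gain: multiply the photon count by the per-centimeter
// gain for every centimeter traveled.
function photonsAfter(startCount, gainPerCm, centimetersTraveled) {
  return startCount * gainPerCm ** centimetersTraveled;
}

// One photon, 1.5x gain per cm, in a 5 cm rod:
console.log(photonsAfter(1, 1.5, 5));      // ~7.6 photons after one pass
console.log(photonsAfter(1, 1.5, 5 * 10)); // ~6.4e8 after ten passes
```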

<p>The optical cavity is essential for this. Without the mirrors, the photons would just pass through once and exit. They wouldn’t have multiple chances to trigger stimulated emission. With the mirrors, the photons can bounce back and forth hundreds or thousands of times before they finally leak out through the partially reflective mirror, and each pass gives them opportunities to multiply.</p>

<p>So yes, the initial probability is low. Yes, most spontaneously emitted photons are wasted. But exponential growth is extraordinarily powerful, and the optical cavity gives the photons time to multiply into a torrent of synchronized light.</p>

<h2 id="part-5-why-can-some-materials-do-this-the-secret-of-laser-media">Part 5: Why Can Some Materials Do This? (The Secret of Laser Media)</h2>

<p>Now that we’ve seen how a laser works at a high level, a natural question arises: why does ruby work as a laser medium? Why does a helium-neon gas mixture work? Why does a CO₂ molecule work? What makes certain materials special, while most materials can’t be used to make lasers at all?</p>

<p>The answer lies in the structure of their energy levels, and specifically in something called metastable states.</p>

<h3 id="the-metastable-state-natures-battery">The Metastable State: Nature’s Battery</h3>

<p>Remember, for a laser to work, you need population inversion. You need lots of atoms with electrons in an excited state. But normally, excited electrons fall back down really quickly via spontaneous emission, sometimes in nanoseconds. If excited electrons only stayed excited for nanoseconds, you’d need to pump energy into the system absurdly fast just to keep up with the electrons falling back down. You’d be like someone trying to fill a bathtub with the drain wide open.</p>

<p>This is where metastable states save the day. A metastable state is an excited energy level where an electron can stay for a relatively long time before falling back down. “Relatively long” in quantum mechanics might mean milliseconds instead of nanoseconds, but that’s a million times longer. It’s like the difference between water draining from a bathtub in one second versus 11 days. Suddenly, you can fill the bathtub faster than it drains.</p>
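<p>The “one second versus 11 days” comparison is just the lifetime ratio applied to something familiar. A two-line check, using the nanosecond and millisecond lifetimes from the paragraph above:</p>

```javascript
// Metastable lifetime (~1 ms) vs. ordinary excited-state lifetime (~1 ns):
const lifetimeRatio = 1e-3 / 1e-9;      // a factor of 1,000,000
// Scale a 1-second bathtub drain by the same factor:
const drainSeconds = 1 * lifetimeRatio; // 10⁶ seconds
console.log(drainSeconds / 86400);      // ≈ 11.6 days
```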

<h3 id="why-do-metastable-states-exist">Why Do Metastable States Exist?</h3>

<p>This is where quantum mechanics gets interesting. Not all transitions between energy levels are equally likely. Some transitions happen very quickly, and others happen very slowly. This has to do with quantum mechanical selection rules.</p>

<p>When an electron emits a photon and drops to a lower energy level, the most common mechanism is electric dipole radiation. For this to happen efficiently, certain quantum mechanical conditions need to be met. The electron’s orbital angular momentum needs to change by exactly one unit (ΔL = ±1), and there are rules about the magnetic quantum number and spin as well.</p>

<p>If a transition satisfies all these selection rules, it’s called an “allowed” transition, and it happens quickly, nanoseconds or faster. If a transition violates these selection rules, it’s called a “forbidden” transition. Now, “forbidden” doesn’t mean impossible (quantum mechanics is probabilistic, not deterministic); it means the probability per unit time is much, much lower. A forbidden transition might take milliseconds or even seconds instead of nanoseconds.</p>

<p>These forbidden transitions create metastable states. The electron is in an excited state, but the only way down violates selection rules, so the electron gets stuck there for a while.</p>

<p>Different atoms and molecules have different arrangements of energy levels. In most materials, all the downward transitions are allowed, so electrons fall back down quickly, and you can’t build up population inversion. But in certain materials, ruby, helium-neon, CO₂, rare earth ions, and others, the energy level structure happens to have states where the downward transitions are forbidden or discouraged. These materials can support metastable states.</p>

<p>Finding materials with the right energy level structure is part science, part luck. Researchers calculate energy levels using quantum mechanics, measure them experimentally, and test which materials might work as laser media. Not every material works. In fact, most don’t. The material needs the right combination of properties: energy levels that can be easily pumped, a metastable state with a long enough lifetime, a lasing transition that emits at a useful wavelength, and the material needs to be transparent to both the pump light and the laser light.</p>

<h3 id="how-ruby-works-a-three-level-system">How Ruby Works: A Three-Level System</h3>

<p>Let’s look more closely at ruby to understand how metastable states work in practice. Ruby is aluminum oxide (Al₂O₃), essentially sapphire, with about 0.05% of the aluminum atoms replaced by chromium ions (Cr³⁺). The term “doped” means that small amounts of one element are mixed into the crystal structure of another material; in this case, chromium is doped into aluminum oxide. The chromium ions are the ones that actually produce the laser light.</p>

<p>Here’s how the energy levels work in chromium:</p>

<p>The chromium ion has a ground state where the electrons are in their lowest energy configuration, happy and stable. It has several high energy levels that electrons can be excited to; these are called the pump bands because they’re where the pump light sends the electrons. And crucially, it has a metastable state sitting in between the ground state and the pump bands.</p>

<p>When you shine bright light from the flash lamp onto the ruby, the chromium ions absorb photons in the blue and green parts of the spectrum. Why blue and green? Because the energy of blue and green photons happens to match the energy difference between the ground state and the pump bands. The electrons jump from the ground state up to those high energy pump bands. So far so good, you’re pumping energy into the system.</p>

<p>Now here’s the clever part: those high energy pump band states are not metastable. Electrons in those pump bands very quickly (in picoseconds) drop down to the metastable state. But this transition doesn’t emit light. Instead, the energy is transferred to vibrations in the crystal lattice: the atoms in the ruby crystal start vibrating more. This is heat. The crystal gets a little warmer, and the electron has moved from the pump band to the metastable state.</p>

<p>And now the electron sits there in the metastable state. It can stay there for milliseconds, which is an eternity on atomic timescales. This is plenty of time for you to pump more and more chromium ions into this metastable state. The flash lamp keeps firing, more electrons jump to pump bands, they drop to the metastable state, and the population in the metastable state builds up.</p>

<p>Eventually, you can get more chromium ions with electrons in the metastable state than in the ground state. You’ve achieved population inversion between the metastable state and the ground state.</p>

<p>Now, when one electron spontaneously drops from the metastable state to the ground state, it emits a red photon at 694.3 nanometers. That photon travels through the crystal, and if it encounters another chromium ion with an electron in the metastable state, it can trigger stimulated emission. Now you have two identical red photons. Those two photons can trigger more stimulated emission. The process cascades, and you get your laser.</p>

<p>This is called a three-level system: ground state, pump bands at high energy, and metastable state at intermediate energy. The key is that you pump to the high levels, electrons quickly relax to the metastable state without emitting light, and then they lase from the metastable state down to the ground state.</p>

<div style="background: white; border-radius: 8px; padding: 24px; margin: 32px 0; box-shadow: 0 2px 8px rgba(0,0,0,0.1);">
    <h3 style="margin-top: 0; font-size: 20px; font-weight: 600; color: #1a1a1a;">Energy Levels: Metastable States</h3>
    <canvas id="energyLevels" width="800" height="400" style="display: block; width: 100%; border-radius: 4px; background: #ffffff;"></canvas>
    <p style="margin-top: 12px; font-size: 14px; color: #666; line-height: 1.5;">
        Electrons (blue dots) absorb pump light and jump to high energy states, quickly drop to the metastable state (middle line), then eventually lase back to ground state (bottom). The metastable state is where electrons accumulate.
    </p>
</div>
<script>
(function() {
    const canvas = document.getElementById('energyLevels');
    const ctx = canvas.getContext('2d');

    class Electron {
        constructor() {
            this.x = 200 + Math.random() * 400;
            this.targetLevel = 0;
            this.currentLevel = 0;
            this.y = this.getLevelY(0);
            this.waitTime = Math.random() * 100;
        }
        getLevelY(level) {
            if (level === 0) return 320;
            if (level === 1) return 220;
            if (level === 2) return 120;
            return 320;
        }
        update() {
            this.waitTime--;
            const targetY = this.getLevelY(this.targetLevel);
            if (Math.abs(this.y - targetY) > 1) {
                this.y += (targetY - this.y) * 0.1;
            } else {
                this.y = targetY;
                this.currentLevel = this.targetLevel;
            }
            if (this.waitTime <= 0) {
                if (this.currentLevel === 0) {
                    if (Math.random() < 0.02) { this.targetLevel = 2; this.waitTime = 5; }
                } else if (this.currentLevel === 2) {
                    this.targetLevel = 1; this.waitTime = 100 + Math.random() * 100;
                } else if (this.currentLevel === 1) {
                    this.targetLevel = 0; this.waitTime = 50 + Math.random() * 50;
                }
            }
        }
        draw(ctx) {
            ctx.beginPath();
            ctx.arc(this.x, this.y, 4, 0, Math.PI * 2);
            ctx.fillStyle = '#007aff';
            ctx.fill();
        }
    }

    const electrons = Array.from({length: 30}, () => new Electron());

    function drawLevels() {
        ctx.clearRect(0, 0, 800, 400);

        const levels = [
            {y: 120, label: 'Pump Band (High Energy)', color: '#5856d6'},
            {y: 220, label: 'Metastable State', color: '#ff9500'},
            {y: 320, label: 'Ground State', color: '#34c759'}
        ];

        levels.forEach(level => {
            ctx.strokeStyle = level.color;
            ctx.lineWidth = 2;
            ctx.beginPath();
            ctx.moveTo(150, level.y);
            ctx.lineTo(650, level.y);
            ctx.stroke();
            ctx.fillStyle = '#1a1a1a';
            ctx.font = '13px sans-serif';
            ctx.fillText(level.label, 660, level.y + 5);
        });

        ctx.strokeStyle = '#8e8e93';
        ctx.lineWidth = 1;
        ctx.setLineDash([3, 3]);

        ctx.beginPath();
        ctx.moveTo(100, 320);
        ctx.lineTo(100, 120);
        ctx.stroke();
        ctx.fillStyle = '#8e8e93';
        ctx.font = '11px sans-serif';
        ctx.fillText('Pump', 70, 220);

        ctx.beginPath();
        ctx.moveTo(120, 120);
        ctx.lineTo(120, 220);
        ctx.stroke();
        ctx.fillText('Fast', 90, 170);

        ctx.strokeStyle = '#ff3b30';
        ctx.lineWidth = 2;
        ctx.beginPath();
        ctx.moveTo(700, 220);
        ctx.lineTo(700, 320);
        ctx.stroke();
        ctx.fillStyle = '#ff3b30';
        ctx.fillText('Laser', 710, 270);

        ctx.setLineDash([]);

        electrons.forEach(e => { e.update(); e.draw(ctx); });
    }

    function animate() {
        drawLevels();
        requestAnimationFrame(animate);
    }

    animate();
})();
</script>

<h3 id="other-laser-media-different-tricks-same-principle">Other Laser Media: Different Tricks, Same Principle</h3>

<p>Different laser materials use different energy level schemes, but they all rely on having some kind of metastable state or equivalent.</p>

<p>In a helium-neon laser, helium atoms are excited by an electrical discharge, high voltage causes electrons to flow through the gas, colliding with and exciting the helium atoms. The excited helium atoms have energy levels that happen to closely match certain excited levels in neon. When an excited helium atom collides with a neon atom, it can transfer its energy through a collision, leaving the neon atom in an excited state. These excited neon states are metastable. Population inversion builds up in the neon, and the neon lases.</p>

<p>In a CO₂ laser, nitrogen molecules are excited by the electrical discharge. The excited nitrogen molecules are very stable in certain vibrational states and hold onto their energy for a long time. When they collide with CO₂ molecules, they transfer energy to a specific vibrational mode of the CO₂ molecule, the molecule vibrates in a way where the three atoms move asymmetrically. This vibrational mode is metastable. The CO₂ molecules build up population inversion and lase in the infrared.</p>

<p>In semiconductor lasers, the physics is a bit different because you’re dealing with bands of energy levels rather than discrete atomic levels, but the principle is the same: you create conditions where there are more electrons in higher energy states ready to emit photons than there are in lower energy states ready to absorb them.</p>

<p>In fiber lasers, rare earth ions like ytterbium or erbium are embedded in glass. Again, this is doping: small amounts of these rare earth elements are mixed into the glass material. These ions have metastable states. You pump them with other lasers, electrons move to metastable states, population inversion builds up, and they lase.</p>

<p>The unifying principle is always: you need metastable states or something equivalent so that you can build up population inversion faster than spontaneous emission depletes it. Without metastable states, building a laser becomes extraordinarily difficult or impossible.</p>

<h2 id="part-6-where-does-the-energy-come-from-conservation-laws-still-apply">Part 6: Where Does the Energy Come From? (Conservation Laws Still Apply)</h2>

<p>If a laser beam is incredibly powerful and intense, where is all that energy coming from? Energy can’t be created from nothing, so what’s the source?</p>

<p>The answer is straightforward but important to understand: all the energy in the laser beam comes from whatever is pumping the laser. You’re not creating energy. You’re converting energy from one form to another and concentrating it.</p>

<h3 id="energy-bookkeeping">Energy Bookkeeping</h3>

<p>Let’s trace the energy flow in a ruby laser:</p>

<p>You have a flash lamp powered by electricity. Let’s say you put in 1000 joules of electrical energy into the flash lamp capacitor. The flash lamp converts that electrical energy into light, though not with perfect efficiency. Maybe 50% becomes light and 50% becomes heat in the flash lamp itself. So now you have 500 joules of light energy radiating from the flash lamp in all directions.</p>

<p>This light shines on the ruby rod. The ruby absorbs some of this light, specifically the blue and green wavelengths that match its absorption bands. But not all the light is absorbed. Some reflects off the surface, some passes through without being absorbed, some is the wrong wavelength for the chromium atoms to absorb. Maybe only 100 joules actually get absorbed by chromium ions in the ruby.</p>

<p>Those 100 joules of absorbed energy excite chromium ions, pumping electrons up to high energy pump band states. Those electrons then drop to the metastable state, giving off the extra energy as heat, the vibrations in the crystal lattice. Then eventually, through stimulated emission, the electrons drop from the metastable state to the ground state, emitting red photons.</p>

<p>But here’s the thing: the red photons emitted have less energy than the blue and green photons that were absorbed. The energy of a photon is proportional to its frequency, and blue light has higher frequency than red light. The energy difference went into heat when the electrons dropped from the pump bands to the metastable state. So maybe only 30 joules actually come out as red laser light.</p>
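<p>You can put a number on that per-photon heat tax. The sketch below assumes a green pump photon around 530 nm (a representative value I’m choosing, not one from the text) against ruby’s 694.3 nm output:</p>

```javascript
// Quantum defect: the gap between pump photon energy and laser photon
// energy ends up as heat. Photon energy E = hc/λ.
const h = 6.626e-34; // Planck's constant, J·s
const c = 2.998e8;   // speed of light, m/s

const pumpNm = 530;    // assumed green pump photon
const laserNm = 694.3; // ruby's lasing wavelength

const ePump = (h * c) / (pumpNm * 1e-9);   // ≈ 3.7e-19 J
const eLaser = (h * c) / (laserNm * 1e-9); // ≈ 2.9e-19 J

// Fraction of each absorbed photon's energy that becomes lattice heat.
// Since E ∝ 1/λ, this is simply 1 − λ_pump/λ_laser:
const heatFraction = 1 - eLaser / ePump; // ≈ 0.24
console.log(heatFraction);
```

<p>Roughly a quarter of every absorbed green photon’s energy is surrendered as heat before lasing even begins, and that’s before counting any of the other losses.</p>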

<p>And of those 30 joules, not all of it comes out as the useful laser beam. Some photons spontaneously emit in random directions and escape from the sides of the rod. Some energy is absorbed by impurities or defects in the ruby. Some energy is lost in the mirrors, which aren’t perfectly reflective. Maybe you get 10 joules out as an actual laser pulse.</p>

<p>So you put in 1000 joules of electricity, and you get 10 joules of laser light. That’s 1% efficiency. The other 990 joules became heat in various parts of the system.</p>
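<p>The whole chain of losses can be written as a running product. Every percentage here is just the rough estimate from the paragraphs above, not a measured value:</p>

```javascript
// Energy bookkeeping for the ruby laser example, step by step.
let energyJ = 1000;  // electrical energy stored in the flash lamp capacitor
energyJ *= 0.50;     // flash lamp converts ~50% to light        → 500 J
energyJ *= 0.20;     // ruby absorbs only matching wavelengths   → 100 J
energyJ *= 0.30;     // quantum defect and related losses        →  30 J
energyJ *= 1 / 3;    // spontaneous emission, defects, mirrors   →  10 J

const efficiency = energyJ / 1000;
console.log(energyJ, efficiency); // ~10 J out of 1000 J: 1% efficient
```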

<p>Early ruby lasers were indeed about 1% efficient. Modern fiber lasers convert pump light into laser light with 70-80% efficiency (their overall wall-plug efficiency, counting the electricity that drives the pump diodes, is lower but still remarkable), but the principle is the same: you’re not creating energy, you’re converting it from one form (electrical or optical pump energy) to another (laser light), and some is always lost as heat.</p>

<h3 id="what-lasers-actually-do-energy-concentration">What Lasers Actually Do: Energy Concentration</h3>

<p>So what is the special thing a laser does? It’s not creating energy from nothing. What it’s doing is concentrating energy spatially, temporally, and spectrally.</p>

<p>Think about it this way: the flash lamp radiates light in all directions. That light is spread out over the entire surface area of the ruby rod and the surrounding space. The light is also spread out over time: the flash lamp might fire for a few milliseconds. And the light is spread out over many wavelengths: blue, green, yellow, all the colors the flash lamp produces, each carrying some of the energy.</p>

<p>The laser takes energy that was spread out in space, time, and wavelength, and concentrates it. The output laser beam travels in essentially one direction, not spreading out in all directions like the flash lamp’s light. It’s one wavelength, not a smear across the spectrum. And it can be concentrated in time: some lasers produce pulses that are only nanoseconds or even femtoseconds long, which means all that energy is released in an incredibly short duration.</p>

<p>This concentration is what gives lasers their power. A 1-watt laser doesn’t produce more total power than a 1-watt light bulb. Both are converting 1 watt of input energy into light. But the laser concentrates that power into a tiny spot, into a brief duration, into one pure wavelength. That concentration is what allows it to do things like cut through metal.</p>
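<p>A quick comparison makes the point. Both sources below emit 1 watt; the difference is purely where that watt lands. The 1-meter sphere for the bulb and the 0.1 mm focused spot for the laser are illustrative numbers I’m assuming, not figures from the text:</p>

```javascript
// Same power, wildly different intensity (power per unit area).
const powerW = 1.0;

// Bulb: 1 W spread over a sphere of radius 1 m.
const bulbAreaM2 = 4 * Math.PI * 1.0 ** 2;
// Laser: 1 W focused to a spot of radius 0.1 mm.
const spotAreaM2 = Math.PI * (0.1e-3) ** 2;

const bulbIntensity = powerW / bulbAreaM2; // ≈ 0.08 W/m²
const spotIntensity = powerW / spotAreaM2; // ≈ 3.2e7 W/m²

console.log(spotIntensity / bulbIntensity); // same watt, hundreds of millions of times more concentrated
```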

<p>It’s analogous to how a magnifying glass works. A magnifying glass doesn’t create heat energy from nothing. It takes the sun’s light that would normally be spread over a large area and concentrates it onto a tiny spot. That concentration of the same total energy is what can start a fire, even though no new energy was created.</p>

<p>The laser does something similar, but instead of using a lens to spatially concentrate already-existing light, it uses the physics of stimulated emission to create light that’s already concentrated, monochromatic, directional, and coherent from the moment it’s generated.</p>

<h2 id="part-7-the-heat-problem-and-why-doesnt-the-laser-melt">Part 7: The Heat Problem (And Why Doesn’t the Laser Melt?)</h2>

<p>Another important question: if photons are bouncing back and forth inside the cavity, building up intensity, why doesn’t the laser medium just get incredibly hot or even melt?</p>

<p>The answer has several parts.</p>

<h3 id="energy-flow-in-the-laser">Energy Flow in the Laser</h3>

<p>First, let’s track where heat is actually generated. You’re pumping energy into the laser medium, whether that’s flash lamp light, electrical discharge, or pump laser light. This energy excites atoms or ions or molecules into higher energy states.</p>

<p>Some of this pump energy goes directly into heat through a process called quantum defect heating. Remember the ruby laser: electrons jump to high pump bands when they absorb blue-green photons, then drop to the metastable state, releasing the difference as heat through vibrations in the crystal lattice. Then they lase from the metastable state, producing red photons. You’re putting in blue-green photons (higher energy) and getting out red photons (lower energy). The difference becomes heat in the material.</p>

<p>Next, not all the excited atoms undergo stimulated emission. Some undergo spontaneous emission instead, releasing photons in random directions that don’t contribute to the laser beam. These photons are eventually absorbed somewhere in the system and become heat.</p>

<p>The mirrors aren’t perfect. Each time a photon hits a mirror, a tiny fraction is absorbed rather than reflected. This absorption heats up the mirror coatings.</p>

<p>So yes, the laser is constantly generating heat as it operates. Running a laser does heat up the laser medium and the surrounding components. This is a real engineering challenge.</p>

<h3 id="cooling-and-thermal-management">Cooling and Thermal Management</h3>

<p>For low-power lasers, like a small helium-neon laser or a laser pointer with a semiconductor laser, the heat generation is small enough that it can be dissipated to the surrounding air through natural convection and radiation. The device might get warm to the touch, but not dangerously hot.</p>

<p>For high-power lasers, cooling becomes critical. Ruby lasers and other pulsed solid-state lasers often use water cooling. The flash lamp and the ruby rod are both cooled by flowing water through channels in the housing. You can only fire the laser in short pulses, and then you have to wait for the heat to dissipate before firing again. Fire too quickly, and the rod will overheat and potentially crack.</p>

<p>CO₂ lasers often have gas flowing continuously through the tube. Fresh, cool gas is constantly supplied at one end, and hot gas is removed at the other end. This convective flow carries away the heat deposited in the gas by the electrical discharge.</p>

<p>For industrial fiber lasers producing kilowatts of continuous power, thermal management is a major part of the design. But fiber lasers have a big advantage here, which we’ll discuss in detail later: their thin geometry means they have a huge surface-area-to-volume ratio, so heat can escape efficiently from the core to the surface.</p>
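<p>The geometry argument is easy to quantify. For a long cylinder, side surface area over volume is 2πrL / (πr²L) = 2/r, so the ratio scales inversely with radius. The radii below (a roughly 10 µm fiber core versus a 5 mm rod) are typical orders of magnitude I’m assuming, not figures from the text:</p>

```javascript
// Surface-area-to-volume ratio of a long cylinder is 2/r (side area only).
const surfacePerVolume = (radiusM) => 2 / radiusM;

const fiberSav = surfacePerVolume(10e-6); // thin fiber core, r ≈ 10 µm
const rodSav = surfacePerVolume(5e-3);    // solid-state rod, r ≈ 5 mm

console.log(fiberSav / rodSav); // the fiber has ~500× more surface per unit volume
```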

<h3 id="why-the-intense-light-doesnt-cause-direct-heating">Why the Intense Light Doesn’t Cause Direct Heating</h3>

<p>Now, about the intense light bouncing around inside the cavity: yes, there are billions of photons inside the cavity, but they’re generally not being absorbed by the laser medium in a way that creates heat. That’s the whole point.</p>

<p>While population inversion exists, the laser medium is essentially transparent to photons at the laser wavelength. If a photon encounters an atom with an electron in the ground state, it could potentially be absorbed. But remember, we’ve achieved population inversion, so most atoms have electrons in the excited state, not the ground state. If a photon encounters an atom with an electron in the excited state, it causes stimulated emission, releasing another identical photon. The photon isn’t absorbed and converted to heat; instead, it’s cloned.</p>

<p>So the laser photons bouncing back and forth aren’t directly heating up the medium. They’re either triggering stimulated emission (which adds more photons rather than removing them and converting them to heat) or they’re passing through without interaction.</p>

<p>The heat comes primarily from the pumping process (the quantum defect heating where higher energy pump photons are converted to lower energy laser photons with the difference becoming heat) and from non-ideal efficiencies throughout the system, not from the laser light itself being absorbed by the medium.</p>

<p>That said, in very high-power lasers, there can be issues. If any impurities or defects in the laser medium absorb even a tiny fraction of the laser light, that spot can heat up dramatically and potentially damage the material. This is why high-power laser materials need to be extremely pure and defect-free. Even a tiny impurity that absorbs one part per million of the circulating laser power can create a hot spot that grows and damages the crystal.</p>

<h3 id="the-optical-damage-threshold">The Optical Damage Threshold</h3>

<p>There’s also a fundamental limit to how intense light can get before it starts to damage materials through nonlinear effects. At extremely high intensities, the electric field of the light itself becomes so strong that it can rip electrons away from atoms, creating a plasma. This is called optical breakdown, and it permanently damages the material.</p>

<p>So there is an ultimate limit to how much power you can build up inside a laser cavity before you start causing damage. But for most lasers operating under normal conditions, the intensity inside the cavity is well below this damage threshold. The engineering challenge is managing the heat from the pumping inefficiencies, not from the circulating laser light itself.</p>

<h2 id="part-8-the-evolution-of-lasers-chronological-journey-down-the-iceberg">Part 8: The Evolution of Lasers (Chronological Journey Down the Iceberg)</h2>

<p>After Maiman’s ruby laser proved the concept in 1960, researchers realized they could apply the same principles using different materials and different pumping methods. Each new type of laser opened up new applications and solved different problems.</p>

<h3 id="helium-neon-laser-1961">Helium-Neon Laser (1961)</h3>

<p>Just a year after the ruby laser, Ali Javan, William Bennett, and Donald Herriott at Bell Labs created the first gas laser using a mixture of helium and neon. This was a major breakthrough because it was the first continuous wave laser, meaning it produced a steady beam rather than pulses.</p>

<p>Instead of a solid crystal, they used a glass tube filled with helium and neon gas at low pressure. Instead of using a flash lamp for pumping, they ran a high voltage through the gas, creating an electrical discharge. If you’ve ever seen a neon sign, you know what this looks like: the gas glows as the electricity flows through it.</p>

<p>The discharge excites helium atoms by giving their electrons enough energy to jump to higher energy levels through collisions with the flowing electrons. These excited helium atoms then collide with neon atoms. Due to a fortunate coincidence, certain excited energy levels in helium have almost exactly the same energy as certain excited levels in neon. When an excited helium atom collides with a ground-state neon atom, the energy can transfer from the helium to the neon through what’s called collisional energy transfer, leaving the neon atom in an excited state.</p>

<p>These excited neon states are metastable, so the neon atoms accumulate in these excited states, creating population inversion. The neon atoms then undergo stimulated emission, producing red light at 632.8 nanometers, which is the characteristic color of helium-neon lasers.</p>

<p><img src="/assets/images/laser-deep-dive/hene-laser.webp" alt="A helium-neon laser in operation, glowing orange-red in a darkened lab." />
<em>A helium-neon laser. The orange glow in the tube is the gas discharge; the actual laser beam is the thin red line exiting to the left. Photo: <a href="https://commons.wikimedia.org/wiki/File:Henelaser.webp">Wikimedia Commons</a>, CC BY-SA 3.0.</em></p>

<p>Helium-neon lasers became the workhorses of many applications because they were relatively cheap, reliable, and produced a stable, continuous beam. For decades, the laser pointer in every physics lecture hall and the barcode scanner in every grocery store used HeNe lasers. They’re still used today for alignment tasks in laboratories and in surveying equipment, though they’re increasingly being replaced by semiconductor lasers which are even cheaper and more compact.</p>

<h3 id="co-laser-1964">CO₂ Laser (1964)</h3>

<p>Kumar Patel at Bell Labs developed the carbon dioxide laser, and it turned out to be incredibly powerful and efficient compared to earlier lasers. This laser uses a gas mixture of carbon dioxide, nitrogen, and helium.</p>

<p>The nitrogen molecules in the mixture are excited by an electrical discharge, similar to the HeNe laser. Nitrogen molecules have vibrational modes (the two nitrogen atoms can vibrate like they’re connected by a spring) that can store energy. When nitrogen molecules are excited into a particular vibrational mode, they hold onto that energy for a very long time because there’s no easy way for the molecule to release the energy. The transition that would allow the nitrogen to de-excite is quantum mechanically discouraged.</p>

<p>When these excited nitrogen molecules collide with CO₂ molecules, they can transfer their energy to a specific vibrational mode of the CO₂ molecule. The CO₂ molecule can vibrate in several different ways: the two oxygen atoms can move symmetrically outward and inward (symmetric stretch), they can move asymmetrically with one moving out while the other moves in (asymmetric stretch), or the molecule can bend. The nitrogen excites the CO₂ into the asymmetric stretch mode, and this mode is metastable enough to build up population inversion.</p>

<p>The CO₂ molecules then lase when they drop from the asymmetric stretch mode to lower vibrational states, producing infrared light at 10,600 nanometers. You can’t see this with your eyes (it’s far into the infrared) but it’s incredibly useful for industrial applications.</p>

<p>CO₂ lasers can be extraordinarily efficient. They can convert 20% or more of the input electrical energy into laser light, which is far better than most other laser types at the time. They can also produce enormous amounts of power. Industrial CO₂ lasers can output tens of kilowatts continuously, and the most powerful ones can exceed 100 kilowatts.</p>

<p>The infrared wavelength is strongly absorbed by most organic materials and many metals, making it ideal for cutting and welding. If you’ve ever seen a laser cutter at a makerspace slicing through plywood or acrylic, there’s a good chance it’s a CO₂ laser. Industrial CO₂ lasers cut through steel plates inches thick. They’re used extensively in manufacturing, in the automotive industry, in aerospace, anywhere precision cutting is needed. A 10-kilowatt CO₂ laser (which would fit in a room about the size of a garage) can cut through 25mm of steel. That’s an inch of solid steel, sliced like butter.</p>

<p><img src="/assets/images/laser-deep-dive/laser-cutting.webp" alt="A laser cutting machine slicing through a metal sheet, throwing off bright sparks." />
<em>An industrial laser cutting through metal. Photo: <a href="https://commons.wikimedia.org/wiki/File:Laser_cutting_machine.webp">Wikimedia Commons</a>, CC BY-SA 3.0.</em></p>

<h3 id="dye-lasers-1966">Dye Lasers (1966)</h3>

<p>Peter Sorokin and J. R. Lankard at IBM, working independently from F. P. Schäfer in Germany, created lasers using organic dye molecules dissolved in liquid solvents. This was a completely different approach from solid crystals or gases.</p>

<p>A dye laser uses a solution of organic dye molecules. These are complex molecules with lots of carbon rings, similar in structure to the colorful dyes used in fabric or ink; in fact, some laser dyes are related to textile dyes. The dye gives the solution a vivid color because it absorbs certain wavelengths of light. This dye solution flows through a glass cell, continuously circulated by a pump to prevent overheating in one spot.</p>

<p>You pump the dye optically, either with another laser or with a flash lamp, shining intense light into the cell. The dye molecules absorb the pump light, exciting electrons to higher energy states. What makes organic dye molecules special is their complex molecular structure. They have many closely-spaced energy levels, almost forming a quasi-continuum rather than discrete levels. When excited electrons fall back down, they can emit photons over a range of wavelengths rather than just one specific wavelength.</p>

<p>Here’s what makes dye lasers unique and valuable: by placing wavelength-selective optical elements in the laser cavity (specifically, diffraction gratings or prisms that reflect different wavelengths at different angles) you can select which wavelength gets amplified. By rotating the grating, you can tune the laser’s output wavelength. Want blue light? Rotate the grating one way. Want yellow? Rotate it another way. You can smoothly tune across a range of wavelengths, typically spanning 30-50 nanometers for a given dye.</p>

<p>Different dyes cover different wavelength ranges. Rhodamine 6G, one of the most common laser dyes, can be tuned from about 560 to 640 nanometers, covering yellow-orange-red. Coumarin dyes cover blue-green wavelengths. By switching dyes and adjusting the cavity, you can access wavelengths from the near-ultraviolet all the way to the near-infrared.</p>

<p>This tunability revolutionized spectroscopy, the study of how light interacts with matter. Researchers could tune the laser to exactly the wavelength needed to probe specific molecular transitions, allowing them to study energy levels, chemical bonds, and molecular dynamics with unprecedented precision. Dye lasers were used extensively in chemistry and physics research, in medical diagnostics, and even in isotope separation for nuclear applications. Though they’ve been largely superseded by tunable solid-state lasers and optical parametric oscillators, which are more convenient and don’t require flowing liquid dyes, dye lasers were crucial for several decades and are still used in some specialized applications.</p>

<h3 id="semiconductor-lasers-1962-1970s">Semiconductor Lasers (1962-1970s)</h3>

<p>This is where lasers became small, cheap, and ubiquitous. Robert Hall at General Electric created the first semiconductor laser in 1962, just two years after the ruby laser, but it took another decade of development before they became practical for everyday use.</p>

<p>A semiconductor laser, also called a laser diode, uses a semiconductor material, typically gallium arsenide or related compounds like indium phosphide or gallium nitride. You create a p-n junction, which is the same basic structure used in regular diodes and transistors. On one side of the junction, the semiconductor is doped (meaning impurities are intentionally added) to have excess electrons (n-type), and on the other side, it’s doped to have excess holes, which are the absence of electrons that act like positive charge carriers (p-type).</p>

<p>When you apply electrical current in the forward direction across this junction, electrons from the n-type side and holes from the p-type side are pushed toward the junction. At the junction, electrons and holes meet and recombine. When an electron falls into a hole, it drops from a higher energy state in the conduction band to a lower energy state in the valence band, and the energy difference is released as a photon.</p>

<p>At low current, this process just produces incoherent light, which is how an LED works. But at high enough current density, something remarkable happens. The junction region becomes filled with electrons waiting to recombine and holes waiting to be filled. This creates population inversion between the conduction band and valence band. Now when a photon is emitted, it can trigger stimulated emission from other electron-hole pairs in the junction.</p>

<p>The semiconductor chip itself forms the optical cavity. The end faces of the chip are cleaved along natural crystal planes, creating smooth, flat surfaces. The difference in refractive index between the semiconductor (which has a high refractive index, typically around 3.5) and the air outside creates a partial reflection at these surfaces, not as strong as a metal mirror, but typically around 30% reflective, which is enough. Photons bounce back and forth between these end faces, building up intensity through stimulated emission, until laser light emerges.</p>
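<p>That 30% figure isn’t arbitrary: it follows from the Fresnel equation for reflection at normal incidence between two media. A quick sanity check, assuming a refractive index of about 3.5 for gallium arsenide:</p>

```python
# Fresnel reflectance at normal incidence between media with
# refractive indices n1 and n2: R = ((n1 - n2) / (n1 + n2))^2
def fresnel_reflectance(n1, n2):
    return ((n1 - n2) / (n1 + n2)) ** 2

# Cleaved GaAs facet (n ~ 3.5) against air (n = 1)
R = fresnel_reflectance(3.5, 1.0)
print(f"Facet reflectance: {R:.1%}")  # about 31%
```

<p>So the bare crystal facet reflects roughly three photons in ten back into the cavity, no mirror coating required.</p>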

<p>Early semiconductor lasers had to be cooled to cryogenic temperatures (liquid nitrogen, 77 Kelvin) to work, which made them impractical. The problem was that at room temperature, too many carriers were thermally excited, making it hard to maintain population inversion. But through the 1960s and 1970s, researchers developed better materials and structures. They created double heterostructure designs, where layers of different semiconductor materials with different band gaps are stacked together. These structures confine the electrons, holes, and light to a tiny active region only a few hundred nanometers thick. This dramatically improved efficiency and reduced the threshold current needed to achieve lasing, eventually allowing the lasers to work at room temperature with practical current levels.</p>

<p>By the 1980s, semiconductor lasers were cheap enough and practical enough for consumer applications. They’re incredibly compact, the entire laser chip is typically smaller than a grain of rice, sometimes just a few hundred micrometers on a side. They’re efficient, often converting 30-50% of electrical energy to light, and can run on battery power. They can be modulated on and off extremely quickly, at gigahertz frequencies, which means you can encode billions of bits per second by rapidly switching the laser.</p>

<p>Semiconductor lasers gave us CD players and DVD players, where the laser reads the microscopic pits encoded on the spinning disk. They gave us fiber optic communication, where laser diodes send pulses of light carrying data through fiber cables at incredible speeds. They’re in laser printers, in barcode scanners, in laser pointers, in computer mice, in range-finding sensors for autonomous vehicles. They’re everywhere. The semiconductor laser is probably the most important laser technology in terms of everyday impact on modern life.</p>

<p><img src="/assets/images/laser-deep-dive/fiber-cables.webp" alt="Fiber optic cables plugged into networking equipment in a data center." />
<em>Semiconductor lasers at work: each of these fiber optic cables carries data as pulses of laser light. Photo: Brett Sayles / <a href="https://www.pexels.com/photo/close-up-photo-of-cables-plugged-into-the-server-2420212/">Pexels</a></em></p>

<h3 id="excimer-lasers-1970s">Excimer Lasers (1970s)</h3>

<p>These ultraviolet lasers use excited dimers (hence the name “excimer,” a contraction of “excited dimer”), which are molecules that only exist in excited states and fall apart when they emit light.</p>

<p>Typical excimers are combinations of a noble gas and a halogen: argon fluoride (ArF), krypton fluoride (KrF), or xenon chloride (XeCl). In the ground state, these atoms don’t bond to each other. A noble gas like argon doesn’t normally form chemical bonds, and it certainly doesn’t bond with a fluorine atom. But when they’re both excited, they can temporarily bond to form an excimer molecule.</p>

<p>You create these excimers by sending a pulsed electrical discharge through a gas mixture at high pressure. The discharge creates excited atoms and ions, which then bond to form excimer molecules. These excited molecules are inherently unstable. They quickly emit a photon and dissociate back into separate atoms.</p>

<p>Because the lower state doesn’t exist as a stable molecule (the atoms simply don’t bond in the ground state) you automatically have population inversion. All the excimers are in the excited state, and there’s essentially no ground state population to absorb photons. This makes excimer lasers very efficient at achieving population inversion.</p>

<p>Excimer lasers produce ultraviolet light at specific wavelengths depending on which excimer you use: ArF at 193 nanometers, KrF at 248 nanometers, XeCl at 308 nanometers. These ultraviolet wavelengths are absorbed very strongly by most materials and have enough energy to break chemical bonds directly.</p>
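<p>You can check the “enough energy to break chemical bonds” claim with the photon energy formula E = hc/λ. A typical carbon-carbon single bond holds about 3.6 electronvolts; here’s the back-of-the-envelope arithmetic:</p>

```python
# Photon energy E = h*c / lambda, expressed in electronvolts.
H = 6.626e-34   # Planck constant, J*s
C = 2.998e8     # speed of light, m/s
EV = 1.602e-19  # joules per electronvolt

def photon_energy_ev(wavelength_nm):
    return H * C / (wavelength_nm * 1e-9) / EV

for name, nm in [("ArF", 193), ("KrF", 248), ("XeCl", 308)]:
    print(f"{name} at {nm} nm: {photon_energy_ev(nm):.2f} eV")
# ArF ~6.4 eV, KrF ~5.0 eV, XeCl ~4.0 eV: all above the ~3.6 eV
# needed to break a typical C-C single bond with a single photon.
```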

<p>This property makes excimer lasers ideal for precise material removal, a process called photoablation. The ultraviolet photons are absorbed in a very thin layer at the surface, and they break chemical bonds, essentially vaporizing the material atom by atom. The most famous application is LASIK eye surgery, where an excimer laser removes microscopic amounts of corneal tissue to reshape the eye and correct vision. A single pulse might remove only 0.25 micrometers of material, and because the energy is deposited in such a thin surface layer, the surrounding tissue is left undamaged.</p>

<p>Excimer lasers are also crucial in semiconductor manufacturing. They’re used in photolithography, the process of creating the intricate patterns on computer chips. The short ultraviolet wavelength allows extremely fine features to be patterned: the smaller the wavelength, the smaller the features you can create. Modern computer chips, with transistors measured in tens of nanometers, are manufactured using excimer lasers.</p>

<h2 id="part-9-the-deep-end-fiber-lasers-modern-magic">Part 9: The Deep End: Fiber Lasers (Modern Magic)</h2>

<p><img src="/assets/images/laser-deep-dive/fiber-optic-strands.webp" alt="Fiber optic strands with light glowing at their tips." />
<em>Photo: <a href="https://www.pexels.com/photo/fiber-optic-light-3363556/">Pexels</a></em></p>

<p>Now we’re getting into truly advanced technology. Fiber lasers emerged as practical devices in the 1990s and 2000s as researchers figured out how to build high-power lasers using optical fibers as the gain medium. These represent some of the most sophisticated and powerful lasers available today.</p>

<h3 id="what-is-a-fiber-laser">What is a Fiber Laser?</h3>

<p>The core concept is elegant: instead of having a separate laser crystal or gas tube, and then coupling the laser beam into an optical fiber for transmission to where you need it, what if the fiber itself was the laser?</p>

<p>A fiber laser uses an optical fiber as the gain medium, the optical cavity, and the beam delivery system all in one. The fiber itself does everything.</p>

<h3 id="the-special-fiber">The Special Fiber</h3>

<p>Take an optical fiber, a thin strand of glass about 125 micrometers in diameter, about the thickness of a human hair. But this isn’t ordinary glass. The core of the fiber, the central region where light travels, is doped with rare earth elements, meaning small amounts of rare earth ions are mixed into the glass during manufacturing.</p>

<p>The most common rare earth dopant is ytterbium. Ytterbium ions (Yb³⁺) have the right kind of energy level structure to work as a laser medium. They have a ground state and a metastable excited state, and the energy difference corresponds to near-infrared light around 1030 to 1100 nanometers.</p>

<p>Erbium (Er³⁺) is another common choice, especially for telecommunications, because erbium lases at 1550 nanometers, which happens to be the wavelength where silica optical fibers have the lowest loss: light can travel the farthest through the fiber at this wavelength. Thulium (Tm³⁺) and holmium (Ho³⁺) are used for longer wavelengths around 2000 nanometers, which are useful for materials processing and medical applications.</p>

<p>The rare earth ions are the active laser medium, analogous to how chromium ions were the active medium in ruby. But instead of a rigid rod-shaped crystal a few centimeters long, the laser medium is now a flexible fiber that can be meters or even hundreds of meters long, coiled up to fit in a compact package.</p>

<h3 id="pumping-the-fiber">Pumping the Fiber</h3>

<p>You pump a fiber laser using semiconductor laser diodes, the technology I described earlier that’s now cheap and efficient thanks to decades of development. These pump lasers are relatively inexpensive compared to other pumping methods.</p>

<p>The pump lasers emit light at a shorter wavelength than the fiber laser output. For ytterbium-doped fiber, the pump lasers typically emit at 915 or 975 nanometers (which corresponds to absorption bands of ytterbium), and the fiber laser outputs at 1030-1100 nanometers.</p>
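<p>That small gap between pump and laser wavelengths matters more than it looks. The fractional energy difference, called the quantum defect, is the part of every pump photon that ends up as heat, and for ytterbium it is tiny. A rough comparison (the Nd:YAG wavelengths are standard textbook values, included just for contrast):</p>

```python
# Quantum defect: the fraction of each pump photon's energy that is
# lost as heat, 1 - (lambda_pump / lambda_laser).
def quantum_defect(pump_nm, laser_nm):
    return 1 - pump_nm / laser_nm

# Ytterbium fiber: pump at 975 nm, lase at 1030 nm
print(f"Yb fiber: {quantum_defect(975, 1030):.1%} of pump power becomes heat")
# A classic Nd:YAG scheme for comparison: pump at 808 nm, lase at 1064 nm
print(f"Nd:YAG:   {quantum_defect(808, 1064):.1%}")
```

<p>Roughly 5% waste heat versus roughly 24%: the fiber geometry gets credit for removing heat, but ytterbium’s energy levels deserve credit for generating so little of it in the first place.</p>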

<p>You couple these pump lasers into the fiber using special optical components. The pump light travels along the fiber, and as it propagates, the rare earth ions absorb the pump light. Their electrons jump to excited states. As you keep pumping energy in, you build up population inversion along the entire length of the fiber.</p>

<p>Because the fiber can be very long (sometimes 10 meters, 50 meters, or even more) you have a lot of length for the pump light to be absorbed and for population inversion to build up. This is one of the advantages of fiber lasers: you can scale up the power by making the fiber longer, giving you more gain medium.</p>

<h3 id="the-optical-cavity-fiber-bragg-gratings">The Optical Cavity: Fiber Bragg Gratings</h3>

<p>At each end of the active fiber, you need mirrors to create the optical cavity. But you can’t just glue metal mirrors onto the end of a fiber. Instead, you write Fiber Bragg Gratings directly into the fiber using a clever technique.</p>

<p>A Fiber Bragg Grating (FBG) is a periodic variation in the refractive index of the fiber core. Imagine making the glass slightly denser in some places and less dense in others, in a regular repeating pattern, like stripes running perpendicular to the fiber axis. You create this pattern by exposing the fiber to ultraviolet light through a special mask or using interfering UV beams. The UV light causes a permanent change in the glass structure, slightly altering its refractive index.</p>

<p>This periodic pattern acts as a highly wavelength-selective mirror through a process called Bragg reflection. Light at the Bragg wavelength (determined by the spacing of the periodic pattern) reflects back strongly because reflections from each period of the grating interfere constructively. Light at other wavelengths passes through with minimal reflection because the reflections interfere destructively.</p>
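<p>The Bragg condition itself is one line of math: the reflected wavelength is λ = 2·n·Λ, where n is the core’s effective refractive index and Λ is the grating period. A sketch, assuming n ≈ 1.45 for silica glass:</p>

```python
# Bragg condition: lambda_bragg = 2 * n_eff * period.
# Solve for the grating period needed to reflect a target wavelength.
def grating_period_nm(bragg_wavelength_nm, n_eff=1.45):
    return bragg_wavelength_nm / (2 * n_eff)

# Mirror for a 1064 nm ytterbium fiber laser:
print(f"{grating_period_nm(1064):.0f} nm")  # ~367 nm between stripes
```

<p>So the “stripes” in the glass sit a few hundred nanometers apart, and thousands of them add up to a near-perfect mirror at exactly one wavelength.</p>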

<p>You write one FBG to be highly reflective (99.9% or more) at the laser wavelength and another FBG to be partially reflective (maybe 90% reflective, 10% transmitting). These FBGs, written directly into the fiber as permanent modifications to the glass structure, act as the mirrors for your laser cavity.</p>

<p>The beautiful thing is that this is all integrated into the fiber. There are no separate optical elements to align. The entire laser (gain medium with rare earth ions, optical cavity defined by FBGs, everything) is built into a single piece of fiber that you can coil up.</p>

<h3 id="how-amplification-works">How Amplification Works</h3>

<p>Once you have population inversion in the ytterbium-doped fiber and the optical cavity defined by the FBGs at each end, the lasing process begins. Some spontaneous emission occurs from the excited ytterbium ions, and photons traveling along the fiber between the FBGs start to build up.</p>

<p>These photons travel along the fiber core, which acts as a waveguide keeping them confined. As they propagate, they pass by excited ytterbium ions and trigger stimulated emission. The light is amplified as it travels through the fiber. It reaches the FBG at one end, which reflects it back. The light travels back through the fiber in the opposite direction, gets amplified again by triggering more stimulated emission, reaches the other FBG, which partially reflects and partially transmits.</p>

<p>The transmitted portion that leaks through the partially reflective FBG is your output laser beam. The reflected portion continues bouncing back and forth, continuously building up intensity until a steady state is reached where the gain from stimulated emission equals the losses from the output coupler and various other small loss mechanisms.</p>

<h3 id="why-fiber-lasers-are-extraordinary">Why Fiber Lasers Are Extraordinary</h3>

<p>Fiber lasers have several remarkable advantages over traditional bulk lasers, advantages that come directly from their unique geometry and integrated design.</p>

<p>The first major advantage is exceptional heat dissipation. In a traditional rod-shaped laser crystal, heat is generated throughout the volume of the rod when atoms are pumped and when quantum defect heating occurs. This heat has to conduct outward to the surface of the rod to be removed. The rod has a relatively small surface-area-to-volume ratio, which limits how efficiently heat can be extracted. If you try to pump too much power into a bulk crystal, the center heats up more than the edges, creating temperature gradients that cause optical distortions through thermal lensing, and can even crack the crystal from thermal stress.</p>

<p>A fiber, on the other hand, is extremely thin, typically 125 micrometers in diameter. It has an enormous surface-area-to-volume ratio. Heat generated anywhere in the fiber core can quickly conduct the short distance to the surface and dissipate into the surrounding environment. You can coil the fiber and actively cool it very efficiently. This means you can pump much more power into a fiber laser per unit length without overheating problems.</p>

<p>Modern fiber lasers routinely produce multiple kilowatts of continuous power. Industrial fiber lasers are commercially available with output powers of 10 kW, 20 kW, 30 kW, and experimental systems have exceeded 100 kW. To put that in perspective, a 20-kilowatt fiber laser outputs as much power as ten electric kettles running flat out, except all of it is concentrated into a spot the size of a grain of sand. That’s enough concentrated power to cut through thick steel plates like butter, weld heavy components, or process materials at high speeds. Yet the fiber itself can be kept cool enough that you could touch the outside of the cladding (though the laser beam emerging from it would certainly be extremely dangerous).</p>

<p>The second major advantage is exceptional beam quality. Because the light is guided by the fiber’s waveguide structure, fiber lasers can maintain excellent beam quality. If you use a single-mode fiber, where the core is small enough that only one spatial mode can propagate, the laser output is fundamentally limited to a nearly perfect Gaussian beam with excellent focusability. This beam can be focused through a lens to a tiny spot, concentrating all that power into an incredibly small area.</p>

<p>Traditional high-power bulk lasers often suffer from thermal lensing, where heat-induced refractive index variations distort the beam, and from other aberrations that degrade beam quality as you scale up the power. Fiber lasers don’t have this problem because the light is confined to the small fiber core the entire time, and the excellent cooling keeps the fiber from developing thermal gradients.</p>

<p>The third advantage is remarkable efficiency. Fiber lasers are among the most efficient lasers ever built. They can convert 70% to 80% of the pump power (from the laser diodes) into laser output. Compare this to CO₂ lasers at around 20% efficiency or early ruby lasers at 1% efficiency. This high efficiency means less electricity is wasted as heat, cooling requirements are reduced, and operating costs are significantly lower over the lifetime of the laser.</p>

<p>The fourth advantage is robustness and reliability. Traditional bulk lasers have free-space optics (mirrors, lenses, laser crystals) that need precise alignment. If the laser gets bumped, if temperature changes cause thermal expansion, or if vibrations occur, the alignment can drift and the laser performance degrades or fails entirely. This requires frequent realignment and maintenance.</p>

<p>A fiber laser has no free-space optics to misalign. Everything is built into the fiber with FBGs written directly into the glass. The laser is inherently mechanically stable. It can handle vibration, temperature changes, and harsh industrial environments without performance degradation. Fiber lasers installed in factories can run 24 hours a day, 7 days a week for years with minimal maintenance, often just replacing pump diodes as they age.</p>

<p>And finally, fiber lasers are surprisingly compact for their power output. A multi-kilowatt fiber laser can fit in an enclosure the size of a small refrigerator, whereas a CO₂ laser with similar power might fill an entire room with its gas handling system, high voltage power supplies, cooling systems, and beam delivery optics.</p>

<h3 id="advanced-fiber-laser-architectures">Advanced Fiber Laser Architectures</h3>

<p>Once you understand the basic fiber laser, there are several sophisticated variations that push performance even further and enable new applications.</p>

<p>One common architecture is the Master Oscillator Power Amplifier, abbreviated MOPA. Instead of trying to generate all your power in a single laser cavity, you split the system into stages. The master oscillator is a low-power, high-quality fiber laser that generates a clean seed beam with excellent beam quality, narrow linewidth, and well-controlled spectral properties. This seed beam typically outputs milliwatts to watts of power.</p>

<p>This seed beam is then sent through one or more amplifier stages. An amplifier stage is simply a length of rare-earth-doped fiber that’s pumped by laser diodes, but without FBG mirrors forming a cavity. It doesn’t oscillate or lase on its own. Instead, the seed beam passes through, and as it does, it triggers stimulated emission in the pumped fiber, amplifying the signal.</p>

<p>You can chain multiple amplifier stages together. The first stage might amplify the seed from milliwatts to watts, the second stage from watts to hundreds of watts, and subsequent stages can boost it to kilowatts or even tens of kilowatts. Each stage increases the power while maintaining the beam quality and spectral properties of the master oscillator.</p>

<p>This MOPA architecture gives you much more control over the laser characteristics. You can control the pulse shape, duration, and repetition rate by modulating the master oscillator. You can achieve very high peak powers in pulsed operation. You can maintain excellent beam quality throughout the amplification process. And you can scale to high average powers by adding more amplifier stages.</p>

<p>Another important technique is cladding pumping. In the simple fiber laser I described earlier, both the pump light and the laser light travel through the core of the fiber. But there’s a clever variation that makes pumping much more practical: you launch the pump light into the cladding, the layer of glass surrounding the core.</p>

<p>The way this works requires a special double-clad fiber design. The fiber has a small core (maybe 10-20 micrometers diameter) doped with rare earth ions, surrounded by a larger cladding (maybe 125-400 micrometers diameter), and then an outer cladding with even lower refractive index. The inner cladding acts as a waveguide for the pump light, keeping it confined through total internal reflection.</p>

<p>As the pump light propagates through the cladding, bouncing around at various angles, it periodically passes through the doped core. Each time it passes through the core, some pump light is absorbed by the rare earth ions. Over the length of the fiber, all the pump light gets absorbed even though it’s traveling in the larger cladding rather than the core.</p>

<p>The advantage of cladding pumping is enormous. The cladding is much larger than the core, perhaps 100 times larger in cross-sectional area. It’s much easier to couple pump light into a larger area. You can use less expensive, lower brightness pump diodes. You can combine the output from many pump diodes, coupling them all into the cladding from different angles or from different points along the fiber.</p>
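<p>The price you pay is absorption length: only the sliver of pump light overlapping the core gets absorbed on each pass, so the effective absorption scales with the core-to-cladding area ratio and the fiber has to be correspondingly longer. A sketch with illustrative numbers (the 600 dB/m core absorption is an assumed figure, not a datasheet value):</p>

```python
# Effective pump absorption in a double-clad fiber is roughly the
# core material's absorption scaled by the core/cladding area ratio.
def effective_absorption_db_per_m(alpha_core_db_per_m, core_um, clad_um):
    return alpha_core_db_per_m * (core_um / clad_um) ** 2

# 20 um doped core inside a 200 um pump cladding
alpha_eff = effective_absorption_db_per_m(600.0, 20, 200)
print(f"Effective absorption: {alpha_eff:.1f} dB/m")
# Absorbing ~95% of the pump light means a 13 dB drop:
print(f"Fiber length needed: {13 / alpha_eff:.1f} m")
```

<p>A hundredfold drop in absorption per meter sounds bad, but a few extra meters of coiled fiber is a cheap trade for being able to use inexpensive, low-brightness pump diodes.</p>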

<p>Meanwhile, the laser light still travels only in the small core and comes out with excellent beam quality determined by the small core, even though the pumping happens in the much larger cladding. Cladding-pumped fiber lasers are the standard architecture for high-power applications because they allow efficient use of many pump diodes to achieve very high pump powers.</p>

<p>Finally, there are ultrafast fiber lasers. By carefully designing the fiber laser cavity and adding elements that favor short pulses, such as saturable absorbers that preferentially transmit high-intensity light, or through nonlinear effects in the fiber itself, you can create fiber lasers that emit incredibly short pulses of light. These pulses can be femtoseconds in duration. A femtosecond is one quadrillionth of a second, written as 10⁻¹⁵ seconds. To put this in perspective, there are more femtoseconds in one second than there are seconds in 30 million years. Light travels only 300 nanometers in a femtosecond, about the wavelength of ultraviolet light, roughly 1/300th the thickness of a human hair.</p>

<p>These ultrafast pulses have remarkable properties. Because each pulse is so short in time, but still contains a reasonable amount of energy (microjoules to millijoules), the peak power during the pulse is enormous, megawatts to gigawatts, even though the average power might only be watts. When such a short, intense pulse hits a material, the material doesn’t have time to heat up. The energy is deposited and the pulse is gone before heat can conduct away from the absorption region. This allows for incredibly clean, precise material removal without a heat-affected zone.</p>

<p>Ultrafast fiber lasers are used for precision micromachining, cutting or engraving intricate patterns in metals, ceramics, glass, or polymers with minimal thermal damage. They’re used in medical applications for delicate surgery where you want to remove tissue without heating the surrounding area. They’re used in scientific research to study ultrafast phenomena like the dynamics of chemical reactions, electron motion in materials, or the behavior of molecules on femtosecond timescales.</p>

<h2 id="part-10-even-deeper-because-the-rabbit-hole-continues">Part 10: Even Deeper (Because the Rabbit Hole Continues)</h2>

<p>Just when you think fiber lasers represent the peak of laser technology, there are even more exotic approaches that push the boundaries of what’s possible.</p>

<h3 id="coherent-beam-combining">Coherent Beam Combining</h3>

<p>What if you want even more power than a single fiber laser can provide? There are fundamental limits to how much power you can extract from a single fiber before nonlinear optical effects distort the beam or before the intensity becomes high enough to damage the fiber itself. The solution is to combine multiple fiber lasers.</p>

<p>In coherent beam combining, you take multiple fiber lasers and carefully control their relative phases so they’re all perfectly synchronized. Then you combine their beams using special optics. If the phases are matched correctly, the beams interfere constructively when they’re combined, and the electric fields add up in amplitude rather than just in power.</p>

<p>Here’s why this is remarkable: if you have ten fiber lasers, each producing one kilowatt of power, and you just combined them incoherently (phases random), you’d get ten kilowatts total. But if you combine them perfectly coherently with all phases aligned, the electric field amplitudes add up, so the electric field is ten times larger. Since intensity is proportional to the square of the electric field, you get one hundred times the intensity, not just ten times. Ten lasers can give you the intensity of a hundred-kilowatt laser in a focused spot. To be clear, the total power is still ten kilowatts; energy is conserved. The extra factor comes from the coherently combined beam focusing to a spot ten times smaller in area, so the same power is squeezed into less space.</p>
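<p>The scaling is easy to see in a few lines. Treat each beam as a field amplitude in arbitrary units, and assume perfect phase matching:</p>

```python
# N identical beams, each with field amplitude E0, so each has
# intensity E0**2 (arbitrary units; think "1 kW per beam").
N = 10
E0 = 1.0

# Incoherent combining: intensities add.
incoherent = N * E0 ** 2        # 10x a single beam

# Coherent combining: amplitudes add first, then square for intensity.
coherent = (N * E0) ** 2        # 100x a single beam

print(incoherent, coherent)  # 10.0 100.0
```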

<p>Of course, achieving perfect coherent combining is extraordinarily difficult. You need to keep the phases of all the lasers matched to within a small fraction of a wavelength (less than a micrometer) despite vibrations, temperature changes, air currents, and other environmental disturbances. This requires sophisticated electronic feedback systems that continuously measure the relative phases using interferometry and adjust them in real time using phase modulators or by adjusting the current to the lasers.</p>

<p>Researchers have demonstrated coherent combining of dozens of fiber lasers, achieving combined output powers of many kilowatts with excellent beam quality. This is an active area of research, with potential applications in directed energy weapons, long-range power beaming, industrial materials processing, and scientific instruments requiring extremely high intensities.</p>

<p>There’s also spectral beam combining, which is a different approach that’s easier to implement but doesn’t give you the intensity enhancement. You take multiple fiber lasers operating at slightly different wavelengths (maybe spaced a few nanometers apart) and use a diffraction grating to combine them spatially. Each wavelength comes in from a slightly different angle, and the diffraction grating bends them all into the same output direction. The powers simply add up, so ten one-kilowatt lasers give you ten kilowatts total. You don’t get the coherent intensity boost, but you also don’t need to control phases, making it much easier to implement for high power levels.</p>

<h3 id="free-electron-lasers">Free Electron Lasers</h3>

<p>These represent the absolute extreme of laser technology. They don’t use atoms or molecules as the gain medium at all. Instead, the electrons themselves, moving at relativistic speeds, emit the light.</p>

<p>A free electron laser (FEL) starts with a particle accelerator, like the kinds used in physics research. You accelerate electrons to extremely high energies, approaching the speed of light, giving them enormous kinetic energy. At these speeds, relativistic effects become important: the electrons behave as though far more massive, and time dilates for them relative to the stationary laboratory.</p>

<p>These high-energy electrons are then sent through a device called an undulator. An undulator is a long series of alternating magnets creating a periodic magnetic field. The magnets are arranged so that the field alternates direction: north-south, south-north, north-south, and so on, typically with a period of a few centimeters.</p>

<p>As the relativistic electrons pass through this alternating magnetic field, the Lorentz force causes them to oscillate back and forth, wiggling in a sinusoidal path. When charged particles accelerate (and changing direction is acceleration) they emit electromagnetic radiation. That’s just classical electromagnetism, known since the 1800s.</p>

<p>So the wiggling electrons emit radiation. But here’s where it gets clever: the spacing of the magnets, the strength of the magnetic field, and the energy of the electrons are all carefully tuned so that the radiation emitted from each wiggle interferes constructively. The radiation builds up coherently along the path of the electrons as they traverse the undulator.</p>
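<p>There’s a neat formula hiding in that tuning, the undulator resonance condition: λ ≈ (λ<sub>u</sub> / 2γ²)(1 + K²/2), where λ<sub>u</sub> is the magnet period, γ is the electron’s Lorentz factor, and K is a dimensionless field-strength parameter. The γ² in the denominator is what squeezes a centimeters-long magnet period down to X-ray wavelengths. A sketch with illustrative numbers, not any specific machine:</p>

```python
# On-axis FEL resonance condition:
#   lambda = (lambda_u / (2 * gamma**2)) * (1 + K**2 / 2)
def fel_wavelength_nm(undulator_period_cm, electron_energy_gev, K=1.0):
    gamma = electron_energy_gev * 1e3 / 0.511   # E over the electron rest energy (0.511 MeV)
    lam_m = (undulator_period_cm * 1e-2) / (2 * gamma ** 2) * (1 + K ** 2 / 2)
    return lam_m * 1e9

# 3 cm magnet period, 10 GeV electrons: hard X-rays
print(f"{fel_wavelength_nm(3.0, 10.0):.3f} nm")
# Same undulator, 0.2 GeV electrons: vacuum ultraviolet instead
print(f"{fel_wavelength_nm(3.0, 0.2):.0f} nm")
```

<p>Same magnets, wildly different output: just changing the electron energy sweeps the wavelength across five orders of magnitude.</p>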

<p>Moreover, the radiation that’s been emitted can interact back with the electrons through what’s called the ponderomotive force. This causes the electrons to bunch up into microbunches separated by one wavelength of the emitted light. These bunches of electrons then radiate coherently, in phase with each other, like a phased array antenna. The radiation becomes even more intense.</p>

<p>This is stimulated emission, but from free electrons rather than bound electrons in atoms. The process is called self-amplified spontaneous emission (SASE) when the bunching happens spontaneously from noise, or a seeded free electron laser when you inject an external laser to trigger and control the bunching process.</p>

<p>Free electron lasers can produce light at almost any wavelength. By adjusting the electron energy and the undulator period, you can tune the output from microwaves to X-rays. This tunability is unique, no other laser technology can cover such a wide range.</p>

<p>The most powerful and scientifically important FELs produce X-rays, creating the brightest X-ray beams in the world by many orders of magnitude. These X-ray free electron lasers (XFELs) are used at national laboratories for cutting-edge science. They can produce X-ray pulses so intense and so brief (femtoseconds) that they can image individual molecules before the X-rays destroy them. They’re used to watch chemical reactions happen in real time, to study materials at the atomic scale, to determine the structure of proteins, and to probe matter under extreme conditions like those in the cores of planets.</p>

<p>Of course, free electron lasers are absolutely enormous. They require particle accelerators hundreds of meters to kilometers long, arrays of powerful magnets, massive power supplies, and infrastructure that costs hundreds of millions to billions of dollars. The Linac Coherent Light Source at SLAC National Accelerator Laboratory uses a 3-kilometer-long accelerator. These are not something you’ll find in a factory or hospital. But for fundamental research at the frontiers of science, they’re invaluable.</p>

<h3 id="x-ray-lasers-from-high-harmonic-generation">X-Ray Lasers from High Harmonic Generation</h3>

<p>Another way to produce X-ray laser light without building a kilometer-long accelerator is through high harmonic generation (HHG). You take an ultrafast laser with very intense pulses (peak intensities of 10¹⁴ to 10¹⁵ watts per square centimeter) and focus it into a gas of atoms.</p>

<p>The electric field of the focused laser is so strong that it’s comparable to the electric field that binds electrons to nuclei in atoms. This field distorts the atomic potential, temporarily suppressing the barrier that keeps electrons bound. An electron can tunnel out of the atom, essentially escaping through the barrier while the laser field is pulling it one way.</p>

<p>Then, half a cycle later, the laser field reverses direction and starts pulling the electron back toward the ion it came from. The electron accelerates in the laser field, gaining kinetic energy. When it crashes back into its parent ion, it recombines, and all that kinetic energy plus the ionization energy is released as a single high-energy photon.</p>

<p>Because the electron can be accelerated to high kinetic energies by the laser field before recombining, the emitted photons can have energies far exceeding the photon energy of the driving laser. If the driving laser is infrared, the emitted photons can be in the extreme ultraviolet or soft X-ray region. These photons are harmonics of the original laser frequency, odd multiples like the 11th harmonic, 13th, 15th, going up to the 100th harmonic or higher.</p>
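<p>The photon-energy ceiling of this three-step dance has a well-known semiclassical form: Eₘₐₓ = Iₚ + 3.17 Uₚ, where Iₚ is the ionization energy and Uₚ the ponderomotive (quiver) energy of the electron in the field. A quick estimate, with argon and an 800 nm driving laser as my illustrative choices (not specified in the text):</p>

```python
# Semiclassical HHG cutoff law: E_max = I_p + 3.17 * U_p
# U_p in eV for linear polarization: 9.33e-14 * I[W/cm^2] * (lambda[um])^2

def ponderomotive_energy_eV(intensity_W_cm2, wavelength_um):
    """Ponderomotive (quiver) energy of a free electron in the laser field."""
    return 9.33e-14 * intensity_W_cm2 * wavelength_um**2

I_p_argon = 15.76        # ionization energy of argon, eV (illustrative target gas)
photon_eV = 1.24 / 0.8   # an 800 nm driving photon carries ~1.55 eV

for intensity in (1e14, 5e14):   # within the article's 10^14-10^15 W/cm^2 range
    cutoff = I_p_argon + 3.17 * ponderomotive_energy_eV(intensity, 0.8)
    print(f"I = {intensity:.0e} W/cm^2: cutoff ~{cutoff:.0f} eV, "
          f"roughly harmonic {cutoff / photon_eV:.0f}")
```

<p>Pushing the intensity toward 10¹⁵ W/cm² lifts the cutoff from the extreme ultraviolet into the soft X-ray region, which is exactly the tabletop regime described here.</p>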

<p>The remarkable thing is that this process naturally generates coherent radiation because the electron dynamics are driven by the coherent laser field. The emitted high-frequency light maintains phase coherence and can be collimated into a beam.</p>

<p>This process allows you to generate coherent extreme ultraviolet and soft X-ray light using a tabletop system. It’s not as powerful or as versatile as a free electron laser, but it’s vastly cheaper and smaller. High harmonic generation sources are used in research labs around the world for time-resolved spectroscopy, studying ultrafast dynamics in materials, attosecond science where processes are studied on timescales of 10⁻¹⁸ seconds, and imaging with nanometer resolution.</p>

<h3 id="the-future">The Future</h3>

<p>Laser technology continues to advance in multiple directions. Researchers are developing quantum cascade lasers that use engineered quantum wells with artificial energy level structures designed for specific wavelengths, particularly in the mid-infrared where few other lasers operate. There are topological lasers that use concepts from topological physics to create lasers that are immune to defects and can maintain single-mode operation robustly. There are lasers using new materials like perovskites, which might enable new wavelengths and applications, or two-dimensional materials like graphene that might enable ultra-compact integrated photonic circuits.</p>

<p>Power levels continue to increase. Researchers have built lasers with peak powers exceeding a petawatt, that’s 10¹⁵ watts, more than a hundred times the entire electrical generating capacity of the world, concentrated into a femtosecond pulse. These ultra-high-intensity lasers create electric fields strong enough to accelerate particles to relativistic energies, to create matter-antimatter pairs from pure energy through quantum vacuum fluctuations, and to test fundamental physics in extreme conditions.</p>

<p>Efficiency continues to improve. Beam quality continues to improve. New wavelengths become accessible. New applications emerge in fields we haven’t even thought of yet.</p>

<p>From Einstein’s 1917 theoretical prediction to today’s fiber lasers cutting through steel and free electron lasers imaging molecules at the atomic scale, the laser has come an almost unimaginable distance in just over a century.</p>

<h2 id="conclusion-the-exponential-trick">Conclusion: The Exponential Trick</h2>

<p>Right now, at this exact moment, there are hundreds of millions of lasers operating around the world. In your devices, in factories, in hospitals, in data centers, cutting steel, correcting vision, carrying the internet through fiber optic cables. Every single one of them works because photons can clone themselves.</p>

<p>But here’s what really gets me about lasers:</p>

<p>They’re all using the exact same quantum mechanical trick (stimulated emission) but the ways we’ve figured out how to trigger it are wildly different. You can make a laser by flashing bright light at a pink crystal. Or by running electricity through a gas. Or by dissolving colorful dye molecules in liquid. Or by injecting current into a semiconductor chip smaller than a grain of rice. Or by doping rare earth ions into a thin glass fiber. They look nothing alike. They’re built from completely different materials using completely different methods. But at the quantum level, they’re all doing the same thing: convincing atoms to make synchronized copies of photons.</p>

<p>It’s like discovering that the same magic spell works whether you cast it with ruby, neon, organic molecules, or ytterbium ions. The universe doesn’t care about the implementation details. If you can create population inversion and trap the light, stimulated emission will happen. Because that’s just how quantum mechanics works.</p>

<p>And then there’s the exponential part, which is both beautiful and terrifying.</p>

<p>One photon becomes two. Two become four. Four become eight. Exponential growth is always consequential in physics. It’s what makes nuclear chain reactions possible, one neutron splits an atom, releases more neutrons, each splits more atoms. It’s what makes epidemics spread. It’s what makes compound interest powerful. Exponentials don’t mess around.</p>

<p>In a laser, you’re harnessing exponential amplification of light. You start with almost nothing (a few random photons going in the right direction) and within nanoseconds you have trillions of synchronized photons carrying kilowatts of power. That’s the same mathematical process that makes atomic bombs work, except instead of neutrons splitting atoms, you have photons cloning themselves.</p>
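<p>The doubling arithmetic is worth seeing once. A toy model (idealized: real cavities have losses, and the gain eventually saturates):</p>

```python
# One photon becomes two, two become four... count the doublings needed
# to reach the "trillions of synchronized photons" scale.

photons = 1
doublings = 0
while photons < 1e12:
    photons *= 2       # each pass through the gain medium, every photon is cloned
    doublings += 1
print(doublings, photons)   # 40 doublings take one photon past a trillion
```

<p>Forty round trips through the gain medium, each a nanosecond or less in a short cavity, and you’re at a trillion photons. That’s the exponential trick in four lines.</p>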

<p>We’ve become so good at controlling this exponential process that we can build lasers with peak powers exceeding a petawatt, more power than every power plant on Earth combined, compressed into a femtosecond. We can focus that power to create conditions that only exist in the cores of stars. We’re using it to attempt to ignite fusion reactions. The National Ignition Facility uses 192 laser beams to compress hydrogen fuel, and in 2022 it achieved fusion ignition, the first time a human-made fusion reaction released more energy than the laser energy delivered to the target.</p>

<p><img src="/assets/images/laser-deep-dive/nif-target-chamber.webp" alt="Inside the National Ignition Facility target chamber. A technician stands on a platform for scale, dwarfed by the massive spherical chamber lined with laser beam ports." />
<em>Inside the National Ignition Facility target chamber. Each of those circular ports delivers a laser beam. The technician gives you a sense of scale. Photo: Lawrence Livermore National Laboratory.</em></p>

<p>Exponential amplification of light. It sounds almost innocent until you realize what it means.</p>

<p>Einstein discovered this process in 1917 while trying to make some equations balance. Forty-three years later, someone figured out how to make it happen on purpose. And now we’re using it for everything from reading barcodes to potentially powering civilization.</p>

<p>From abstract quantum mechanics to cutting through steel to maybe, just maybe, unlimited clean energy.</p>

<p>Not bad for some synchronized photons.</p>

<hr />

<p><em>Want to go even deeper? Explore: quantum cascade lasers, optical parametric oscillators, Raman lasers, terahertz lasers, quantum dot lasers, polariton lasers, superradiant lasers, and more. The rabbit hole truly never ends.</em></p>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="Physics" /><category term="Lasers" /><category term="Quantum Mechanics" /><category term="Engineering" /><summary type="html"><![CDATA[Every laser ever built works because photons can clone themselves. A deep dive into how we get from a light bulb to burning drones out of the sky, starting with Einstein's 1917 prediction that nobody knew what to do with for 43 years.]]></summary></entry><entry><title type="html">Your Phone is a Fake Berry Bush: Why You Keep Scrolling</title><link href="https://fabianhertwig.com/blog/slot-machine-brain/" rel="alternate" type="text/html" title="Your Phone is a Fake Berry Bush: Why You Keep Scrolling" /><published>2026-01-02T17:00:00+01:00</published><updated>2026-01-02T17:00:00+01:00</updated><id>https://fabianhertwig.com/blog/slot-machine-brain</id><content type="html" xml:base="https://fabianhertwig.com/blog/slot-machine-brain/"><![CDATA[<p>I’m on the couch, watching a show I’m genuinely enjoying. Good plot, good acting, no complaints. And then I notice my phone is in my hand. I’m looking at it. I don’t remember picking it up.</p>

<p>Here’s the strange part: I don’t actually want to check anything. No message I’m waiting for, no notification that pulled me in. I just… opened it. Automatically. Like a reflex. I catch myself staring at the home screen, thumb hovering, with no idea what I intended to do.</p>

<p>It’s even weirder when you see it from outside. I’ll be talking with someone, mid-sentence, and they pull out their phone and start scrolling. They don’t seem to notice they’ve done it. I don’t think they’re being rude on purpose. It’s more like their hand just… moved.</p>

<p>I got curious about why this happens. I dug into the research, and I have good news and bad news.</p>

<p>The bad news: you’re not going to willpower your way out of this. The systems driving that behavior are ancient, powerful, and mostly invisible to conscious thought.</p>

<p>The good news: once you understand <em>why</em> your brain does this, you can stop blaming yourself and start designing around it. This isn’t a character flaw. It’s evolutionary biology colliding with teams of engineers who’ve figured out exactly how to exploit it.</p>

<p>But I also wonder: if these systems are this powerful, could they be used for good? Could I get addicted to deep work, or learning, or working out, the way I’m addicted to my phone?</p>

<hr />

<h1 id="part-1-the-exploitation-playbook">Part 1: The Exploitation Playbook</h1>

<p><img src="/assets/images/brain-slot-machines/phone-slot-machine.jpg" alt="Your phone is a slot machine" class="align-center" /></p>

<p>Open TikTok. Swipe. Meh. Swipe. Meh. Swipe. Okay, that was kind of funny. Swipe. Meh. Swipe. Meh. Swipe. Meh. Swipe. Oh wow, that’s actually amazing.</p>

<p>You just experienced the reward pattern that makes slot machines work. And it’s not an accident.</p>

<p>Consider what TikTok is actually doing. Most videos are mediocre, some are good, and occasionally one is perfect for you. You can’t predict which swipe will pay off: <strong>variable rewards</strong>. The next video is one thumb movement away, no searching, no deciding, no effort: <strong>no friction</strong>. The feed never ends, so there’s no natural place to quit: <strong>no stopping point</strong>. And that video that was <em>almost</em> funny, that post that was <em>almost</em> interesting? Each one feels like progress toward the jackpot: <strong>near-misses</strong>.</p>

<p>This isn’t just TikTok. Instagram Reels, YouTube Shorts, Twitter feeds, Tinder: same formula. Dating apps turn human connection into a swipe-based slot machine. Even browsing Netflix or YouTube’s homepage is a kind of foraging through mediocre thumbnails hoping to strike gold. The pattern is everywhere once you see it.</p>

<p>Why does it work so well? The answer starts with a psychologist, some pigeons, and casinos that figured out the formula before the science existed.</p>

<hr />

<h1 id="part-2-the-science-of-why-it-works">Part 2: The Science of Why It Works</h1>

<h2 id="skinners-pigeons">Skinner’s Pigeons</h2>

<p class="text-center"><img src="/assets/images/brain-slot-machines/Photograph-of-B-F-Skinner-Working-with-a-pigeon-in-an-early-operant-chamber-at-Indiana.png" alt="B.F. Skinner working with a pigeon in an operant chamber at Indiana University, circa 1947" class="align-center" />
<em>Photo courtesy of the B.F. Skinner Foundation</em></p>

<p>In the 1950s, B.F. Skinner was studying how animals learn. He put pigeons in boxes with a response key they could peck. Peck the key, get food.</p>

<p>As Skinner later told it, he stumbled onto something unexpected while running low on food pellets. When rewards came predictably (every peck, or every tenth peck) the pigeons behaved predictably too. They’d peck, eat, take breaks. Sensible birds.</p>

<p>But when rewards came randomly? Sometimes on the third peck, sometimes on the twentieth, sometimes twice in a row? The pigeons became obsessive, pecking with manic persistence, far more than under any predictable schedule. Skinner had discovered <strong>variable ratio reinforcement</strong>: random rewards for repeated actions.</p>
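<p>The two schedules are easy to simulate. A toy sketch (my parameters, not Skinner’s actual protocol), with the same average payout rate but completely different predictability:</p>

```python
# Fixed-ratio vs variable-ratio reinforcement: both pay out once per
# ten pecks on average, but only one of them is predictable.

import random
random.seed(0)  # reproducible illustration

def reward_intervals(schedule, n_rewards=20):
    """Pecks between successive rewards under each schedule."""
    intervals, pecks = [], 0
    while len(intervals) < n_rewards:
        pecks += 1
        hit = (pecks == 10) if schedule == "fixed" else (random.random() < 0.1)
        if hit:
            intervals.append(pecks)
            pecks = 0
    return intervals

print(reward_intervals("fixed"))     # always exactly 10: the pigeon can pace itself
print(reward_intervals("variable"))  # wildly uneven: any peck might be the one
```

<p>Under the variable schedule there is never a “safe” moment to stop, which is exactly why the pigeons pecked with manic persistence.</p>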

<p>In 1957, Skinner and his colleague C.B. Ferster published <em>Schedules of Reinforcement</em>, which formalized these findings through rigorous experimentation. But when Skinner wrote about gambling in 1953, he noted something interesting: “the efficacy of such schedules in generating high rates has long been known to the proprietors of gambling establishments.”</p>

<p>The casinos had figured it out empirically before the science caught up.</p>

<h2 id="how-casinos-engineered-addiction">How Casinos Engineered Addiction</h2>

<p>How did casino owners discover the optimal formula for addiction without running controlled experiments on pigeons? To answer that, we need to go back before slot machines existed at all.</p>

<p>In 1891, Sittman and Pitt of Brooklyn, New York built a gambling machine that used five spinning drums, each holding ten playing cards. It was essentially mechanized poker. Players would insert a nickel, pull a lever, and hope the drums landed on a good poker hand. There was no automatic payout; you’d show your result to the bartender and collect your prize (usually free drinks or cigars). The machine was wildly popular. Within a few years, nearly every bar in New York had one.</p>

<p>But notice what this machine <em>didn’t</em> have: drama. All five drums stopped at roughly the same time. You either got a good hand or you didn’t. There was no “almost.”</p>

<p>A few years later, Charles Fey, a Bavarian immigrant working as a mechanic in San Francisco, saw an opportunity. He’d been tinkering with coin-operated machines for years, and in the late 1890s he built something different: the Liberty Bell.</p>

<p>Fey’s design made three changes. First, he replaced the five poker drums with three simpler reels, each showing just five symbols: horseshoes, diamonds, spades, hearts, and liberty bells. Fewer combinations meant automatic payouts became possible.</p>

<p>Second, and more importantly, he made the reels stop sequentially. The first reel lands: cherry. The second reel lands: cherry. The third reel is still spinning… then it lands: lemon. <em>You were so close.</em></p>

<p>This delay was the innovation that mattered. Fey had accidentally built the near-miss into the machine’s physical structure. Earlier color-wheel machines revealed results all at once, so you could calculate your odds from what you saw. With three sequential reels showing just a fraction of the thousand possible combinations (10 × 10 × 10), players had no way to calculate the payout percentage. More importantly, the sequential stopping created suspense and that feeling of “almost winning” the poker machines lacked.</p>

<p>The Liberty Bell was enormously successful. Because gambling was illegal in California, Fey couldn’t patent his device, and competitors immediately began copying it. Within a decade, slot machines were everywhere.</p>

<p class="text-center"><img src="/assets/images/brain-slot-machines/Liberty_bell.jpg" alt="The original Liberty Bell slot machine, invented by Charles Fey in the 1890s" class="align-center" />
<em>Photo: Wikimedia Commons</em></p>

<p>But the real engineering came later. In the early 1980s, when computerized slots began replacing mechanical reels, designers gained precise control over probabilities. They developed a technique called “virtual reel mapping,” where the physical symbols you see spinning don’t correspond to the actual odds. A jackpot symbol might appear on the physical reel as often as any other, but in the computer’s virtual reel (which determines outcomes), it appears far less frequently.</p>

<p>More insidiously, designers began using “clustering,” deliberately placing blank stops next to jackpot symbols on the virtual reel. The result: when you lose, you frequently see the jackpot symbol <em>just above</em> or <em>just below</em> the payline. A near-miss. Your brain registers it as “almost,” even though the computer had already determined you’d lost the instant you pulled the lever.</p>
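<p>The mechanism fits in a few lines. These weights are my own illustration, not any real machine’s, but they show how a weighted virtual reel keeps jackpots rare while making near-misses routine:</p>

```python
# Virtual reel mapping, illustratively: the outcome is drawn from a
# weighted "virtual" reel, and blank stops mapped next to the jackpot
# symbol make losing spins *display* the jackpot just off the payline.

import random
random.seed(1)  # reproducible illustration

virtual_reel = (["jackpot"] * 2           # what actually pays: rare
                + ["near_miss_blank"] * 40  # loses, but shows the jackpot adjacent
                + ["cherry"] * 30
                + ["lemon"] * 28)

spins = [random.choice(virtual_reel) for _ in range(1000)]
wins = spins.count("jackpot")
near_misses = spins.count("near_miss_blank")
print(wins, near_misses)   # near-misses vastly outnumber real wins
```

<p>The spinning reels you watch are theater; the weighted draw already happened the instant you pulled the lever.</p>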

<p>This wasn’t subtle. In 1988, a case came before the Nevada Gaming Commission challenging one manufacturer’s algorithms for generating an artificially high rate of near-misses. The commission ruled that certain techniques were “unacceptable,” but notably, virtual reel mapping that creates near-misses above and below the payline remains legal to this day.</p>

<p>The casinos didn’t need Skinner’s research to design addictive machines. They had something better: direct feedback on what keeps people pulling the lever, refined over a century. What Fey stumbled onto with sequential reels, modern designers have turned into a science.</p>

<h2 id="the-near-miss-trick">The Near-Miss Trick</h2>

<p>In 2009, Luke Clark and colleagues at Cambridge put volunteers in brain scanners and had them play slot machines. When someone won, their reward circuits lit up, specifically the ventral striatum, the same region that responds to food and sex. When someone <em>almost</em> won (cherry-cherry-lemon), their reward circuits also lit up. Almost as much as for a real win.</p>

<p>Your brain treats “I almost got it!” as genuine progress. This would be sensible if slot machines were skill-based games where getting close meant you were improving. But they’re not. The outcome is determined the instant you pull the lever. Cherry-cherry-lemon tells you nothing about what the next spin will bring.</p>

<p>Yet your brain thinks: “I was so close! Next time!”</p>

<p>A 2001 study by psychologists Jeffrey Kassinove and Mitchell Schare tested near-miss rates of 15%, 30%, and 45% on participants playing a simulated slot machine. The 30% condition produced the greatest persistence, with participants continuing to play significantly longer than in either the 15% or 45% conditions. Modern slot machines are engineered with this in mind.</p>

<p>Now think about your feed. That video that was <em>almost</em> funny. That profile on Tinder that was <em>almost</em> your type. That thread that was <em>almost</em> insightful. Each near-miss keeps you swiping, because your brain registers it as progress toward the jackpot.</p>

<h2 id="schultzs-monkeys">Schultz’s Monkeys</h2>

<p>Skinner showed <em>what</em> kept animals hooked. But it took another few decades to understand <em>how</em> the brain actually processes these rewards.</p>

<p>In the 1990s, Wolfram Schultz, then at the University of Fribourg and later at Cambridge, was studying monkeys. He wanted to understand dopamine, the neurotransmitter everyone assumed was the “pleasure chemical.” The thinking: you eat something delicious, dopamine releases, you feel pleasure. Simple.</p>

<p>Schultz designed an experiment to test this. A monkey reaches into a box, finds a treat. He recorded what the dopamine neurons were doing. At first, the neurons fired when the monkey got the treat. If dopamine equals pleasure, this confirmed the theory.</p>

<p>But then he noticed something that didn’t fit. After the monkey learned that the box always contained a treat, the dopamine response <em>shifted</em>. Now the neurons fired when the monkey <em>saw</em> the box, not when it got the treat. The actual reward barely registered anymore.</p>

<p>And when the monkey expected a treat but didn’t get one? The dopamine neurons went <em>below</em> baseline, a negative signal. When it expected nothing but got a treat anyway? A huge spike.</p>

<p>Schultz realized he wasn’t looking at a pleasure signal. He was looking at a <strong>prediction error</strong> signal. In a 1997 paper that changed the field, he showed that dopamine tracks the gap between what you expected and what you got:</p>

<ul>
  <li>Better than expected → dopamine spike</li>
  <li>As expected → nothing</li>
  <li>Worse than expected → dopamine dip</li>
</ul>

<p>This explains why your tenth bite of cake is less exciting than your first, even though the taste is identical. The first bite exceeded prediction. The tenth matched it. Same cake, different signal.</p>

<p>And it explains why variable rewards are so compelling. When you can’t predict whether the next swipe will be good or bad, every swipe generates potential prediction error. Your dopamine system stays engaged, perpetually anticipating the possibility of surprise.</p>
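<p>Schultz’s findings map cleanly onto the classic Rescorla–Wagner learning rule, where the “dopamine signal” is the prediction error rather than the reward itself. A minimal sketch (standard textbook model; the learning rate is my choice):</p>

```python
# delta = reward - expected   <- the dopamine-like prediction error
# expected += alpha * delta   <- learning moves the prediction toward reality

alpha = 0.3      # learning rate
expected = 0.0   # the monkey's prediction for the box, before learning

for trial in range(30):
    reward = 1.0                 # the box always contains a treat
    delta = reward - expected    # big spike early on, shrinks as the treat is predicted
    expected += alpha * delta

print(f"expected={expected:.3f}, last delta={delta:.5f}")  # delta has collapsed toward 0

delta_omitted = 0.0 - expected   # expected a treat, got nothing
print(f"omitted treat: delta={delta_omitted:.3f}")          # ~-1.0: the dopamine dip
```

<p>The fading signal is why the fully predicted treat barely registers; modeling the shift of the spike to the sight of the box itself takes the temporal-difference extension of this same rule.</p>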

<h2 id="berridges-rats">Berridge’s Rats</h2>

<p>Around the same time Schultz was studying monkeys, Kent Berridge at the University of Michigan was studying rats and sweet tastes. Normal rats, when you give them sugar water, make a characteristic facial expression, a “yum” face that looks remarkably similar across mammals, including human babies.</p>

<p>Berridge tried something extreme: he destroyed most of the dopamine system in these rats. If dopamine was the pleasure chemical, these rats shouldn’t enjoy sugar anymore.</p>

<p>But they still liked sugar. When it touched their tongues, they made the same “yum” face. The pleasure was intact. What was gone was any motivation to seek it out. They’d walk right past a pile of sugar and starve to death. If you put sugar in their mouths, they’d happily consume it. They just wouldn’t go get it.</p>

<p>Berridge had discovered that <strong>wanting</strong> and <strong>liking</strong> are separate systems. Dopamine doesn’t make you enjoy things. It makes you <em>want</em> things, what Berridge calls “incentive salience,” the feeling that something is worth pursuing. Actual enjoyment comes from different, smaller neural circuits involving opioids.</p>

<p>This separation is what makes certain behaviors so insidious. You can want something intensely while barely enjoying it when you get it.</p>

<p><img src="/assets/images/brain-slot-machines/hand-phone-want-like.jpg" alt="Wanting vs liking: the pull feels exciting, but the payoff is often disappointing" class="align-center" /></p>

<p>Think about checking your phone. You feel a pull to check it. That’s wanting, dopamine signaling that something potentially rewarding might be there. You check. Mostly nothing interesting. You don’t particularly enjoy the experience. But a few minutes later, you feel the pull again. The wanting returns, even though the liking never showed up.</p>

<p>Or think about eating chips from a bag. You’re not savoring each chip. They’re fine. But you keep reaching for the next one, and the next one, and suddenly the bag is empty and you feel vaguely sick. The wanting drove the behavior. The liking was barely involved.</p>

<p>Berridge has described addiction as “a starved want in an unstarved brain,” the wanting mechanism running at full intensity while the liking mechanism provides no corresponding satisfaction. App designers don’t need you to enjoy their product. They just need you to keep wanting to check it.</p>

<h2 id="the-foraging-brain">The Foraging Brain</h2>

<p>Why does your brain fall for this? Because these systems weren’t designed for the modern world. They were designed for finding food when you don’t know where it is.</p>

<p>The idea that our dopamine systems are essentially foraging circuits comes from optimal foraging theory, developed by ecologists Robert MacArthur and Eric Pianka in the 1960s and extended by Eric Charnov in the 1970s. Researchers at Xerox PARC later applied this framework to human information-seeking behavior. And ethologist Niko Tinbergen’s work on “supernormal stimuli” helps explain why digital environments are so compelling: artificially exaggerated triggers hijack instincts more effectively than the natural stimuli those instincts evolved for.</p>
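<p>Optimal foraging theory even supplies the stopping rule our ancestors’ brains approximated, Charnov’s marginal value theorem: leave a patch once its marginal yield drops below the average rate the rest of the environment offers. A toy sketch (all numbers mine):</p>

```python
# Marginal value theorem logic: forage a depleting patch until its
# per-step yield falls below the environment's average rate.

def steps_until_leaving(initial_yield, decay, env_rate):
    """Number of foraging steps before a depleting patch is worth abandoning."""
    steps, y = 0, initial_yield
    while y > env_rate:
        steps += 1
        y *= decay        # the berry bush empties as you pick it
    return steps

print(steps_until_leaving(10.0, decay=0.8, env_rate=2.0))  # 8 steps, then move on
# An algorithmic feed is a patch that never depletes (decay ~ 1.0):
# the leave-now signal never fires.
```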

<p><img src="/assets/images/brain-slot-machines/savanna.jpg" alt="The foraging environment: unpredictable rewards scattered across the landscape" class="align-center" /></p>

<p>Imagine you’re a proto-human on the African savanna. Food is scattered unpredictably across the landscape. Some bushes have berries, most don’t. Some areas have tubers, but you have to dig to find out. Some paths lead to watering holes where you might catch prey, or you might waste hours finding nothing.</p>

<p>Most of your attempts fail. And they have to fail. If food were easy to find, it would already be gone. Your survival depends on persistence through endless disappointment.</p>

<p>The dopamine system evolved to solve this problem. It rewards the search, not just the find. It makes the <em>possibility</em> of finding something almost as motivating as actually finding it. Your ancestors who kept exploring (one more bush, one more trail, one more dig) occasionally hit jackpots: a carcass, a honeycomb, a patch of ripe fruit. Those who gave up too easily starved.</p>

<p>But there’s a second adaptation that’s crucial to understanding why these systems are so exploitable. If every failed attempt felt as bad as a successful find felt good, you’d be too demoralized to continue after three empty bushes. So the brain evolved an asymmetry: wins register strongly, losses barely register at all.</p>

<p>Finding food creates an intense, memorable spike. Not finding food is just… neutral. Not painful. Just nothing. You shrug it off and keep searching.</p>

<p>This asymmetry is exactly what gambling exploits. You lose ten times, and each loss barely registers. Then you win once, and the spike feels significant. Your brain does bad accounting: the one win looms larger than the ten losses. You keep playing.</p>
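<p>The bad accounting is easy to make concrete. A toy model with asymmetric subjective weights (the numbers are mine, purely illustrative):</p>

```python
# Ten losses and one win: objectively deep in the red, subjectively ahead.

WIN_FEEL, LOSS_FEEL = 10.0, -0.5    # wins spike, losses barely register
outcomes = ["loss"] * 10 + ["win"]

objective_tally = sum(+1 if o == "win" else -1 for o in outcomes)
felt_tally = sum(WIN_FEEL if o == "win" else LOSS_FEEL for o in outcomes)
print(objective_tally, felt_tally)   # -9 objectively, +5.0 as experienced
```

<p>As long as the felt tally stays positive, “keep playing” feels like the right call.</p>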

<p>These systems are ancient. Dopamine-based reward circuits aren’t unique to humans; they exist in insects, fish, even worms. When you feel the compulsive urge to check your phone, you’re fighting hundreds of millions of years of optimization.</p>

<p>But the foraging environment had built-in limits. You had to walk miles. You could only carry so much back to camp. You could only eat so much before you were full. The search eventually ended.</p>

<p>Modern technology has removed all these limits. “Rewards” are infinite. Effort is negligible, just moving your thumb. There’s no upper bound, no natural stopping point. Your brain cannot tell the difference between “I found food that will keep my family alive” and “I found an entertaining video.” Both trigger the same ancient system. Both feel like something. One matters. The other just consumed twenty minutes of your life.</p>

<hr />

<h1 id="part-3-whats-new">Part 3: What’s New</h1>

<p>Skinner’s research is from the 1950s. Schultz’s key papers came out in 1997. You might wonder: have we learned anything new? We have, and it makes the picture more concerning.</p>

<p>TikTok’s algorithm isn’t Skinner’s random reinforcement schedule, and it’s not a slot machine’s fixed probabilities. It’s something more sophisticated: the slot machine is learning you.</p>

<p>Classic variable reinforcement is random. The pigeon gets food on an unpredictable schedule, but the schedule isn’t personalized. A slot machine pays out based on fixed probabilities that apply to everyone equally.</p>

<p>TikTok’s For You page is different. It’s watching which videos make you pause, which ones you watch to the end, which ones you rewatch, which ones you skip past immediately. It’s building a model of your specific dopamine triggers and then optimizing for them. This is variable rewards calibrated to <em>your</em> prediction error system.</p>

<p>This is why people describe the For You page as eerily accurate, like the app knows them better than they know themselves. It’s not just showing you random content hoping something lands. It’s testing, learning, and refining what works for you specifically.</p>

<p>Research on TikTok has identified something called the “flow experience,” a state of absorption where you lose track of time and become fully immersed in the content stream. A 2024 study found that the key predictor of problematic use isn’t just variable rewards; it’s this trance-like concentration where minutes slip into hours without awareness. Users consistently underestimate time spent on TikTok more than on other platforms.</p>

<p>This isn’t academic speculation anymore. In October 2024, thirteen U.S. states and D.C. sued TikTok, alleging that its algorithm is “designed to promote excessive, compulsive, and addictive use” in children. Whether or not the case succeeds, the fact that it exists signals something: the mechanisms we’ve been discussing have moved from psychology journals into courtrooms.</p>

<hr />

<h1 id="part-4-breaking-free">Part 4: Breaking Free</h1>

<p>So what do you do with this knowledge?</p>

<p>The first thing to understand: willpower is not the answer. You’re not going to think your way out of systems that operate below conscious awareness and have been optimized for hundreds of millions of years.</p>

<p>But you can change the environment.</p>

<h2 id="understanding-your-triggers">Understanding Your Triggers</h2>

<p>Cognitive Behavioral Therapy (CBT) helps people identify the thoughts and situations that trigger problematic behaviors. Rather than suppressing urges through willpower, CBT focuses on noticing patterns: what happens right before the behavior? What internal state are you in? What are you trying to avoid or achieve?</p>

<p>In trials with smartphone-addicted adolescents, 12-week CBT programs significantly reduced addiction scores. The most helpful module, according to participants, was called “Recognize the Triggers,” which taught people to notice <em>why</em> they were reaching for their phone in the first place.</p>

<p>The insight is that phone checking isn’t random. It’s triggered by specific internal states: boredom, loneliness, anxiety, the desire to escape an uncomfortable task, the need for stimulation. If you can notice <em>why</em> you’re reaching for the phone, you create a moment of choice that wasn’t there before.</p>

<p>Some questions to ask yourself in that moment: Am I feeling bored? What am I avoiding? Am I feeling lonely? What connection am I actually seeking? Am I feeling anxious? What would actually address the anxiety? And perhaps most useful, given what we know about wanting versus liking: will I actually enjoy this, or am I just responding to a pull?</p>

<p>That last question is worth developing into a practice. Before you check your phone, predict: “Will I feel better after 10 minutes of scrolling?” Then afterward, notice whether your prediction was accurate. Building this awareness helps reveal the gap between the pull and the payoff, the wanting that persists even when liking never shows up.</p>

<h2 id="environmental-design">Environmental Design</h2>

<p>The more effective approach is changing the environment so the behavior becomes harder in the first place. You won’t out-think systems optimized over hundreds of millions of years. But you can redesign choice architecture so checking your phone requires more effort than not checking it.</p>

<p>The simplest intervention: turn off notifications. Every notification is a trigger; every buzz is your phone asking for attention. Most apps default to aggressive notification settings because their metric is engagement, not your wellbeing. Turn off notifications for everything except calls and messages from actual humans, and you’ve eliminated hundreds of daily triggers.</p>

<p>The next step is removing apps from your home screen. If you have to search for an app to open it, you’ve added a few seconds of friction. That small delay creates a moment where you can ask “Do I actually want to do this?” Having Instagram’s icon staring at you every time you unlock your phone is a cue that triggers wanting. Remove the cue and you remove the trigger.</p>

<p>For deeper focus, put your phone in another room entirely. A 2017 study from UT Austin found that participants with their phones in another room significantly outperformed those with phones on the desk, even when the phones were face-down and silent. Just having the phone nearby seemed to occupy cognitive resources. Part of your brain was dedicated to <em>not</em> checking it, and that effort cost something. (A 2024 meta-analysis of 33 studies found smaller effects than the original study, suggesting this may vary by individual, but even a small effect compounds over time.)</p>

<p>A more aggressive intervention is grayscale mode. A 2024 study by Dekker and Baumgartner found that participants used their phones about 20 minutes less per day with grayscale displays. The mechanism is simple: colorful visuals trigger dopamine responses. App icons and notification badges use bright, saturated colors specifically because those colors grab attention. Remove the colors, and the phone becomes less visually compelling. Some users find grayscale hard to maintain long-term, but even using it intermittently (say, after 9pm) can help.</p>

<p>Other friction techniques include logging out of apps after each use (having to re-enter a password creates a pause that’s often enough to break the automatic behavior) and using browser versions instead of apps (the mobile web version of Twitter or Instagram is clunkier, slower, and less optimized for addiction, which is a feature, not a bug).</p>

<p>For some people, friction isn’t enough. Deleting the apps entirely is the only thing that works. If you find yourself reinstalling apps you’ve deleted, that’s useful information about how strong the pull is.</p>

<h2 id="making-losses-visible">Making Losses Visible</h2>

<p>Remember the foraging asymmetry: wins register strongly, losses fade. One way to counter this is to make the losses visible.</p>

<p>Screen time tracking does this automatically. Seeing “4 hours 23 minutes on TikTok” at the end of the day is information your brain would otherwise ignore. You might not remember the scrolling (each mediocre video faded from memory as soon as it passed) but you’ll notice the number.</p>

<p>Some people find journaling useful: after a scrolling session, write down what you actually got from it. Often the answer is “nothing” or “I feel worse.” Recording this counters the brain’s tendency to remember the occasional good video and forget the hundred forgettable ones.</p>

<hr />

<h1 id="part-5-using-it-for-good">Part 5: Using It For Good</h1>

<p>If these mechanisms are so powerful, can they be used for good? Can you get addicted to working out, or to learning, or to doing deep work? Imagine if school were as compelling as TikTok. Imagine if you felt the same pull toward your workout that you feel toward your phone.</p>

<p>People have tried. The results are instructive.</p>

<h2 id="what-duolingo-got-wrong">What Duolingo Got Wrong</h2>

<p>Duolingo is the most famous attempt at making learning addictive. If you haven’t used it: the app teaches languages through short exercises, mostly translation. Complete a lesson and earn XP. The amount varies unpredictably: bonus XP for a “perfect lesson” or daily challenges. Leaderboards let you compete against strangers for weekly rankings, and streaks track consecutive days practiced. Miss a day and the streak resets.</p>

<p>The notifications are remarkably persistent. Duo, the green owl mascot, sends messages like “These reminders don’t seem to be working. We’ll stop sending them for now.” (guilt trip) or “You made Duo sad” with an image of the owl looking dejected. According to Duolingo’s own reports, passive-aggressive notifications outperform friendly ones.</p>

<p>And it works, for engagement. Duolingo’s 2024 investor reports show daily active users at 34% of monthly active users, with over 10 million maintaining year-long streaks.</p>

<p>But many people use Duolingo for years and still can’t hold a conversation. Applied linguist Matt Kessler has studied this gap: Duolingo is effective for <em>receptive</em> skills (reading, listening, vocabulary) but users consistently struggle with <em>productive</em> skills like speaking and writing. One user described arriving in Sweden after hundreds of hours on Duolingo, able to read magazine articles but unable to order a coffee.</p>

<p>The problem isn’t just gamification. It’s that Duolingo’s core learning method (translation between languages) trains you to <em>convert</em> rather than to <em>think</em> in the new language. And because streaks reward showing up rather than improving, users optimize for the wrong thing. Someone with a 500-day streak might spend their daily five minutes on an easy lesson they’ve already mastered, just to preserve the streak, when watching a three-minute video in the target language would be more useful.</p>

<p>Duolingo made people addicted to <em>using the app</em>, not to <em>learning</em>. The engagement and the outcome became decoupled.</p>

<h2 id="what-about-tiktok-for-learning">What About “TikTok for Learning”?</h2>

<p>You’ve probably seen ads for apps like Headway, Imprint, or Blinkist: “TikTok for books” or “TikTok for smart people.” Headway offers swipeable book summaries with streaks and gamified challenges. Imprint presents ideas through tap-through visual slides. Users report replacing doomscrolling with these apps.</p>

<p>But notice what they’re doing: they’ve applied TikTok-style engagement to <em>consumption</em>, not <em>skill-building</em>. You’re not learning to speak Spanish or play piano. You’re consuming summaries of productivity books. The engagement mechanics work, but the output is passive absorption, not active skill. It’s watching workout videos instead of working out.</p>

<p>Nobody has applied this level of design sophistication to the hard problem: making skill <em>acquisition</em> feel as compelling as scrolling.</p>

<h2 id="the-gamification-trap">The Gamification Trap</h2>

<p>Could you just add variable rewards to beneficial tasks? Imagine an app where you complete small work tasks, and sometimes you get a reward and sometimes you don’t, unpredictably. Would that drive motivation?</p>

<p>A 1999 meta-analysis by Edward Deci and colleagues examined 128 studies on this question. The finding was counterintuitive: when people expect rewards for an activity, their intrinsic motivation for that activity <em>decreases</em>. The effect size was substantial (d = -0.40 for performance-contingent rewards). This is the “overjustification effect,” replicated many times.</p>

<p>The mechanism is attribution. When you’re rewarded for doing something, you start to attribute your behavior to the reward rather than to genuine interest. “I’m doing this because I get points,” not “I’m doing this because I find it interesting.” When the rewards stop, motivation drops <em>below</em> where it started. The reward didn’t just fail to help; it actively undermined the original motivation.</p>

<p>This explains why naive gamification often produces a burst of engagement followed by a fade. Points and badges create short-term excitement, but they’re training you to care about the points, not about the activity. Once the novelty wears off, you’re left with less intrinsic motivation than you had before.</p>

<p>The exceptions are revealing. <em>Unexpected</em> rewards don’t undermine intrinsic motivation, because you can’t attribute your prior behavior to a reward you didn’t know was coming. And <em>informational</em> feedback (“you did really well on that”) can actually enhance motivation, because it provides genuine evidence of competence rather than external control.</p>

<p>This suggests variable rewards could work, but only if they’re genuinely unexpected and the tasks themselves build real skills. The moment people start expecting the rewards, the mechanism shifts from enhancement to undermining.</p>

<h2 id="the-flow-alternative">The Flow Alternative</h2>

<p>Maybe the goal shouldn’t be to make learning feel like TikTok. Maybe TikTok is the wrong model entirely.</p>

<p>Addiction implies compulsion despite harm or lack of benefit. You keep swiping even though you’re not enjoying it, even though you have other things to do, even though you’ll feel worse afterward. Wanting without liking.</p>

<p>But people who genuinely love learning or working out describe something different. They’re not compelled despite lack of benefit; they find the activity itself rewarding. Wanting and liking are aligned. They might be deeply engaged, but they’re not trapped.</p>

<p>Psychologist Mihaly Csikszentmihalyi spent decades studying this state, which he called “flow.” It occurs when three conditions are met: clear goals with immediate feedback, challenge matched to skill (hard enough to stretch you, not so hard you’re overwhelmed), and deep concentration. In flow, people lose track of time and feel intrinsically motivated to continue. It’s absorbing in a way that feels good, not compulsive.</p>

<p class="text-center"><img src="/assets/images/brain-slot-machines/Challenge_vs_skill.svg" alt="Csikszentmihalyi's flow model: the sweet spot between boredom and anxiety" class="align-center" />
<em>Diagram: Wikimedia Commons</em></p>

<p>The design principles for flow are different from the design principles for addiction.</p>

<p>First, challenge has to match skill. Too easy and you’re bored; too hard and you’re anxious. The sweet spot is where you’re stretched but capable. This is why personalized difficulty adjustment matters, and why one-size-fits-all content often fails.</p>

<p>Second, feedback needs to be immediate and informational: not “you earned 10 points” but “you got that right” or “here’s what you missed.” The feedback should tell you about your actual performance, not your standing in a game.</p>

<p>Third, real progress has to be visible. Anki does this well: you can see yourself remembering things you used to forget. Couch to 5K does this too: you can run distances that used to be impossible. The reward is the actual improvement, made salient.</p>

<p>Fourth, effective systems build toward production, not just consumption. Duolingo’s weakness is that it trains reception (reading, listening) but not production (speaking, writing). Real skill requires doing the thing, not just recognizing it.</p>

<p>Finally, based on the overjustification research, expected tangible rewards will undermine intrinsic motivation over time. If rewards are used at all, they should be unexpected, or replaced entirely by informational feedback about genuine progress.</p>

<h2 id="what-about-work-thats-just-work">What About Work That’s Just… Work?</h2>

<p>Flow assumes the activity can become intrinsically rewarding. But some work is genuinely tedious: IT support, data entry, answering the same questions repeatedly. You’re not going to achieve flow while resetting someone’s password for the hundredth time. Can anything from engagement science help?</p>

<p>Probably yes, but not through naive gamification. Adding points and leaderboards to IT support tickets would likely produce the Duolingo pattern: short-term boost, then fade, leaving workers <em>less</em> motivated than before.</p>

<p>What might actually help borrows from TikTok without the extractive parts.</p>

<p>One approach is introducing variability in the task itself, not just rewards. TikTok keeps you engaged partly through unpredictability. For repetitive work, mixing task types or varying the order could break monotony. An IT support queue that surfaces an interesting edge case between routine password resets gives you something to look forward to.</p>

<p>Another is reframing impact. “You closed 47 tickets” is a metric. “You helped 47 people get back to their work today” is a story. Research on meaningful work suggests that connecting tasks to their human consequences increases motivation, even for routine work.</p>

<p>Recognition helps too, but it has to be unexpected, not scheduled. A manager who occasionally notices good work, without a predictable pattern, provides unexpected positive feedback that enhances rather than undermines motivation. The key is that it’s genuinely informational (“that was a really clear explanation”) rather than controlling (“you earned 10 recognition points”).</p>

<p>Autonomy matters even when the task is fixed. Control over <em>how</em> you approach it increases engagement. Letting IT support staff develop their own scripts, templates, or workflows gives ownership over process even when they can’t control the work itself.</p>

<p>And finally, there’s progress toward mastery, even in routine work. Is your average resolution time dropping? Are users rating your explanations more highly? Making this visible transforms “doing the same thing over and over” into “getting better at something.”</p>

<p>None of this makes tedious work feel like TikTok. But it might make it less soul-crushing, and unlike naive gamification, these approaches are less likely to backfire.</p>

<h2 id="the-open-question">The Open Question</h2>

<p>Could these principles produce something as compelling as TikTok for genuinely beneficial activities? We don’t fully know, because nobody has tried with TikTok-level resources. The apps that work best (Anki, Couch to 5K) are relatively simple. The apps with TikTok-level engagement (Headway, Imprint) are optimized for consumption, not skill-building.</p>

<p>There may be a fundamental asymmetry. Extractive apps can make the reward arbitrarily easy: just keep swiping, and sometimes something good appears. Skill-building requires actual effort, and there may be no way around that. Flow is deeply rewarding, but you have to earn it.</p>

<p>Still, the gap between what exists and what’s possible feels large. What would a TikTok-level engineering effort look like if aimed at genuine skill acquisition? Personalized challenge-skill matching, immediate informational feedback, visible real progress, variability without expected rewards?</p>

<p>We might not get “addicted to working” in the slot-machine sense. But we might get something better: work that feels meaningful and produces visible results, in ways that make you want to keep doing it. Not compulsion despite lack of benefit, but genuine engagement with genuine payoff.</p>

<hr />

<h1 id="the-lever">The Lever</h1>

<p>The systems are ancient. The exploitation is modern. And now you know the difference.</p>

<p>You can’t rewire your dopamine system. Hundreds of millions of years of evolution aren’t going to yield to good intentions. But you can design your environment so the easy behaviors are the good ones and the extractive ones require effort. You can notice the gap between wanting and liking, between the pull and the payoff.</p>

<p>Can the same mechanisms be redirected toward things that actually benefit you? Not simply. Naive gamification (points, badges, streaks) tends to undermine the intrinsic motivation it’s trying to enhance. Duolingo can make you addicted to <em>using Duolingo</em> without making you fluent in Spanish.</p>

<p>But there’s a different state worth aiming for. Flow isn’t addiction. In flow, wanting and liking align: you’re deeply engaged <em>and</em> genuinely enjoying the activity <em>and</em> making real progress. The conditions for flow (challenge matched to skill, immediate feedback, clear goals) are different from slot-machine engagement. Harder to engineer, but they don’t leave you empty afterward.</p>

<p>The question isn’t whether to engage your brain’s reward systems. They’re engaged whether you choose it or not.</p>

<p>The question is who’s holding the lever, and whether what it’s pointing at is worth wanting.</p>

<hr />

<h1 id="sources-and-further-reading">Sources and Further Reading</h1>

<h2 id="foundational-neuroscience">Foundational Neuroscience</h2>
<ul>
  <li>Schultz, W., Dayan, P., &amp; Montague, P.R. (1997). “<a href="https://www.science.org/doi/10.1126/science.275.5306.1593">A neural substrate of prediction and reward</a>.” Science.</li>
  <li>Schultz, W. (1998). “<a href="https://journals.physiology.org/doi/full/10.1152/jn.1998.80.1.1">Predictive reward signal of dopamine neurons</a>.” Journal of Neurophysiology.</li>
  <li>Berridge, K.C. &amp; Robinson, T.E. (2016). “<a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC5171207/">Liking, wanting, and the incentive-sensitization theory of addiction</a>.” American Psychologist.</li>
</ul>

<h2 id="on-variable-reinforcement-and-gambling">On Variable Reinforcement and Gambling</h2>
<ul>
  <li>Ferster, C.B. &amp; Skinner, B.F. (1957). <em>Schedules of Reinforcement</em>. Appleton-Century-Crofts.</li>
  <li>Clark, L. et al. (2009). “<a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC2658737/">Gambling near-misses enhance motivation to gamble and recruit win-related brain circuitry</a>.” Neuron.</li>
  <li>Harrigan, K.A. (2008). “<a href="https://link.springer.com/article/10.1007/s11469-007-9066-8">Slot Machine Structural Characteristics: Creating Near Misses Using High Award Symbol Ratios</a>.” International Journal of Mental Health and Addiction.</li>
</ul>

<h2 id="on-smartphone-and-social-media-effects">On Smartphone and Social Media Effects</h2>
<ul>
  <li>Ward, A.F. et al. (2017). “<a href="https://www.journals.uchicago.edu/doi/full/10.1086/691462">Brain Drain: The Mere Presence of One’s Own Smartphone Reduces Available Cognitive Capacity</a>.” Journal of the Association for Consumer Research.</li>
  <li>Dekker, C.A. &amp; Baumgartner, S.E. (2024). “<a href="https://journals.sagepub.com/doi/10.1177/20501579231212062">Is life brighter when your phone is not? The efficacy of a grayscale smartphone intervention</a>.” Mobile Media &amp; Communication.</li>
</ul>

<h2 id="on-tiktok-and-modern-platforms">On TikTok and Modern Platforms</h2>
<ul>
  <li>“<a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC11710882/">Does TikTok Addiction exist? A qualitative study</a>.” PMC (2024).</li>
  <li>Brown University. “<a href="https://sites.brown.edu/publichealthjournal/2021/12/13/tiktok/">What Makes TikTok so Addictive?</a>”</li>
</ul>

<h2 id="on-treatment-and-behavior-change">On Treatment and Behavior Change</h2>
<ul>
  <li>“<a href="https://www.jmir.org/2025/1/e59656">Interventions for Digital Addiction: Umbrella Review of Meta-Analyses</a>.” Journal of Medical Internet Research (2025).</li>
</ul>

<h2 id="ethical-design">Ethical Design</h2>
<ul>
  <li>Center for Humane Technology. “<a href="https://www.humanetech.com/">humanetech.com</a>”</li>
</ul>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="Psychology" /><category term="Neuroscience" /><category term="Technology" /><category term="Addiction" /><category term="Social Media" /><summary type="html"><![CDATA[The neuroscience behind phone addiction: how Skinner's pigeons, dopamine prediction errors, and the wanting-vs-liking distinction explain why you can't stop scrolling.]]></summary></entry><entry><title type="html">Code Surgery: How AI Assistants Make Precise Edits to Your Files</title><link href="https://fabianhertwig.com/blog/coding-assistants-file-edits/" rel="alternate" type="text/html" title="Code Surgery: How AI Assistants Make Precise Edits to Your Files" /><published>2025-04-26T08:00:00+02:00</published><updated>2025-04-26T08:00:00+02:00</updated><id>https://fabianhertwig.com/blog/coding-assistants-file-edits</id><content type="html" xml:base="https://fabianhertwig.com/blog/coding-assistants-file-edits/"><![CDATA[<p>Applying code changes generated by AI assistants directly to files is a core capability, yet it often proves surprisingly difficult. An assistant might propose a valid code modification, but fail when attempting to integrate it, reporting errors like “Cannot find matching context” and requiring manual intervention.</p>

<p>Many developers using AI coding assistants encounter this. While the AI understands the code’s intent, translating that understanding into precise, automated file modifications presents significant technical challenges.</p>

<h3 id="why-precise-file-editing-matters">Why Precise File Editing Matters</h3>

<p>Effective file editing is central to the value proposition of coding assistants. If these tools cannot reliably modify code files, their utility diminishes, reducing them to suggestion engines that require manual implementation. An assistant capable of dependable automated editing saves developers considerable time and cognitive load compared to one prone to failures.</p>

<p>The fundamental challenge lies in the indirect nature of the operation: Large Language Models (LLMs) lack direct file system access. They must describe intended changes via specialized tools or APIs, which then interpret these instructions and attempt execution. <strong>This handoff between the LLM’s representation and the file system state is a frequent source of complications.</strong></p>

<p>Users of tools like GitHub Copilot, Aider, or RooCode may have observed these struggles: edits failing to locate the correct insertion point, incorrect indentation, or the tool ultimately requesting manual application.</p>

<h3 id="what-you-will-learn">What You Will Learn</h3>

<p>This post examines the file editing mechanisms of several coding assistant systems: Codex, Aider, OpenHands, RooCode, and Cursor. For the open-source systems (Codex, Aider, OpenHands, RooCode), the insights presented here are derived from analyzing their respective codebases. For Cursor, which is closed-source, the insights come from public discussions and interviews with their team. We will explore their approaches, presented roughly in order of increasing complexity, while noting that their development involved parallel evolution and mutual influence.</p>

<p>For each system, we will analyze:</p>

<ol>
  <li>How it receives edit instructions from the AI.</li>
  <li>How it interprets and processes these instructions.</li>
  <li>How it applies the changes to files.</li>
  <li>How it handles errors and edge cases.</li>
  <li>How it provides feedback on the outcome.</li>
</ol>

<p>Understanding these mechanisms provides insight into the difficulties of automated code editing and the increasingly sophisticated solutions different systems employ.</p>

<h2 id="key-concepts-in-ai-code-editing">Key Concepts in AI Code Editing</h2>

<p>Before proceeding, let’s define some terms frequently used in this domain:</p>

<ul>
  <li><strong>Patch</strong>: A formal specification of changes (additions, deletions) for a file, often including metadata like file paths and context for application.</li>
  <li><strong>Diff</strong>: A format highlighting line-by-line differences between text versions, typically using <code class="language-plaintext highlighter-rouge">+</code> and <code class="language-plaintext highlighter-rouge">-</code> indicators, focusing on content changes.</li>
  <li><strong>Search/Replace Block</strong>: An editing instruction format using delimiters (e.g., <code class="language-plaintext highlighter-rouge">&lt;&lt;&lt;&lt;&lt;&lt;&lt; SEARCH</code>, <code class="language-plaintext highlighter-rouge">&gt;&gt;&gt;&gt;&gt;&gt;&gt; REPLACE</code>) to explicitly define text to find and its replacement.</li>
  <li><strong>Context Lines</strong>: Unmodified lines included in patches or diffs surrounding changes, used to accurately locate the modification point.</li>
  <li><strong>Hunk</strong>: A contiguous block of changes within a patch or diff, comprising context lines and modifications.</li>
  <li><strong>Fuzzy Matching</strong>: Algorithms (e.g., using Levenshtein distance) to find approximate matches for text strings, handling minor variations.</li>
  <li><strong>Indentation Preservation</strong>: Maintaining consistent whitespace prefixes (spaces, tabs) during file edits, critical for syntax and readability.</li>
  <li><strong>Fence</strong>: Delimiters (e.g., triple backticks ```) clearly marking the boundaries of code blocks in text or instructions.</li>
</ul>
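<p>To make the “fuzzy matching” entry concrete, here is a minimal sketch using Python’s standard <code>difflib</code>. The 0.9 threshold is an arbitrary illustration, not a value any particular tool uses:</p>

```python
from difflib import SequenceMatcher

def fuzzy_equal(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two lines as matching if their similarity ratio clears the
    threshold, so minor variations (trailing whitespace, a one-character
    edit) still count as a match."""
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(fuzzy_equal("  return None", "  return None  "))    # near-identical: True
print(fuzzy_equal("  return None", "raise ValueError()")) # different code: False
```

Real implementations often use Levenshtein distance instead of <code>difflib</code>’s ratio, but the idea is the same: tolerate small drift between the LLM’s view of a line and its actual content.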

<h2 id="the-file-editing-workflow">The File Editing Workflow</h2>

<p>Most AI code editing systems follow a general workflow:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LLM (generates change description) → Tool (interprets &amp; applies) → File System (state change) → Feedback (Tool reports outcome) → LLM (processes feedback)
</code></pre></div></div>
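<p>As a rough sketch of this cycle (the callables and the <code>Result</code> type are hypothetical stand-ins, not any real assistant’s API), the retry-with-feedback loop might look like:</p>

```python
from dataclasses import dataclass

@dataclass
class Result:
    ok: bool
    error: str = ""

def edit_loop(generate_patch, apply_patch, max_retries: int = 3) -> bool:
    """Generate a change description, try to apply it, and feed any
    error back to the LLM so it can produce a corrected patch."""
    patch = generate_patch(feedback=None)       # LLM describes the change
    for _ in range(max_retries):
        result = apply_patch(patch)             # tool interprets and applies it
        if result.ok:
            return True                         # file system state updated
        patch = generate_patch(feedback=result.error)  # retry with feedback
    return False
```

The quality of that <code>error</code> string matters enormously: a vague “patch failed” leaves the LLM guessing, while a precise description of the mismatch lets it correct course.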

<p>While straightforward conceptually, several challenges arise in practice:</p>

<h3 id="challenge-1-locating-the-edit-target">Challenge 1: Locating the Edit Target</h3>

<p>The LLM often operates on a potentially outdated or incomplete view of the target file. Accurately finding the intended edit location becomes difficult when:</p>

<ul>
  <li>The file has been modified since the LLM last accessed it.</li>
  <li>The file contains multiple similar code sections.</li>
  <li>The file exceeds the LLM’s context window capacity.</li>
</ul>

<p>Context mismatches are common when the file state diverges. Robust systems provide detailed error feedback, enabling the LLM to adapt. Some may provide the current file state upon error.</p>

<h3 id="challenge-2-handling-multi-file-changes">Challenge 2: Handling Multi-File Changes</h3>

<p>Code modifications frequently span multiple files, introducing complexities:</p>

<ul>
  <li>Ensuring consistency across related edits.</li>
  <li>Managing dependencies between files.</li>
  <li>Applying changes in the correct sequence.</li>
</ul>

<p>Most systems address this by processing edits sequentially, file by file.</p>

<h3 id="challenge-3-maintaining-code-style">Challenge 3: Maintaining Code Style</h3>

<p>Developers require adherence to specific formatting conventions. Automated edits must preserve:</p>

<ul>
  <li>Indentation style (tabs vs. spaces, width).</li>
  <li>Line ending conventions.</li>
  <li>Comment formatting.</li>
  <li>Consistent spacing patterns.</li>
</ul>

<h3 id="challenge-4-managing-failures">Challenge 4: Managing Failures</h3>

<p>A robust editing system should handle failures gracefully:</p>

<ul>
  <li><strong>Provide clear explanations</strong> for the failure.</li>
  <li><strong>Offer diagnostic information</strong> to aid correction.</li>
  <li>Potentially <strong>attempt alternative strategies</strong> upon initial failure.</li>
</ul>

<h3 id="common-edit-description-formats">Common Edit Description Formats</h3>

<p>AI systems use various formats to communicate intended changes:</p>

<ol>
  <li><strong>Patches</strong>: Detailed add/delete instructions, often based on standard patch formats.</li>
  <li><strong>Diffs</strong>: Showing differences between original and desired states.</li>
  <li><strong>Search/Replace Blocks</strong>: Explicitly defining find/replace operations.</li>
  <li><strong>Line Operations</strong>: Specifying edits by line number (less common due to fragility).</li>
  <li><strong>AI-Assisted Application</strong>: Employing a secondary AI model specifically for applying complex changes.</li>
</ol>

<p>Let’s examine how specific systems implement these concepts.</p>

<h2 id="codex-a-straightforward-patch-based-system">Codex: A Straightforward Patch-Based System</h2>

<p>OpenAI’s Codex CLI utilizes a relatively simple, structured patch format. Its effectiveness stems partly from OpenAI’s ability to train its models specifically to generate this format reliably.</p>

<h3 id="the-codex-patch-format">The Codex Patch Format</h3>

<p>The LLM communicates changes using this structure:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*** Begin Patch
*** [Operation] File: [filepath]
@@ [text matching a line near the change]
  [context line (unchanged, starts with space)]
- [line to remove (starts with -)]
+ [line to add (starts with +)]
  [another context line]
*** End Patch
</code></pre></div></div>

<p>Key features:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Operation</code>: <code class="language-plaintext highlighter-rouge">Add File</code>, <code class="language-plaintext highlighter-rouge">Update File</code>, or <code class="language-plaintext highlighter-rouge">Delete File</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">@@</code>: Followed by text content from a line near the edit (e.g., function/class definition) used for locating the change. <strong>Crucially, this avoids direct reliance on line numbers.</strong></li>
  <li>Context lines (start with space ` `): Must match existing file content and remain unchanged; used for precise anchoring.</li>
  <li><code class="language-plaintext highlighter-rouge">-</code> prefixed lines: Marked for deletion.</li>
  <li><code class="language-plaintext highlighter-rouge">+</code> prefixed lines: Marked for addition.</li>
</ul>

<p>Consider this example modifying a print statement:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*** Begin Patch
*** Update File: main.py
@@ def main():
   # This is the main function
-  print("hello")
+  print("hello world!")
   return None
*** End Patch
</code></pre></div></div>

<p>Here, <code class="language-plaintext highlighter-rouge">@@ def main():</code> helps locate the function, while the space-prefixed context lines (<code class="language-plaintext highlighter-rouge"># This is...</code>, <code class="language-plaintext highlighter-rouge">return None</code>) pinpoint the exact edit location.</p>

<p>The system attempts to match the <code class="language-plaintext highlighter-rouge">@@</code> line and context lines exactly. If this fails, it employs fallback strategies: first matching with trimmed line endings, then matching with all whitespace trimmed. This flexibility accommodates minor discrepancies between the LLM’s view and the actual file. A single patch can contain multiple <code class="language-plaintext highlighter-rouge">@@</code> sections to target different parts of a file.</p>

<h3 id="patch-parsing-and-application">Patch Parsing and Application</h3>

<p>Upon receiving a patch via the <code class="language-plaintext highlighter-rouge">apply_patch</code> tool, the system performs these steps:</p>

<ol>
  <li>Validates the basic patch structure (<code class="language-plaintext highlighter-rouge">*** Begin Patch</code> / <code class="language-plaintext highlighter-rouge">*** End Patch</code>).</li>
  <li>Identifies the target file(s).</li>
  <li>Loads the current content of the target file(s).</li>
  <li>Parses the patch into discrete operations (create, update sections, delete).</li>
  <li>Attempts to apply the changes to the loaded file content.</li>
</ol>

<h3 id="fuzzy-matching-for-robustness">Fuzzy Matching for Robustness</h3>

<p>The progressive matching strategy for context lines enhances robustness:</p>

<ol>
  <li>Attempt exact match.</li>
  <li>If failed, attempt match ignoring line endings.</li>
  <li>If failed, attempt match ignoring all whitespace.</li>
</ol>

<p>This helps overcome small variations between the expected and actual file content.</p>
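<p>A minimal sketch of that three-pass fallback (function name hypothetical; the real implementation is more involved):</p>

```python
def match_context(file_lines, context, start=0):
    """Progressive context matching: exact, then trailing-whitespace
    insensitive, then fully whitespace-insensitive.
    Returns the match index, or -1 if no pass succeeds."""
    normalizers = [
        lambda s: s,                   # pass 1: exact
        lambda s: s.rstrip(),          # pass 2: ignore line endings
        lambda s: "".join(s.split()),  # pass 3: ignore all whitespace
    ]
    for norm in normalizers:
        needle = [norm(l) for l in context]
        for i in range(start, len(file_lines) - len(context) + 1):
            if [norm(l) for l in file_lines[i:i + len(context)]] == needle:
                return i
    return -1
```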

<h3 id="error-handling-and-feedback-mechanisms">Error Handling and Feedback Mechanisms</h3>

<p>Codex provides structured JSON feedback upon failure, aiding the LLM’s correction attempts:</p>

<p><strong>Context Line Mismatch:</strong></p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"exit_code"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
  </span><span class="nl">"stderr"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Error: context line 3, </span><span class="se">\"</span><span class="s2">  const response = await fetch(`/api/users/${userId}`);</span><span class="se">\"</span><span class="s2"> does not match </span><span class="se">\"</span><span class="s2">  const response = await fetch(`/api/users/${userId}`, { headers });</span><span class="se">\"</span><span class="s2">"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>(Clearly indicates the mismatch and the differing lines.)</p>

<p><strong>File Not Found:</strong></p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"exit_code"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
  </span><span class="nl">"stderr"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Error: File not found: src/api.js"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>(Unambiguous file access error.)</p>

<p><strong>Invalid Patch Format:</strong></p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"exit_code"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
  </span><span class="nl">"stderr"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Error: Patch must start with *** Begin Patch"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>(Specific format violation feedback.)</p>

<p>This detailed, structured feedback allows the LLM to diagnose the failure and potentially retry with a corrected approach (e.g., re-fetching the file or adjusting the patch context).</p>

<h3 id="example-real-world-edit-scenario">Example: Real-World Edit Scenario</h3>

<p>Let’s examine how Codex handles a real-world edit scenario. Imagine we need to add error handling to a function that fetches user data:</p>

<p><strong>Original code in src/api.js</strong>:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="kd">function</span> <span class="nx">fetchUserData</span><span class="p">(</span><span class="nx">userId</span><span class="p">)</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="s2">`/api/users/</span><span class="p">${</span><span class="nx">userId</span><span class="p">}</span><span class="s2">`</span><span class="p">);</span>
  <span class="kd">const</span> <span class="nx">data</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">response</span><span class="p">.</span><span class="nx">json</span><span class="p">();</span>
  <span class="k">return</span> <span class="nx">data</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p><strong>LLM generates a patch</strong>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*** Begin Patch
*** Update File: src/api.js
@@ async function fetchUserData(userId) {
-  const response = await fetch(`/api/users/${userId}`);
-  const data = await response.json();
-  return data;
+  try {
+    const response = await fetch(`/api/users/${userId}`);
+    if (!response.ok) {
+      throw new Error(`Failed to fetch user data: ${response.status}`);
+    }
+    const data = await response.json();
+    return data;
+  } catch (error) {
+    console.error(`Error fetching user ${userId}:`, error);
+    throw error;
+  }
 }

@@ function formatUserData(data) {
-  return data;
+  return {
+    id: data.id,
+    name: data.name,
+    email: data.email,
+    formattedDate: new Date(data.createdAt).toLocaleDateString()
+  };
 }
*** End Patch
</code></pre></div></div>

<p>This example shows a patch that modifies two different functions in the same file, each with its own <code class="language-plaintext highlighter-rouge">@@</code> context marker.</p>

<h3 id="openais-patch-format-standardization">OpenAI’s Patch Format Standardization</h3>

<p>With the release of GPT-4.1 (April 2025), OpenAI published a “prompt cookbook” detailing this recommended patch format along with a reference implementation (<code class="language-plaintext highlighter-rouge">apply_patch.py</code>). OpenAI noted that GPT-4.1 was trained extensively on this format, which contributes to its reliable use within the Codex CLI ecosystem.</p>

<p>OpenAI’s commentary highlighted that successful formats often <strong>avoid line numbers</strong> and <strong>clearly provide both the code to be replaced and its replacement, using distinct delimiters</strong>. This suggests core principles for reliable AI-driven editing. OpenAI’s ability to co-develop the LLM and the editing tool allows for tight integration and optimization.</p>

<h2 id="aider-a-multi-format-editing-system">Aider: A Multi-Format Editing System</h2>

<p>Aider employs a more flexible approach, supporting multiple edit formats. It can select the format best suited to the task or the specific LLM being used.</p>

<h3 id="pluggable-edit-format-architecture">Pluggable Edit Format Architecture</h3>

<p>Aider uses a system of “coder” classes, each responsible for handling a specific edit format:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Coder</span><span class="p">:</span>
    <span class="c1"># ... attributes like edit_format identifier ...
</span>
    <span class="k">def</span> <span class="nf">get_edits</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="c1"># Parses AI response into edit operations
</span>        <span class="k">raise</span> <span class="nb">NotImplementedError</span>

    <span class="k">def</span> <span class="nf">apply_edits</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">edits</span><span class="p">):</span> <span class="c1"># Applies parsed edits to files
</span>        <span class="k">raise</span> <span class="nb">NotImplementedError</span>
</code></pre></div></div>

<p>This modular design allows for easy extension and selection of different editing strategies.</p>
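<p>As a rough illustration of what a concrete coder does, here is a standalone sketch for the search/replace format. The class name and method shapes are simplified assumptions; Aider's real coder classes carry far more state and error handling:</p>

```python
import re


class EditBlockCoder:
    """Toy sketch of a search/replace coder (simplified, not Aider's code)."""
    edit_format = "diff"

    # Matches: filename, then a <<<<<<< SEARCH ... ======= ... >>>>>>> REPLACE block.
    BLOCK = re.compile(
        r"(\S+)\n<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE",
        re.DOTALL,
    )

    def get_edits(self, response_text):
        """Parse the AI response into (filename, search, replace) tuples."""
        return self.BLOCK.findall(response_text)

    def apply_edits(self, edits, files):
        """Apply each edit to an in-memory `files` dict (path -> content)."""
        for fname, search, replace in edits:
            if search not in files.get(fname, ""):
                raise ValueError(f"SEARCH block not found in {fname}")
            files[fname] = files[fname].replace(search, replace, 1)
        return files
```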

<h3 id="supported-edit-formats-in-aider">Supported Edit Formats in Aider</h3>

<p>Aider supports several formats, choosing based on the model or user configuration (<code class="language-plaintext highlighter-rouge">--edit-format</code>):</p>

<ol>
  <li>
    <p><strong>EditBlock Format (Search/Replace)</strong>: Intuitive format clearly showing search/replace blocks.</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>file.py
&lt;&lt;&lt;&lt;&lt;&lt;&lt; SEARCH
# Code block to find
=======
# Code block to replace with
&gt;&gt;&gt;&gt;&gt;&gt;&gt; REPLACE
</code></pre></div>    </div>
  </li>
  <li>
    <p><strong>Unified Diff Format (udiff)</strong>: Standard diff format (<code class="language-plaintext highlighter-rouge">diff -U0</code> style), suitable for complex changes.</p>

    <div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">--- file.py
</span><span class="gi">+++ file.py
</span><span class="p">@@ -10,7 +10,7 @@</span>
 def some_function():
<span class="gd">-    return "old value"
</span><span class="gi">+    return "new value"
</span></code></pre></div>    </div>
  </li>
  <li>
    <p><strong>OpenAI Patch Format</strong>: Aider implemented OpenAI’s reference format, leveraging GPT-4.1’s training on this syntax.</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*** Begin Patch
*** Update File: file.py
@@ class MyClass:
    def some_function():
-        return "old"
+        return "new"
*** End Patch
</code></pre></div>    </div>
  </li>
  <li>
    <p><strong>Additional Formats</strong>:</p>
    <ul>
      <li><code class="language-plaintext highlighter-rouge">whole</code>: LLM returns the complete modified file content. Simple but potentially inefficient for large files.</li>
      <li><code class="language-plaintext highlighter-rouge">diff-fenced</code>: Diff format variant where the filename is inside the code fence (```), used with models like Gemini.</li>
      <li><code class="language-plaintext highlighter-rouge">editor-diff</code> / <code class="language-plaintext highlighter-rouge">editor-whole</code>: Streamlined versions for specific internal modes.</li>
    </ul>
  </li>
</ol>

<h3 id="flexible-search-strategies">Flexible Search Strategies</h3>

<p>When applying Search/Replace blocks, Aider attempts multiple matching strategies sequentially:</p>

<ol>
  <li>Exact match.</li>
  <li>Whitespace-insensitive match.</li>
  <li>Indentation-preserving match.</li>
  <li>Fuzzy match using <code class="language-plaintext highlighter-rouge">difflib</code>.</li>
</ol>

<p>This layered approach increases the likelihood of successfully applying edits even with minor imperfections in the <code class="language-plaintext highlighter-rouge">SEARCH</code> block.</p>
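<p>A condensed sketch of such a layered search, using <code class="language-plaintext highlighter-rouge">difflib</code> for the fuzzy pass. It collapses the middle two strategies into one whitespace-insensitive pass and omits Aider's indentation-preserving logic:</p>

```python
import difflib


def find_block(haystack_lines, needle_lines, threshold=0.9):
    """Layered block matching sketch: exact, whitespace-insensitive,
    then a difflib similarity scan. Returns (start, end) or None."""
    n = len(needle_lines)
    # 1. Exact match.
    for i in range(len(haystack_lines) - n + 1):
        if haystack_lines[i:i + n] == needle_lines:
            return (i, i + n)
    # 2. Whitespace-insensitive match.
    squish = lambda ls: ["".join(l.split()) for l in ls]
    target = squish(needle_lines)
    for i in range(len(haystack_lines) - n + 1):
        if squish(haystack_lines[i:i + n]) == target:
            return (i, i + n)
    # 3. Fuzzy match: best SequenceMatcher ratio above a threshold.
    best, best_i = 0.0, None
    needle_text = "\n".join(needle_lines)
    for i in range(len(haystack_lines) - n + 1):
        ratio = difflib.SequenceMatcher(
            None, "\n".join(haystack_lines[i:i + n]), needle_text
        ).ratio()
        if ratio > best:
            best, best_i = ratio, i
    if best_i is not None and best >= threshold:
        return (best_i, best_i + n)
    return None
```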

<h3 id="detailed-error-reporting">Detailed Error Reporting</h3>

<p>Aider excels at providing highly informative feedback when edits fail:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 1 SEARCH/REPLACE block failed to match!

## SearchReplaceNoExactMatch: This SEARCH block failed to exactly match lines in src/api.js
&lt;&lt;&lt;&lt;&lt;&lt;&lt; SEARCH
async function fetchUserData(userId) {
  const response = await fetch(`/api/users/${userId}`);
  const data = await response.json();
  return data;
}
=======
...
&gt;&gt;&gt;&gt;&gt;&gt;&gt; REPLACE

Did you mean to match some of these actual lines from src/api.js?


async function fetchUserData(userId) {
    const response = await fetch(`/api/users/${userId}`);
    // Some comment here
    const data = await response.json();
    return data;
}

The SEARCH section must exactly match an existing block of lines including all white space, comments, indentation, docstrings, etc

# The other X SEARCH/REPLACE blocks were applied successfully.
Don't re-send them.
Just reply with fixed versions of the blocks above that failed to match.
</code></pre></div></div>

<p>This feedback is significantly more detailed than simple failure messages. It explains the mismatch, suggests potential correct targets, reiterates the matching rules, and instructs the AI on how to proceed (only resend failed blocks). This detailed guidance greatly improves the AI’s ability to correct its edits.</p>

<p>While adopting OpenAI’s format, Aider enhances it with greater flexibility and substantially more informative error handling.</p>

<h2 id="openhands-blending-traditional-and-ai-assisted-editing">OpenHands: Blending Traditional and AI-Assisted Editing</h2>

<p>OpenHands primarily relies on traditional edit application methods while also incorporating an optional LLM-based editing capability.</p>

<h3 id="traditional-edit-application">Traditional Edit Application</h3>

<p>OpenHands has built-in support for detecting different patch formats – including unified diffs, git diffs, context diffs, ed scripts, and RCS ed scripts – using regular expression patterns. Based on the detected format, it applies the appropriate parsing and application logic. The system supports several traditional editing methods:</p>

<ol>
  <li>String replacement.</li>
  <li>Line-based operations (by number).</li>
  <li>Standard patch application utilities.</li>
</ol>

<p>It includes features like whitespace normalization to handle variations in patch indentation.</p>
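<p>Format detection of this kind can be sketched as a short regex cascade. The patterns below are illustrative, not OpenHands' actual ones; note that git diffs must be checked before plain unified diffs, since they also contain <code class="language-plaintext highlighter-rouge">---</code>/<code class="language-plaintext highlighter-rouge">+++</code> headers:</p>

```python
import re

# Illustrative patterns only; the real detection logic differs.
FORMAT_PATTERNS = [
    ("git diff",     re.compile(r"^diff --git ", re.M)),
    ("unified diff", re.compile(r"^--- .*\n\+\+\+ ", re.M)),
    ("context diff", re.compile(r"^\*\*\* .*\n--- ", re.M)),
    ("ed script",    re.compile(r"^\d+(,\d+)?[acd]$", re.M)),
]


def detect_patch_format(text):
    """Return the first format whose signature appears in the text."""
    for name, pattern in FORMAT_PATTERNS:
        if pattern.search(text):
            return name
    return "unknown"
```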

<h3 id="optional-llm-based-editing-feature">Optional LLM-Based Editing Feature</h3>

<p>OpenHands allows configuring a separate “draft editor” LLM for a distinct editing workflow:</p>

<ol>
  <li><strong>Target Identification</strong>: The primary LLM specifies the target line range for the edit.</li>
  <li><strong>Content Extraction</strong>: The tool extracts this specific code section.</li>
  <li><strong>LLM Rewrite</strong>: The extracted section and a description of the desired change are sent to the specialized “draft editor” LLM. This editor LLM can have different configurations (model, temperature) optimized for editing.</li>
  <li><strong>File Reconstruction</strong>: The tool receives the modified section from the editor LLM and integrates it back into the file, replacing the original lines.</li>
</ol>
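<p>Steps 2 through 4 can be sketched with the draft editor stubbed out as a plain callable (names hypothetical; the real workflow adds prompting and validation around this core):</p>

```python
def rewrite_range(content, start, end, draft_editor):
    """Extract lines [start, end), send them to a draft-editor callable,
    and splice the rewritten block back into the file content."""
    lines = content.splitlines()
    before, target, after = lines[:start], lines[start:end], lines[end:]
    # The editor receives only the extracted section, not the whole file.
    edited = draft_editor("\n".join(target)).splitlines()
    return "\n".join(before + edited + after)
```

<p>In practice <code class="language-plaintext highlighter-rouge">draft_editor</code> would wrap a call to the configured editor LLM; here any function from text to text will do.</p>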

<p>To ensure the draft editor LLM produces the correct output for integration, it is given a specific system prompt instructing it to:</p>

<ul>
  <li>Produce a complete and accurate version of the modified code section.</li>
  <li>Replace any placeholder comments (like <code class="language-plaintext highlighter-rouge"># no changes needed before this line</code>) with the actual, unchanged code from the original section if parts were meant to be preserved.</li>
  <li>Ensure correct and consistent indentation is maintained throughout the block.</li>
  <li>Output the final, complete edited content precisely wrapped in a designated code block format for easy parsing by the tool.</li>
</ul>

<p>Potential benefits of a separate editor LLM:</p>

<ul>
  <li><strong>Task-Specific Tuning</strong>: Optimize parameters specifically for code modification.</li>
  <li><strong>Model Flexibility</strong>: Use different models for reasoning vs. editing.</li>
  <li><strong>Focused Prompting</strong>: Provide the editor LLM with a narrow, edit-specific prompt.</li>
</ul>

<p>The reconstruction process carefully combines the content before the edit, the LLM-edited block, and the content after the edit. Optional validation steps like linting can be performed.</p>

<p>This LLM-based editing appears to be an optional, potentially experimental feature within OpenHands, often disabled by default.</p>

<h2 id="roocode-advanced-search-and-format-preservation">RooCode: Advanced Search and Format Preservation</h2>

<p>RooCode utilizes the search/replace block format. Its strengths lie in its advanced search algorithm for locating the target block and its meticulous handling of code formatting during replacement.</p>

<h3 id="advanced-search-strategy-middle-out-fuzzy-matching">Advanced Search Strategy: Middle-Out Fuzzy Matching</h3>

<p>When an exact match for the search block fails, RooCode employs a ‘middle-out’ fuzzy matching approach via its <code class="language-plaintext highlighter-rouge">MultiSearchReplaceDiffStrategy</code>:</p>

<ol>
  <li><strong>Estimate Region</strong>: Start searching near the expected location (potentially hinted by line numbers).</li>
  <li><strong>Expand Search</strong>: Search outwards from this central point.</li>
  <li><strong>Score Similarity</strong>: Use algorithms like Levenshtein distance to score the similarity between the search block and potential matches in the file.</li>
  <li><strong>Select Best Match</strong>: Choose the highest-scoring match that exceeds a defined threshold.</li>
</ol>

<p>This strategy is effective for large files or when line numbers are slightly inaccurate, providing robustness against minor context shifts.</p>
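<p>A minimal sketch of the middle-out idea, using <code class="language-plaintext highlighter-rouge">difflib</code> ratios as a stand-in for Levenshtein distance (RooCode's actual <code class="language-plaintext highlighter-rouge">MultiSearchReplaceDiffStrategy</code> is considerably richer):</p>

```python
import difflib


def middle_out_search(lines, block, hint, threshold=0.9):
    """Score candidate windows by similarity, visiting positions in
    order of distance from the hinted line. Returns the best start
    index, or None if nothing clears the threshold."""
    n = len(block)
    block_text = "\n".join(block)
    # Candidate start positions, nearest to the hint first.
    starts = sorted(range(len(lines) - n + 1), key=lambda i: abs(i - hint))
    best, best_i = 0.0, None
    for i in starts:
        ratio = difflib.SequenceMatcher(
            None, "\n".join(lines[i:i + n]), block_text
        ).ratio()
        if ratio == 1.0:
            return i  # exact hit near the hint: stop early
        if ratio > best:
            best, best_i = ratio, i
    return best_i if best >= threshold else None
```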

<h3 id="emphasis-on-indentation-preservation">Emphasis on Indentation Preservation</h3>

<p>Incorrect indentation is a common frustration with automated edits. RooCode implements a sophisticated system to preserve formatting:</p>

<ol>
  <li><strong>Capture Original Indentation</strong>: Record the exact leading whitespace (spaces/tabs) of the matched lines in the original file.</li>
  <li><strong>Analyze Relative Indentation</strong>: Calculate the indentation of each line within the replacement block <em>relative</em> to its first line or surrounding block.</li>
  <li><strong>Apply Original Style with Relative Structure</strong>: Re-apply the captured original indentation style while maintaining the calculated relative indentation structure of the replacement code.</li>
</ol>

<p><strong>This detailed attention to indentation is crucial</strong> for code readability and syntactic correctness (especially in languages like Python).</p>
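<p>The three steps can be approximated in a few lines. This sketch only swaps the block's base indentation while keeping the replacement's relative structure; a production implementation also has to handle tabs-versus-spaces and mixed indentation:</p>

```python
def reindent(matched_lines, replacement_lines):
    """Re-apply the original block's base indentation to a replacement
    block, preserving the replacement's own relative indentation."""
    def indent_of(line):
        return line[:len(line) - len(line.lstrip())]

    # 1. Capture the original block's base indentation.
    orig_base = indent_of(matched_lines[0])
    # 2. Measure the replacement's own base indentation.
    repl_base = indent_of(replacement_lines[0])
    out = []
    for line in replacement_lines:
        if not line.strip():
            out.append(line)  # leave blank lines untouched
            continue
        # 3. Swap the replacement's base indent for the original's,
        #    keeping any extra (relative) indentation intact.
        rel = line[len(repl_base):] if line.startswith(repl_base) else line.lstrip()
        out.append(orig_base + rel)
    return out
```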

<h3 id="roocode-edit-process-example">RooCode Edit Process Example</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;&lt;&lt;&lt;&lt;&lt;&lt; SEARCH
:start_line:10
-------
function calculateTotal(items) {
  return items.reduce((sum, item) =&gt; sum + item, 0);
}
=======
function calculateTotal(items) {
  // Add 10% tax
  return items.reduce((sum, item) =&gt; sum + (item * 1.1), 0);
}
&gt;&gt;&gt;&gt;&gt;&gt;&gt; REPLACE
</code></pre></div></div>

<ol>
  <li>
    <p><strong>RooCode parses the diff</strong>:</p>

    <ul>
      <li>Extracts the start line (10)</li>
      <li>Extracts the search block (<code class="language-plaintext highlighter-rouge">function calculateTotal...</code>)</li>
      <li>Extracts the replace block (<code class="language-plaintext highlighter-rouge">function calculateTotal...</code>)</li>
    </ul>
  </li>
  <li>
    <p><strong>RooCode applies the diff</strong>:</p>

    <ul>
      <li>Reads the current content of the file</li>
      <li>Uses fuzzy matching to find the best match for the search block</li>
      <li>Applies the replacement with proper indentation preservation</li>
      <li>Shows a diff view for user approval</li>
      <li>Applies the changes if approved</li>
    </ul>
  </li>
  <li>
    <p><strong>Feedback to LLM</strong>:</p>
    <ul>
      <li>If successful: “Changes successfully applied to file”</li>
      <li>If failed: Detailed error message with the reason for failure</li>
    </ul>
  </li>
</ol>

<p>RooCode combines robust fuzzy matching with a strong focus on maintaining code formatting integrity.</p>

<h2 id="cursor-specialized-ai-for-change-application">Cursor: Specialized AI for Change Application</h2>

<p>While other systems refine edit formats or matching algorithms, Cursor introduces <strong>a dedicated AI model specifically for the <em>application</em> step of the edit process.</strong></p>

<p>This directly addresses the observation that even powerful LLMs, skilled at code generation and reasoning, may struggle to produce perfectly formatted, precisely located diffs that apply cleanly via simple algorithms, particularly in complex files.</p>

<p>Cursor’s approach involves a two-step AI process:</p>

<ol>
  <li><strong>Sketching</strong>: A primary, powerful LLM generates the intended change, focusing on the core logic rather than perfect diff syntax. This might be a code block or a rough description.</li>
  <li><strong>Applying</strong>: A separate, custom-trained “Apply” model receives this sketch. This specialized model is trained to intelligently integrate the sketch into the existing codebase, handling nuances of context, structure, and potential imperfections in the input sketch. It performs more than simple text matching; it aims for intelligent code integration.</li>
</ol>

<p>This strategy separates high-level change generation from the detailed mechanics of application. The primary LLM focuses on <em>what</em> to change, while the specialized Apply model focuses on <em>how</em> to integrate that change robustly and accurately into the file system.</p>

<p>You can hear the Cursor team discuss this approach:</p>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/oFfVt3S51T4?start=1891" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<h2 id="evolution-and-convergence-of-edit-formats">Evolution and Convergence of Edit Formats</h2>

<p>Examining these systems reveals interesting patterns in format development:</p>

<ol>
  <li><strong>Search/Replace Lineage</strong>: Aider’s EditBlock format (<code class="language-plaintext highlighter-rouge">&lt;&lt;&lt;&lt;&lt;&lt;&lt;</code>/<code class="language-plaintext highlighter-rouge">&gt;&gt;&gt;&gt;&gt;&gt;&gt;</code>) established an intuitive approach later adopted by Cline, which RooCode then built upon.</li>
  <li><strong>OpenAI’s Patch Influence</strong>: The specific patch format released with GPT-4.1 gained traction due to focused model training. Used natively by Codex, it was also adopted as an option by Aider.</li>
  <li><strong>Underlying Principles</strong>: Despite different origins, successful formats converge on key ideas noted by OpenAI: <strong>avoiding line numbers</strong> and <strong>clearly delimiting the original and replacement code</strong>. These features appear fundamental for reliable AI-driven editing.</li>
</ol>

<h2 id="conclusion-and-key-learnings">Conclusion and Key Learnings</h2>

<p>Investigating how AI coding assistants edit files reveals complex processes involving sophisticated techniques and evolving strategies.</p>

<p><strong>Key Learnings:</strong></p>

<ol>
  <li><strong>Format Matters</strong>: Formats avoiding line numbers and clearly separating before/after code (like OpenAI’s patch or search/replace blocks) are prevalent and effective, especially when models are trained on them.</li>
  <li><strong>Robust Matching is Essential</strong>: Successful systems employ layered matching strategies (exact, then increasingly fuzzy) to balance precision with the ability to handle minor discrepancies.</li>
  <li><strong>Indentation Integrity is Crucial</strong>: Careful preservation of whitespace and indentation (as emphasized by RooCode) is vital for code correctness and developer acceptance.</li>
  <li><strong>Informative Feedback Enables Correction</strong>: Detailed error messages (like Aider’s) are critical for enabling the AI (or user) to diagnose and fix failed edits effectively.</li>
  <li><strong>Specialization Shows Promise</strong>: Using dedicated AI models for specific sub-tasks like change application (Cursor) represents an advanced approach to improving reliability.</li>
</ol>

<h3 id="considerations-for-tool-builders">Considerations for Tool Builders</h3>

<p>Developing robust AI editing tools involves several considerations:</p>

<ol>
  <li><strong>Implement Layered Matching</strong>: Start with strict matching and add fallback fuzzy strategies.</li>
  <li><strong>Prioritize Indentation Preservation</strong>: Invest effort in accurately maintaining formatting.</li>
  <li><strong>Design Actionable Error Feedback</strong>: Provide specific, informative error messages.</li>
  <li><strong>Leverage Existing Formats and Implementations</strong>: Consider established formats and study open-source systems (Aider, OpenHands, RooCode/Cline).</li>
</ol>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="AI" /><category term="Programming" /><category term="Developer Tools" /><category term="Code Editing" /><summary type="html"><![CDATA[How Codex, Aider, RooCode, and Cursor apply AI-generated changes to files—comparing patch formats, fuzzy matching, and Cursor's specialized apply model.]]></summary></entry><entry><title type="html">GPT Rabbit Hole: The Wild Horses That Weren’t: The Surprising Tale of America’s Free-Roaming Horses</title><link href="https://fabianhertwig.com/blog/there-are-no-wild-horses/" rel="alternate" type="text/html" title="GPT Rabbit Hole: The Wild Horses That Weren’t: The Surprising Tale of America’s Free-Roaming Horses" /><published>2024-11-02T11:00:00+01:00</published><updated>2024-11-02T11:00:00+01:00</updated><id>https://fabianhertwig.com/blog/there-are-no-wild-horses</id><content type="html" xml:base="https://fabianhertwig.com/blog/there-are-no-wild-horses/"><![CDATA[<p>Sometimes, I find myself diving deep into rabbit holes when I’m curious about something. Recently, someone mentioned that wild horses in America aren’t actually wild, and that caught my attention. I started asking ChatGPT about it—how do we know, what’s the history, and more. The conversation turned out to be so interesting that I decided to turn it into a blog post. Everything you’re about to read was written by ChatGPT (the o1-preview version), but I’ve refined it a lot through a long conversation, keeping things, editing, and rewriting. Enjoy!</p>

<hr />

<p><strong>The Wild Horses That Weren’t: The Surprising Tale of America’s Free-Roaming Horses</strong></p>

<p><em>Picture this:</em> You’re gazing out over a sprawling plain as the sun sets in a blaze of oranges and purples. In the distance, a herd of horses gallops freely, manes flying like they’re in some kind of shampoo commercial. It’s wild, it’s raw, it’s… the epitome of freedom.</p>

<p>But here’s the twist: Those “wild” horses? Not exactly wild.</p>

<p>We’re about to dive into a time-traveling adventure that flips the script on everything you thought you knew about these majestic creatures roaming free across America.</p>

<p>Let’s rewind—way back to about 55 million years ago. North America was the original homeland of horses. Imagine a creature the size of a small dog, with multiple toes, munching on leaves in a lush forest. Meet <strong>Eohippus</strong>, the ancient ancestor of the modern horse.</p>

<p><img src="/assets/images/wild_horses/Eohippus.webp" alt="illustration of an Eohippus" /></p>

<p>Over millions of years, these early forest-dwelling horses evolved into larger, single-toed grazers adapted to open grasslands. They transitioned from nibbling on soft leaves to munching on tough grasses. North America became a vast playground of grasslands where these horses thrived.</p>

<p>Then, around the end of the last Ice Age, approximately 10,000 years ago, horses disappeared from North America. One moment they were everywhere, and then, over a relatively short period, they were gone.</p>

<p><strong>What happened?</strong></p>

<p>The end of the Pleistocene epoch brought dramatic changes to the environment. Climate shifts transformed vast grasslands into forests and tundra. Food sources dwindled, and the habitats horses relied on changed faster than they could adapt.</p>

<p>And then humans showed up—hungry, resourceful humans who looked at horses and saw a walking buffet. The exact reasons are still debated among scientists, but most agree that a mix of climate change and human activities played a role in the horses’ extinction on the continent.</p>

<p>For thousands of years after that, North America was a land without horses. Indigenous peoples built rich, complex societies without ever knowing the thunder of hooves across the plains. No galloping across the plains, no epic horse-mounted hunts—just humans and their own two feet (and sometimes canoes).</p>

<p>Fast forward to 1492. Columbus accidentally bumps into the Americas while looking for a shortcut to India. Over the next couple of centuries, European explorers and settlers arrive, bringing all sorts of things—some good, some bad, and some that would change the continent forever.</p>

<p>One of those things was the horse.</p>

<p>The Spanish, in particular, were big on horses. They used them for exploration, conquest, and just looking generally intimidating. These were domesticated horses, trained and bred for human use.</p>

<p><img src="/assets/images/wild_horses/spanish.webp" alt="illustration of spanish riding horses" /></p>

<p>But horses being the free-spirited creatures they are, some managed to escape. Maybe they got spooked during a thunderstorm, or maybe they just decided they’d had enough of carrying conquistadors around. Whatever the case, these escapees started living it up in the wild, doing horse things—eating grass, making babies, and rediscovering their ancestral homeland.</p>

<p>These free-roaming horses came to be known as “mustangs,” a term derived from the Spanish word <em>mestengo</em>, meaning “stray animal.”</p>

<p>Now, here’s where things get really interesting.</p>

<p>Indigenous peoples, who had been horseless for millennia, suddenly had these new, strange creatures roaming their lands. Over time, they figured out how to catch them, tame them, and incorporate them into their daily lives. And they didn’t just stop at basic domestication—they became some of the finest horsemen the world has ever seen.</p>

<p>By the 17th and 18th centuries, tribes like the Comanche, Sioux, and Cheyenne had become master horsemen. The Comanche, in particular, went from being pedestrian hunters to some of the most skilled horse riders and warriors the world had ever seen—essentially inventing new forms of mounted warfare on the Plains.</p>

<p>They embraced the horse with ingenuity and adaptability. Horses revolutionized hunting (buffalo hunting went from being really hard to ridiculously efficient), travel (why walk when you can ride?), and warfare (now with more horsepower and mobility).</p>

<p>So yes, indigenous peoples like the Comanche didn’t always have horses. But once horses were reintroduced by Europeans, they adopted them quickly and integrated them deeply into their cultures, showcasing remarkable adaptability.</p>

<p>Alright, let’s tackle the elephant—or rather, the horse—in the room.</p>

<p>We often call these free-roaming horses “wild,” but technically, they’re “feral.” What’s the difference?</p>

<ul>
  <li>
    <p><strong>Wild Horses:</strong> Horses that have never been domesticated. The Przewalski’s horse from Mongolia was long considered the only true wild horse species left.</p>
  </li>
  <li>
    <p><strong>Feral Horses:</strong> Horses that are descended from domesticated ancestors but now live in the wild.</p>
  </li>
</ul>

<p>Interestingly, recent genetic studies suggest that Przewalski’s horses may themselves descend from early domesticated horses, blurring the lines even further. The scientific community continues to explore this, but for now, it’s clear that the mustangs of the American West are feral horses.</p>

<p>But let’s be honest, “feral horse” doesn’t have the same romantic ring to it. It sounds like a horse that’s going to rummage through your trash. “Feral West” sounds more like a post-apocalyptic movie than the stuff of legends.</p>

<p>Scientists love a good mystery. By digging up fossils, they use radiocarbon dating to figure out when these ancient horses lived and DNA analysis to understand how they’re related to modern horses. They’ve confirmed that there’s a significant gap between the ancient horses that went extinct around 10,000 years ago and the modern horses reintroduced by Europeans.</p>

<p><strong>Radiocarbon Dating Explained (Without Melting Your Brain):</strong></p>

<p>All living things contain carbon, and a tiny fraction of that carbon is a radioactive type called carbon-14. While an organism is alive, it keeps a steady amount of carbon-14 because it’s constantly eating, breathing, or otherwise exchanging carbon with its environment.</p>

<p>When the organism dies, it stops taking in new carbon. The carbon-14 it has starts to decay at a known rate—a half-life of about 5,730 years. This means that every 5,730 years, half of the carbon-14 decays into nitrogen.</p>

<p>Scientists can measure how much carbon-14 is left in a fossil and, knowing the rate of decay, calculate how long it’s been since the organism died. This method is effective for dating materials up to about 50,000 years old, which covers the timeframe we’re talking about.</p>
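That decay arithmetic fits in a few lines of code. Here is a sketch of the calculation; the only number taken from the text is the 5,730-year half-life:

```python
import math

HALF_LIFE_YEARS = 5_730  # half-life of carbon-14

def age_from_c14_fraction(fraction_remaining: float) -> float:
    """Estimate years since death from the fraction of carbon-14 left.

    Solves fraction = (1/2) ** (t / half_life) for t.
    """
    if not 0 < fraction_remaining <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return HALF_LIFE_YEARS * math.log2(1 / fraction_remaining)

# A bone with 25% of its carbon-14 left died two half-lives ago.
print(round(age_from_c14_fraction(0.25)))  # 11460
```

Note how quickly the method runs out of signal: after 50,000 years (roughly nine half-lives) less than 0.2% of the original carbon-14 remains, which is why the technique tops out around that age.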

<p>Using this method, scientists determined that native North American horses disappeared around the end of the last Ice Age. The absence of horse fossils after that time and the lack of horse imagery in indigenous cultures before European contact support this timeline.</p>

<p>The reintroduction of horses didn’t just change things for the horses—it reshaped entire cultures.</p>

<p>For indigenous peoples, the horse was a game-changer. It altered hunting practices, made long-distance travel more feasible, and became central to warfare and status. Societies evolved rapidly, and the horse became woven into myths, stories, and identities.</p>

<p>Their ingenuity and adaptability in integrating the horse into their cultures is a testament to the resilience and innovation of Native American tribes.</p>

<p><img src="/assets/images/wild_horses/natives.webp" alt="illustration of natives on horseback" /></p>

<p>But it’s not all majestic gallops into the sunset. Free-roaming horses have a significant ecological impact.</p>

<ul>
  <li>
    <p><strong>Overgrazing:</strong> Horses can overgraze vegetation, leading to soil erosion and degradation of habitats.</p>
  </li>
  <li>
    <p><strong>Competition with Native Species:</strong> They compete with native wildlife like pronghorns and bighorn sheep for food and water resources.</p>
  </li>
  <li>
    <p><strong>Water Sources:</strong> Their presence can affect riparian areas, impacting water quality and availability for other species.</p>
  </li>
</ul>

<p>This has led to debates about how to manage their populations humanely and sustainably. While they may not be “wild” in the technical sense, they’re undeniably part of the American landscape now, and balancing their presence with environmental conservation is an ongoing challenge.</p>

<p>Horses in America are like that friend who moves away in elementary school and then suddenly shows up years later, totally transformed. They originated here, disappeared for a long time, and then came back under completely different circumstances.</p>

<p>They’re symbols of freedom and wildness, yet their very existence here is tied to human history and intervention. They represent both the untamed spirit of nature and the complex ways humans interact with the environment.</p>

<p>At the end of the day, the story of America’s free-roaming horses is a reminder that history is full of twists, turns, and unexpected returns. Horses have become an enduring symbol of the American West—not because they’ve always been here, but because of the incredible journey they’ve taken alongside us.</p>

<p>They embody resilience, adaptability, and that wild streak that runs through the heart of the American identity.</p>

<p>So, the next time you see a photo or painting of mustangs running free across the plains, you’ll know the real story. It’s not just a scene of wild beauty; it’s a complex tapestry of evolution, extinction, reintroduction, and cultural transformation.</p>

<p>And that’s way cooler than any myth.</p>

<p><strong>P.S.</strong> If you ever get the chance to see these horses in the wild (or, well, the “feral”), take a moment to appreciate the epic saga they’ve been part of. They’re not just horses; they’re living history galloping across the plains.</p>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="ChatGPT" /><category term="Curiosity" /><summary type="html"><![CDATA[America's 'wild' horses are actually feral descendants of Spanish imports. A ChatGPT-assisted deep dive into 55 million years of horse evolution and extinction.]]></summary></entry><entry><title type="html">Streamlining Corporate Decision-Making, Insights from Jeff Bezos</title><link href="https://fabianhertwig.com/blog/fast-decision-making/" rel="alternate" type="text/html" title="Streamlining Corporate Decision-Making, Insights from Jeff Bezos" /><published>2024-01-07T17:00:00+01:00</published><updated>2024-01-07T17:00:00+01:00</updated><id>https://fabianhertwig.com/blog/fast-decision-making</id><content type="html" xml:base="https://fabianhertwig.com/blog/fast-decision-making/"><![CDATA[<p>In a <a href="https://youtu.be/DcWqzZ3I2cY?si=n8Q9-EhUMpbKphL4">recent Lex Fridman podcast</a>, Jeff Bezos shared essential leadership insights, emphasizing the need for speed and truth in business decision-making. He discussed strategies for companies to rapidly reach decisions and avoid excessive deliberation. Bezos also delved into practices that facilitate the pursuit of truth, crucial for effective and informed decision-making in the corporate world.</p>

<h1 id="how-to-become-fast-at-decision-making">How to become fast at decision making</h1>
<p>When discussing the concept of decision-making in businesses, Jeff Bezos’s insights provide a profound perspective, particularly on the common pitfall many companies face: failing to recognize “two-way doors.” This failure often leads to unnecessary delays and hinders agility in the corporate world.</p>

<p><img src="/assets/images/decision_making/one-way-door.png" alt="illustration of a one way door" /></p>

<p>Most decisions in a business, according to Bezos, are “two-way doors.” These are decisions that are reversible and less critical. If a mistake is made, it’s relatively easy to backtrack and choose a different path. However, Bezos advocates that these <strong>two-way door decisions should primarily be made by single individuals</strong> or very small teams within the organization. This approach empowers teams to act swiftly and efficiently, avoiding the trap of over-deliberation.</p>

<p>Many companies treat these decisions as if they are “one-way doors” - significant, irreversible choices that require extensive deliberation. This cautious approach, while prudent for genuinely critical decisions, becomes a hindrance when applied indiscriminately. <strong>By applying the heavy, slow-moving process meant for one-way doors to all decisions, companies inadvertently stall their progress.</strong> They spend excessive time analyzing and deliberating choices that could be made quickly and adjusted if necessary. This not only slows down the decision-making process but also stifles innovation and responsiveness to changing market conditions.</p>

<p>Bezos’s philosophy at Amazon was to empower individuals and small teams to make two-way door decisions swiftly, reserving the meticulous, slower approach for the true one-way doors. This balance between caution and speed is crucial. It allows businesses to move quickly on most fronts while still being deliberate where it counts.</p>

<h1 id="getting-to-the-truth">Getting to the truth</h1>

<h2 id="tackling-groupthink-in-meetings">Tackling Groupthink in Meetings:</h2>
<p>Groupthink is a common phenomenon in meetings, especially those involving individuals of varying seniority. Jeff Bezos sheds light on this issue, emphasizing that <strong>when a senior member expresses their opinion first, it can inadvertently influence the thoughts of others.</strong> This dynamic leads to a situation where diverse opinions may get suppressed or altered in favor of aligning with the leader’s view.</p>

<p>But why does this happen, even among the most competent and confident individuals? The answer lies in our inherent nature as social beings. As Bezos points out, <strong>humans are not primarily truth-seeking; we are social animals.</strong> Our survival and success have historically depended on our ability to cooperate and align with our social groups. This instinct is deeply ingrained and can subtly influence our behavior in group settings.</p>

<p>In a meeting, when a respected or senior figure voices their opinion, <strong>it triggers an almost instinctual response in others to align with that viewpoint.</strong> This isn’t necessarily about agreement or disagreement on a rational level; it’s about the social dynamics of respect, authority, and the desire for harmony within the group. Even highly experienced and intelligent individuals are not immune to this social influence.</p>

<p>To counteract this, Bezos practices speaking last in meetings. Ideally, participants state their opinions from most junior to the most senior role, ensuring that all voices are heard in an unfiltered manner. This approach not only encourages honest expression but also highlights the importance of every team member’s perspective.</p>

<h2 id="the-peril-of-proxies">The Peril of Proxies:</h2>
<p>In business, the use of proxies – indirect measures to gauge performance or success – is common. Yet, as Jeff Bezos highlights, the management of these proxies can often lead to skewed decisions and strategies. This usually happens when organizations lose touch with the original purpose behind these proxies.</p>

<p>One major issue is organizational inertia. Over time, the reasons behind the selection of certain metrics as proxies can get lost in the shuffle of daily operations. Teams might continue tracking these metrics out of habit, not because they still provide relevant or useful insights. What made sense as a proxy five years ago might not be relevant today. Markets evolve, consumer behaviors shift, and what once was a reliable indicator of success or performance might now be outdated or misleading. <strong>This evolution can render once-crucial proxies ineffective, yet companies may continue to rely on them without recognizing their diminished relevance.</strong></p>

<p>There’s often a lack of critical reassessment of proxies. In many organizations, questioning the validity and effectiveness of established metrics is not a regular practice. This lack of scrutiny can lead to a situation where businesses optimize for metrics that no longer align with their current goals or market realities.</p>

<p>To avoid the pitfalls of mismanaged proxies, Bezos suggests fostering a culture that continuously questions and reassesses these metrics. It’s vital for organizations to regularly review their proxies to ensure they still represent their true objectives and adapt to the dynamic nature of the business environment. This ensures that decision-making and strategy remain focused on actual goals, not just the numbers that are meant to represent them.</p>

<h2 id="revolutionizing-meetings-with-the-6-page-memo">Revolutionizing Meetings with the 6-Page Memo:</h2>

<p>Jeff Bezos’ introduction of the 6-page memo to meetings at Amazon and Blue Origin marks a significant departure from traditional corporate meeting practices. This method ensures that every participant is not just physically present but also intellectually engaged with the matter at hand.</p>

<p>At the core of this approach is the ‘study hall’ session, where the meeting commences with everyone silently reading a narratively structured memo for about 30 minutes. This practice counters a common problem in many companies where participants come to meetings either unprepared or having only skimmed through the pre-read materials. In such scenarios, discussions can lack depth and understanding, leading to surface-level conversations and often, misguided decisions.</p>

<p><img src="/assets/images/decision_making/6-page-memo.png" alt="Illustration of a meeting, where everyone is reading a memo" /></p>

<p>Another critical issue in traditional meetings is the reliance on <strong>PowerPoint presentations, which Bezos views as a persuasion tool rather than a means for truth-seeking.</strong> Presentations with slides filled with <strong>bullet points can be misleading, allowing for vague and incomplete information</strong> to be conveyed. This method often leads to discussions that are more about aligning with the presenter’s perspective rather than delving into the actual substance of the issue.</p>

<p>In contrast, the 6-page memo demands comprehensive thinking and clarity from the author. <strong>This rigorous process of writing, rewriting, and editing ensures that the author presents their best thinking, leaving little room for ambiguity or half-baked ideas.</strong> For the participants, this means they are not spending time trying to extract the presenter’s thoughts during the meeting but are instead coming in with a full understanding of the subject.</p>

<p>After the reading session, the meeting transforms into a dynamic discussion space, often described by Bezos as ‘messy.’ Here, the real problem-solving occurs, with participants exploring solutions based on a shared understanding developed through the memo. This method is especially effective in preventing higher-ranking individuals from unduly influencing the discussion, as everyone’s input is based on the same detailed document.</p>

<p>This approach also addresses the common problem of interruptions in meetings. In traditional settings, senior executives often interject with questions, some of which would be addressed later in the presentation. The memo approach eliminates this by providing all the necessary information upfront, allowing for a more structured and focused discussion.</p>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="Decision Making" /><category term="Business" /><category term="Management" /><category term="Leadership" /><summary type="html"><![CDATA[Jeff Bezos on two-way doors, why senior leaders should speak last, the danger of proxies, and how 6-page memos beat PowerPoint for truth-seeking.]]></summary></entry><entry><title type="html">Unlocking the Power of First Principles Thinking: A Timeless Approach to Innovation and Problem-Solving</title><link href="https://fabianhertwig.com/blog/first-principles-thinking/" rel="alternate" type="text/html" title="Unlocking the Power of First Principles Thinking: A Timeless Approach to Innovation and Problem-Solving" /><published>2023-03-25T21:00:00+01:00</published><updated>2023-03-25T21:00:00+01:00</updated><id>https://fabianhertwig.com/blog/first-principles-thinking</id><content type="html" xml:base="https://fabianhertwig.com/blog/first-principles-thinking/"><![CDATA[<p>First principles thinking is the superpower that many attribute to Elon Musk’s success. With Tesla, he revolutionized the automotive industry: many initially doubted that he could build an electric car company at all, yet other manufacturers have since followed his lead in committing to building only electric cars. With SpaceX, Musk cut the cost of bringing payloads to orbit roughly tenfold by making boosters land and fly again on future missions. Here too, experts doubted that landing a rocket was possible at all.
But Musk’s first principles thinking dictated that if it didn’t violate the laws of physics, it must be possible. After multiple attempts, he successfully demonstrated the viability of this innovative approach.</p>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/bvim4rsNHkQ" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>So what is first principles thinking and how can you become a first principles thinker?</p>

<h1 id="what-is-first-principles-thinking">What is First Principles Thinking</h1>

<p>The term “first principles” comes from philosophy and physics. There, a first principle is a fundamental truth that cannot be broken down any further. From such fundamental truths, you can reason up and explain more complex processes.</p>

<p>Elon Musk about his way of thinking at TED <sup id="fnref:way_of_thinking_ted" role="doc-noteref"><a href="#fn:way_of_thinking_ted" class="footnote" rel="footnote">1</a></sup>:</p>
<blockquote>
  <p>Well, I do think there’s a good framework for thinking. It is physics. You know, the sort of first principles reasoning. What I mean by that is, boil things down to their fundamental truths and reason up from there, as opposed to reasoning by analogy. Through most of our life, we get through life by reasoning by analogy, which essentially means copying what other people do with slight variations. And you have to do that. Otherwise, mentally, you wouldn’t be able to get through the day. But when you want to do something new,
you have to apply the physics approach. Physics is really figuring out how to discover new things that are counterintuitive, like quantum mechanics.</p>
</blockquote>

<h2 id="first-principles-thinking-in-philosophy-aristotles-approach">First Principles Thinking in Philosophy: Aristotle’s Approach</h2>
<p><img src="/assets/images/first_principles/aristoteles.png" alt="picture of Aristoteles with an abstract concept of the unmoved mover in the background" /></p>

<p>First principles thinking has its roots in the philosophical teachings of Aristotle, who used this approach to uncover fundamental truths and build a coherent understanding of the world. He called these fundamental truths “archai,” which translates to “beginnings” or “principles.” In his philosophical inquiries, Aristotle sought to understand the essence of things by identifying their fundamental building blocks and deriving knowledge from these basic truths.</p>

<p>To Aristotle, first principles were self-evident and indubitable truths that could not be derived from any other principles. He believed that by starting with these first principles, one could logically deduce other truths and build a solid foundation for understanding the world. Aristotle’s process of reaching these first principles involved a technique called “dialectic,” which was an exploration of different opinions and viewpoints through dialogue and questioning. By engaging in dialectic, Aristotle aimed to peel away the layers of complexity and ambiguity, ultimately revealing the foundational principles that underlie various phenomena.</p>

<p>One famous example of Aristotle’s application of first principles thinking is his concept of the “unmoved mover.” He reasoned that if everything in the universe is in motion, there must be a cause for this motion, and that cause must be something that is itself unmoving. By identifying the unmoved mover as a first principle, Aristotle developed a comprehensive metaphysical framework that accounted for the motion and change observed in the world.</p>

<h2 id="descartes-radical-doubt-i-think-therefore-i-am">Descartes’ Radical Doubt: “I think, therefore I am”</h2>

<p>René Descartes, the 17th-century French philosopher, began his philosophical journey by doubting everything he knew or believed to be true. He questioned the reliability of his senses, the existence of the external world, and even the validity of his own thoughts. Through this process of relentless questioning and doubt, Descartes aimed to identify the most fundamental and self-evident truths, from which he could construct a solid and unshakable foundation for his philosophical system.</p>

<p>In his quest for certainty, Descartes arrived at the realization that the very act of doubting and thinking proved his own existence. He reasoned that even if he doubted everything else, he could not doubt the fact that he was doubting and thinking. This simple yet profound insight led to his famous declaration, “I think, therefore I am.”</p>

<p>This statement encapsulates Descartes’ first principles approach to knowledge and understanding. By questioning everything he identified a fundamental and self-evident truth: his own existence as a thinking being. From this indubitable starting point, Descartes went on to build a comprehensive philosophical system that encompassed the nature of reality, the existence of God, and the relationship between the mind and the body.</p>

<p><img src="/assets/images/first_principles/descartes.png" alt="picture of Descartes looking doubting at the viewer" /></p>

<h2 id="the-power-of-first-principles-in-physics">The Power of First Principles in Physics</h2>

<p>In the late 17th century, a young and inquisitive man named Isaac Newton was studying at the University of Cambridge. A brilliant and curious student, Newton was captivated by the mysteries of the natural world and constantly sought to uncover the fundamental principles that governed it.</p>

<p>One day, as Newton sat beneath an apple tree in the university’s garden, he witnessed an apple falling from a branch above. This simple event sparked a profound question in his mind: what force causes objects to fall towards the ground? Inspired by this question, Newton embarked on a journey to discover the underlying principles that governed the motion of objects on Earth and in the heavens.</p>

<p>Through diligent research and experimentation, Newton formulated his groundbreaking laws of motion, which described the relationship between the forces acting on an object and its motion. Armed with these principles, Newton sought to apply them to the celestial realm and understand the motion of the planets and other celestial bodies.</p>

<p>Focusing on the Moon, Newton wondered if the same force that caused the apple to fall from the tree could also be responsible for the Moon’s orbit around the Earth. To investigate this hypothesis, he considered the gravitational force acting on the Moon and the force required to keep it in orbit.</p>

<p><img src="/assets/images/first_principles/apple.png" alt="picture of an apple falling from a tree, the moon directly behind it" /></p>

<p>Through a series of calculations, Newton found that the force needed to maintain the Moon’s orbit matched the gravitational force predicted by an inverse-square law, suggesting that a single force - gravity - was responsible for both the motion of the apple falling to the ground and the Moon’s orbit around the Earth. This revelation was a groundbreaking discovery that unified the understanding of forces acting on both terrestrial and celestial objects.</p>
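Newton’s Moon check can be reproduced in a few lines with modern numbers. The values below (surface gravity, Earth’s radius, the Moon’s distance and orbital period) are standard textbook figures, not from the text above; the point is simply that the two accelerations agree to within about a percent:

```python
import math

# Standard textbook values (assumptions for this sketch)
g_surface = 9.81                  # m/s^2, gravity at Earth's surface
earth_radius = 6.371e6            # m
moon_distance = 3.844e8           # m, mean Earth-Moon distance
moon_period = 27.32 * 24 * 3600   # s, sidereal month

# Inverse-square prediction: gravity weakens with the square of distance.
ratio = moon_distance / earth_radius   # the Moon sits ~60 Earth radii away
predicted = g_surface / ratio**2       # acceleration gravity should give the Moon

# Centripetal acceleration actually needed to keep the Moon on its orbit.
observed = 4 * math.pi**2 * moon_distance / moon_period**2

print(f"predicted: {predicted:.5f} m/s^2, observed: {observed:.5f} m/s^2")
```

Both come out near 0.0027 m/s², roughly 3,600 times weaker than gravity at the surface, exactly what an inverse-square law demands at 60 Earth radii.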

<p>Newton’s insight led to the formulation of his law of universal gravitation, which stated that every object with mass attracts every other object with mass, with a force that is proportional to the product of their masses and inversely proportional to the square of the distance between them. This law provided a unified framework for understanding the motion of objects on Earth and in the heavens, forever changing the way we perceive the universe.</p>

<p>Through his relentless pursuit of knowledge and his ability to reason up from first principles, Isaac Newton revolutionized our understanding of the natural world. His laws of motion and law of universal gravitation continue to serve as foundational principles in physics, shaping our comprehension of the cosmos and inspiring generations of scientists and thinkers to come.</p>

<h2 id="general-approach-to-first-principles-in-physics">General Approach to First Principles in Physics</h2>

<p>The first step in applying first principles thinking in physics is to formulate a hypothesis for a potential fundamental law. This often involves observing the natural world, identifying patterns, and devising a possible explanation or rule that governs the observed behavior. Physicists draw upon their knowledge of existing theories, as well as their creativity and intuition, to propose new hypotheses that can be tested through experimentation.</p>

<p>Once a hypothesis has been formulated, physicists design and conduct experiments to test its validity. These experiments must be carefully controlled and repeatable, allowing for the accurate measurement of relevant variables and the elimination of potential confounding factors. By comparing the experimental results with the predictions made by the hypothesis, physicists can assess whether the proposed law aligns with the observed data.</p>

<p>If the experimental results consistently support the hypothesis, it may become accepted as a first principle or fundamental law in physics. However, if the results contradict the hypothesis, it may need to be revised or discarded in favor of an alternative explanation.</p>

<h1 id="developing-first-principles-thinking-skills">Developing First Principles Thinking Skills</h1>

<p>First principles thinking is a powerful mental tool that can be applied in various aspects of life, not just in physics or philosophy. By learning to think in first principles, you can develop the ability to break down complex problems, identify their fundamental truths, and reason up from there to find innovative and effective solutions. Here are some techniques to help you cultivate first principles thinking:</p>

<ol>
  <li>Ask “why” multiple times: When faced with a problem or question, ask “why” several times to peel back the layers of complexity and uncover the underlying principles. By questioning assumptions and diving deep into the core of the issue, you can identify the fundamental truths from which you can reason up and develop a solution.</li>
  <li>Embrace doubt and skepticism: Cultivate a mindset of doubt and skepticism when tackling problems or beliefs. By questioning everything, even your own thoughts, you can identify the most fundamental and self-evident truths as your starting point. This practice enables you to build a solid foundation for understanding and solving problems, grounded in first principles thinking.</li>
  <li>Challenge analogies: Many of our beliefs are based on analogies that may not be entirely accurate or relevant. To think in first principles, it’s essential to identify and challenge these analogies, scrutinizing their validity and questioning whether they truly apply in the context of the problem at hand.</li>
  <li>Understand analogies and mental models: While first principles thinking encourages reasoning from fundamental truths rather than relying on analogies, it’s still valuable to be familiar with various analogies and mental models. These can serve as useful starting points for your reasoning or doubting process, helping you to identify patterns and connections between seemingly unrelated phenomena. Once you’ve drawn upon these analogies and models, you can then apply first principles thinking to refine and develop your understanding further.</li>
  <li>Break down problems into their basic components: By dissecting complex issues into smaller, more manageable parts, you can analyze each component individually and identify the fundamental principles that govern them. This process will help you gain a deeper understanding of the problem and enable you to build a solution from the ground up.</li>
  <li>Envision the ideal solution and work backwards: Rather than relying solely on familiar tools and methods, take a moment to imagine the perfect solution to the problem at hand. Ask yourself what the ideal outcome would look like and what characteristics it would possess. Once you have a clear vision of the desired solution, work backward to determine the necessary steps and resources to achieve it. This approach encourages you to think beyond the limitations of conventional methods, fostering creativity and innovation in your problem-solving process.</li>
  <li>Embrace curiosity and continuous learning: Developing first principles thinking requires a strong sense of curiosity and a commitment to learning. By nurturing your curiosity and constantly seeking out new knowledge, you’ll be better equipped to identify fundamental truths and reason up from them to tackle complex challenges.</li>
</ol>

<h1 id="first-hand-examples-of-first-principles-thinking">First-Hand Examples of First Principles Thinking</h1>

<p>Here are a few first-hand examples of first principles thinking from Elon Musk. Notice how he applies the skills outlined above.</p>

<p>In the video below he explains how to think from first principles:</p>
<ul>
  <li>Don’t break the laws of physics.</li>
  <li>Think about how things change when you scale something to a very large or very small number. If a part is still expensive when you produce a million a year, then the reason is its design.</li>
  <li>Anything made at volume can be produced for a cost that asymptotically approaches the cost of the raw materials plus intellectual property licensing rights.</li>
  <li>Instead of using the tools and methods that you already know, ask yourself: what would be the perfect solution, and how can you get there?</li>
</ul>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/54OSbbtXrdI" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>Next he explains how people reason by analogy:</p>
<blockquote>
  <p>Batteries are expensive, and people assume they always will be, because they are expensive right now. But if you break down the material costs of batteries, you see that the materials themselves are cheap and it is the assembly that is expensive. These costs can be reduced by improving the assembly process and increasing the scale.</p>
</blockquote>
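The structure of that argument can be sketched as a tiny cost model. All prices and quantities below are made-up round numbers for illustration (the post gives none); what matters is the shape of the reasoning: compute the raw-material floor, compare it with the market price, and treat the gap as the part that scale and process improvements can attack:

```python
# Hypothetical $/kg prices and kg needed per kWh of battery capacity.
material_cost_per_kg = {"nickel": 15.0, "cobalt": 30.0, "aluminum": 2.5,
                        "graphite": 1.0, "steel": 0.8}
kg_per_kwh = {"nickel": 0.8, "cobalt": 0.1, "aluminum": 0.5,
              "graphite": 1.0, "steel": 0.5}

# First-principles floor: what the raw materials alone cost per kWh.
materials_floor = sum(material_cost_per_kg[m] * kg_per_kwh[m] for m in kg_per_kwh)

market_price = 150.0  # assumed $/kWh pack price

print(f"raw materials: ${materials_floor:.2f}/kWh of a ${market_price:.0f}/kWh pack")
print(f"assembly, overhead, margin: ${market_price - materials_floor:.2f}/kWh")
```

Reasoning by analogy stops at the market price; reasoning from first principles notices that most of it is not materials, and is therefore compressible.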

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/L-s_3b5fRd8?start=1385" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>On the cost of rockets, he again breaks the price of a rocket down into the cost of its components and the cost of assembly.</p>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/vDwzmJpI4io?start=1384" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>Elon on what a company is and what profit means:</p>
<blockquote>
  <p>A company is an assembly of people who gather together to create and deliver a product or service. A company has no value in itself; its value lies in being an effective allocator of resources to create goods and services that are worth more than the cost of the inputs. Profit means that, over time, the value of the outputs is greater than the value of the inputs.</p>
</blockquote>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/Y6P8qdanszw?start=66" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>Elon on how he attracts talent:</p>

<blockquote>
  <p>If you want to recruit people who are really talented and driven, you have to state the mission and have a convincing argument for why it matters.
There are three major things in terms of motivation:</p>
  <ul>
    <li>The person enjoys the work itself intrinsically</li>
    <li>The financial compensation is fair and good</li>
    <li>The best people want to know whether what they are doing is going to matter: will people notice their work, will the world be different?</li>
  </ul>
</blockquote>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/BN88HPUm6j0?start=3387" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<h1 id="conclusion">Conclusion</h1>

<p>In conclusion, first principles thinking is a powerful and transformative approach to problem-solving that has been employed by some of the most brilliant minds in history, including Aristotle, Isaac Newton, and Elon Musk. This way of thinking transcends disciplines and can be applied to various aspects of life, from philosophy and physics to business and everyday challenges. By cultivating this mindset and applying it in our own lives, we can unlock our own potential for creative problem-solving and embark on a journey of continuous growth and learning.</p>

<h2 id="references">References</h2>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:way_of_thinking_ted" role="doc-endnote">
      <p>Elon explains first principles thinking at TED: https://youtu.be/IgKWPdJWuBQ?t=1096 <a href="#fnref:way_of_thinking_ted" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="First Principles Thinking" /><category term="Mental Models" /><summary type="html"><![CDATA[From Aristotle to Elon Musk: how breaking problems down to fundamental truths and reasoning up from there leads to breakthrough innovations.]]></summary></entry><entry><title type="html">Python Virtual Environments: The best workflow</title><link href="https://fabianhertwig.com/blog/python-venv-workflow/" rel="alternate" type="text/html" title="Python Virtual Environments: The best workflow" /><published>2023-01-07T11:00:00+01:00</published><updated>2023-01-07T11:00:00+01:00</updated><id>https://fabianhertwig.com/blog/python-venv-workflow</id><content type="html" xml:base="https://fabianhertwig.com/blog/python-venv-workflow/"><![CDATA[<h2 id="update-a-simpler-workflow-with-uv">Update: A Simpler Workflow with <code class="language-plaintext highlighter-rouge">uv</code></h2>

<p><strong>Note:</strong> The workflow described below using <code class="language-plaintext highlighter-rouge">pyenv</code> and <code class="language-plaintext highlighter-rouge">venv</code> is now largely superseded by a fantastic new tool called <a href="https://github.com/astral-sh/uv"><code class="language-plaintext highlighter-rouge">uv</code></a>. While the principles discussed in this post remain relevant, <code class="language-plaintext highlighter-rouge">uv</code> offers a significantly faster and more streamlined experience.</p>

<p><code class="language-plaintext highlighter-rouge">uv</code> acts as a replacement for <code class="language-plaintext highlighter-rouge">pip</code>, <code class="language-plaintext highlighter-rouge">pip-tools</code>, <code class="language-plaintext highlighter-rouge">virtualenv</code>, and <code class="language-plaintext highlighter-rouge">venv</code>, all rolled into one incredibly fast Rust-based package manager and resolver.</p>

<p>With <code class="language-plaintext highlighter-rouge">uv</code>, initializing a new project (which also takes care of the virtual environment) is as simple as running:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>uv init
</code></pre></div></div>

<p>And adding packages is straightforward:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>uv add &lt;package&gt;
<span class="c"># or using the familiar pip install command</span>
uv pip <span class="nb">install</span> &lt;package&gt;
</code></pre></div></div>

<p>Crucially, when using <code class="language-plaintext highlighter-rouge">uv add</code>, it modifies your <code class="language-plaintext highlighter-rouge">pyproject.toml</code> to record the direct dependencies and their version constraints (similar to <code class="language-plaintext highlighter-rouge">package.json</code> in Node.js). <code class="language-plaintext highlighter-rouge">uv</code> then automatically maintains a lock file (<code class="language-plaintext highlighter-rouge">uv.lock</code>) which records the exact versions of all dependencies (direct and transitive) needed for your project. This lock file brings a level of dependency consistency similar to what you might be familiar with from the JavaScript ecosystem (e.g., <code class="language-plaintext highlighter-rouge">npm</code> and <code class="language-plaintext highlighter-rouge">package-lock.json</code>), something I’ve come to appreciate greatly. It ensures reproducible builds across different environments.</p>

<p>Given its speed and simplicity, I now recommend <code class="language-plaintext highlighter-rouge">uv</code> as the primary tool for managing Python virtual environments and dependencies. The rest of this post remains for historical context or if you encounter situations where <code class="language-plaintext highlighter-rouge">uv</code> isn’t suitable.</p>

<h2 id="original-post">Original post</h2>

<p>When I start a new Python project, I use <code class="language-plaintext highlighter-rouge">pyenv install 3.11</code> and <code class="language-plaintext highlighter-rouge">pyenv shell 3.11</code> to install and set the Python version (here 3.11), and then <code class="language-plaintext highlighter-rouge">python -m venv .venv</code> to create a virtual environment that sits in my project folder. Finally, it is activated with <code class="language-plaintext highlighter-rouge">source .venv/bin/activate</code>. I bundled these commands into a function, so that I only need to run <code class="language-plaintext highlighter-rouge">mkpyvenv 3.11</code> and the virtual environment is created and activated.</p>

<h2 id="the-reasons-for-this-workflow">The reasons for this workflow</h2>

<p>Every project should have its own virtual environment, so that each project is independent.</p>

<p>I want to be able to control the Python version for each project, e.g. use 3.8 for one project and 3.11 for another.</p>

<p>I want my virtual environment to be in the project folder, so that when I delete the project folder, the virtual environment is gone with it. That way my system does not get littered with virtual environments I have long forgotten about, as happens with conda or virtualenvwrapper.</p>

<p>Another benefit is that my virtual environment does not have a name that I need to remember. I can always activate it with <code class="language-plaintext highlighter-rouge">source .venv/bin/activate</code> in the project folder. VSCode automatically activates a virtual environment in a <code class="language-plaintext highlighter-rouge">.venv</code> folder, so I don’t even have to do that.</p>

<p>I want to install all my dependencies with <code class="language-plaintext highlighter-rouge">pip</code> and a <code class="language-plaintext highlighter-rouge">requirements.txt</code> or <code class="language-plaintext highlighter-rouge">pyproject.toml</code> file (and not with <code class="language-plaintext highlighter-rouge">conda</code> and an <code class="language-plaintext highlighter-rouge">environment.yml</code>), as a project often ends up running in a Docker container. Since a Docker container is an isolated environment of its own, I don’t want to install yet another environment manager inside it. Also, pip is the Python standard, while conda is only common in the scientific community.</p>

<h2 id="the-workflow-in-practice">The workflow in practice</h2>

<p>This is for macOS.</p>

<h3 id="installation">Installation</h3>

<p>Use <a href="https://brew.sh/index_de">brew</a> to install <a href="https://github.com/pyenv/pyenv#homebrew-in-macos">pyenv</a></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew update
brew install pyenv
</code></pre></div></div>

<p>Then follow the instructions of <code class="language-plaintext highlighter-rouge">pyenv init</code> to load pyenv when starting a shell. For the zsh shell that is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Load pyenv automatically by appending
# the following to ~/.zprofile (for login shells)
# and ~/.zshrc (for interactive shells):

export PYENV_ROOT="$HOME/.pyenv"
command -v pyenv &gt;/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"

# Restart your shell for the changes to take effect.
</code></pre></div></div>

<p>Then install the Python versions that you want to use. With <code class="language-plaintext highlighter-rouge">pyenv versions</code> you can see which ones are already installed.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pyenv install 3.8.16
pyenv install 3.11
...
</code></pre></div></div>

<h3 id="creating-a-virtual-environment">Creating a virtual environment</h3>

<p>Now let us assume we start a new project:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir my_awesome_tool
cd my_awesome_tool
</code></pre></div></div>

<p>Now we set the Python version for the current shell session:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pyenv shell 3.11
</code></pre></div></div>

<p>When you run <code class="language-plaintext highlighter-rouge">python --version</code> you will see that the <code class="language-plaintext highlighter-rouge">Python 3.11.1</code> version is used (or a newer patch version, as we have not been specific there). With <code class="language-plaintext highlighter-rouge">which python</code> you see that the python executable is in the <code class="language-plaintext highlighter-rouge">.pyenv/shims</code> directory. With <code class="language-plaintext highlighter-rouge">pyenv which python</code> you see that the python executable is stored in the pyenv directory <code class="language-plaintext highlighter-rouge">/Users/fabian.hertwig/.pyenv/versions/3.11.1/bin/python</code>.</p>

<p>So let us create a virtual environment in a <code class="language-plaintext highlighter-rouge">.venv</code> folder:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python -m venv .venv
source .venv/bin/activate
</code></pre></div></div>

<p>Now <code class="language-plaintext highlighter-rouge">which python</code> points to the virtual environment: <code class="language-plaintext highlighter-rouge">/Users/fabian.hertwig/Projects/my_awesome_tool/.venv/bin/python</code>. And again, if you run <code class="language-plaintext highlighter-rouge">python --version</code> you will see that <code class="language-plaintext highlighter-rouge">Python 3.11.1</code> is used. If you install a package, e.g. <code class="language-plaintext highlighter-rouge">pip install numpy</code>, it will be stored in the <code class="language-plaintext highlighter-rouge">.venv/lib/python3.11/site-packages/numpy</code> directory.</p>

<h2 id="making-shortcuts">Making shortcuts</h2>

<p>To easily run through that process with just one command <code class="language-plaintext highlighter-rouge">mkpyvenv 3.11</code>, you can add the function below to your shell configuration file, e.g. <code class="language-plaintext highlighter-rouge">.zshrc</code>. Or you can use the awesome <a href="https://fig.io/blog/post/dotfiles-launch">fig tool to create a dot file there</a> which gets shared across all your fig installations.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkpyvenv() {
    # Check if an argument was given
    if [ -z "$1" ]; then
        echo "Please specify a Python version to use, e.g. mkpyvenv 3.9.4"
        return 1
    fi
    PYTHON_VERSION=$1

    # Check if pyenv is installed
    if ! command -v pyenv &gt; /dev/null; then
        echo "pyenv is not installed. Please install it, e.g. by running 'brew install pyenv'"
        return 1
    fi

    # Install the python version if it does not exist
    pyenv install --skip-existing "$PYTHON_VERSION"

    # Create the virtual environment and activate it
    pyenv shell "$PYTHON_VERSION"
    python -m venv .venv
    pyenv shell --unset
    source .venv/bin/activate
}
</code></pre></div></div>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="Python" /><category term="Environments" /><category term="Developing" /><summary type="html"><![CDATA[Why uv is now the best tool for Python environments, plus the pyenv + venv workflow that keeps each project isolated with its own Python version.]]></summary></entry><entry><title type="html">Metrics for Information Retrieval</title><link href="https://fabianhertwig.com/blog/information-retrieval-metrics/" rel="alternate" type="text/html" title="Metrics for Information Retrieval" /><published>2023-01-04T14:00:00+01:00</published><updated>2023-01-04T14:00:00+01:00</updated><id>https://fabianhertwig.com/blog/information-retrieval-metrics</id><content type="html" xml:base="https://fabianhertwig.com/blog/information-retrieval-metrics/"><![CDATA[<p>In the past year I have built a neural search system on top of the awesome <a href="https://github.com/deepset-ai/haystack">Haystack</a> project. One of the tasks was to understand how well different models or algorithms perform for our corpus. Therefore I needed to understand the metrics that are commonly used in information retrieval tasks. I could not find one source that described them neatly, therefor I created this post. The metrics explained in this post are the ones that the <a href="https://github.com/beir-cellar/beir">BEIR</a> benchmark currently reports:</p>

<ul>
  <li><a href="#ndcgk---normalized-discounted-cumulative-gain-at-k">NDCG@k - Normalized Discounted Cumulative Gain at k</a></li>
  <li><a href="#mapk---mean-average-precision-at-k">MAP@k - Mean average precision at k</a></li>
  <li><a href="#precisionk">Precision@k</a></li>
  <li><a href="#recallk">Recall@k</a></li>
  <li><a href="#r_capk---capped-recall">R_cap@k - Capped Recall</a></li>
  <li><a href="#mrrk---mean-reciprocal-rank-at-k">MRR@k - Mean Reciprocal Rank at k</a></li>
</ul>

<p>These metrics measure performance on a ranking task. Ranking is the general task underlying search systems. Given:</p>
<ul>
  <li>a <code class="language-plaintext highlighter-rouge">corpus</code> of documents or passages</li>
  <li>a set of <code class="language-plaintext highlighter-rouge">queries</code></li>
  <li>a set of relevance scores, which define for each query how relevant each document is, in the simplest case by marking them with 0 or 1.</li>
</ul>

<p>The system should retrieve the most relevant documents for each query and show them at the top rank.</p>

<p>Of course the system does not know the relevance scores and has to estimate them on its own. To calculate the metrics, though, you need the true scores. We used a feedback system where users could vote on whether a search result was relevant for their search or not.</p>

<h2 id="example">Example</h2>
<p>I will use the following example to illustrate the metrics.</p>

<h3 id="corpus">Corpus:</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Document 1: "How to train a cat"
Document 2: "How to train a dog"
Document 3: "How to train a parrot"
Document 4: "How to train a hamster"
</code></pre></div></div>
<h3 id="queries">Queries:</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Query 1: "train cat"
Query 2: "train dog"
Query 3: "train parrot"
</code></pre></div></div>

<h3 id="relevance-scores">Relevance scores:</h3>
<p>Let us also assume that the search system returned the documents in this order for each query.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Query 1, Document 1: 1.0 (maximum relevance)
Query 1, Document 2: 0.5
Query 1, Document 3: 0.3
Query 1, Document 4: 0.1

Query 2, Document 1: 0.7
Query 2, Document 2: 1.0 (maximum relevance)
Query 2, Document 3: 0.2
Query 2, Document 4: 0.1

Query 3, Document 1: 0.4
Query 3, Document 2: 0.2
Query 3, Document 3: 1.0 (maximum relevance)
Query 3, Document 4: 0.1
</code></pre></div></div>
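<p>To make the worked examples below easy to check, here is the same example expressed in Python. This sketch makes two assumptions that the text uses implicitly: the system returns the documents in the order listed above, and a document counts as relevant when its score is at least 0.5.</p>

```python
# Graded relevance scores for Documents 1-4, in the order the system
# returned them (assumption: the listed order is the ranking order).
relevance = {
    "Query 1": [1.0, 0.5, 0.3, 0.1],
    "Query 2": [0.7, 1.0, 0.2, 0.1],
    "Query 3": [0.4, 0.2, 1.0, 0.1],
}

THRESHOLD = 0.5  # assumption: scores >= 0.5 count as relevant


def binary_labels(scores, threshold=THRESHOLD):
    """Turn graded scores into binary relevance labels (1 = relevant)."""
    return [1 if score >= threshold else 0 for score in scores]


for query, scores in relevance.items():
    print(query, binary_labels(scores))
# Query 1 [1, 1, 0, 0]
# Query 2 [1, 1, 0, 0]
# Query 3 [0, 0, 1, 0]
```

The later snippets work on these binary labels whenever a metric only distinguishes relevant from non-relevant.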

<h2 id="ndcgk---normalized-discounted-cumulative-gain-at-k">NDCG@k - Normalized Discounted Cumulative Gain at k</h2>

<p>Normalized Discounted Cumulative Gain at k (NDCG@k) is a metric used to evaluate the performance of a ranking model. NDCG@k measures the usefulness of the top k items in the ranking, taking into account both the relevance of the items and their order in the ranking.</p>

<p>To calculate NDCG@k, the discounted cumulative gain (DCG) of the top k items in the ranking is first calculated. The DCG is a measure of the relevance of the items in the ranking, where higher relevance scores are given more weight. The DCG value is calculated by summing the product of the relevance of each item and a discount factor that decreases as the rank of the item increases.</p>

<p>Next, the maximum possible discounted cumulative gain (IDCG) of the top k items is calculated. This is the DCG of the ideal ranking, where the most relevant items appear at the top.</p>

<p>Finally, the NDCG@k score is calculated by dividing the DCG of the top k items by the IDCG of the top k items. The NDCG@k score ranges from 0 to 1, with a higher score indicating a more useful ranking.</p>

<h3 id="formula">Formula</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>NDCG@k = DCG@k / IDCG@k
</code></pre></div></div>
<p>Where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">DCG@k</code> is the discounted cumulative gain for the top k items in the ranking</li>
  <li><code class="language-plaintext highlighter-rouge">IDCG@k</code> is the maximum possible discounted cumulative gain for the top k items</li>
</ul>

<p>The formula for DCG@k is:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DCG@k = ∑ rel_i * (1 / log_2(i+1)) for i = 1 to k
</code></pre></div></div>

<p>Where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">rel_i</code> is the relevance of the item at rank i</li>
  <li><code class="language-plaintext highlighter-rouge">k</code> is the number of items in the top k portion of the ranking</li>
</ul>

<p>The formula for IDCG@k is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>IDCG@k = ∑ max_rel_i * (1 / log_2(i+1)) for i = 1 to k
</code></pre></div></div>

<p>Where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">max_rel_i</code> is the relevance of the item at rank i in the ideal ranking, i.e. with all items sorted by relevance in descending order</li>
</ul>

<h3 id="example-1">Example:</h3>
<p>Here is an example of how NDCG@k can be calculated for the corpus, set of queries, and set of relevance scores from above:</p>

<p>Let’s say we want to calculate NDCG@2 for each query. First, we need to calculate the DCG@2 and IDCG@2 values for each query.</p>

<p>For Query 1:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DCG@2 = 1.0 + 0.5 * (1 / log_2(3)) = 1.0 + 0.5 * (1 / 1.585) = 1.32
IDCG@2 = 1.0 + 0.5 * (1 / log_2(3)) = 1.0 + 0.5 * (1 / 1.585) = 1.32
NDCG@2 = DCG@2 / IDCG@2 = 1.32 / 1.32 = 1.0
</code></pre></div></div>
<p>For Query 2:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DCG@2 = 0.7 + 1.0 * (1 / log_2(3)) = 0.7 + 1.0 * (1 / 1.585) = 1.33
IDCG@2 = 1.0 + 0.7 * (1 / log_2(3)) = 1.0 + 0.7 * (1 / 1.585) = 1.44
NDCG@2 = DCG@2 / IDCG@2 = 1.33 / 1.44 = 0.92
</code></pre></div></div>
<p>For Query 3:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DCG@2 = 0.4 + 0.2 * (1 / log_2(3)) = 0.4 + 0.2 * (1 / 1.585) = 0.53
IDCG@2 = 1.0 + 0.4 * (1 / log_2(3)) = 1.0 + 0.4 * (1 / 1.585) = 1.25
NDCG@2 = DCG@2 / IDCG@2 = 0.53 / 1.25 = 0.42
</code></pre></div></div>
<p>In this example, the ranking for Query 1 has an NDCG@2 score of 1.0, which means it is a perfect ranking. The ranking for Query 2 has an NDCG@2 score of 0.92, which means it is a good ranking but not perfect. The ranking for Query 3 has an NDCG@2 score of 0.42, which means it is a lower quality ranking.</p>
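<p>The calculation can be written as a short Python function. This is a minimal sketch using the standard definition of IDCG, where the scores are sorted in descending order to form the ideal ranking, and assuming the documents are ranked in the order listed above.</p>

```python
import math


def dcg_at_k(scores, k):
    """Discounted cumulative gain of scores given in ranked order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(scores[:k], start=1))


def ndcg_at_k(scores, k):
    """NDCG@k; the ideal ranking sorts the scores in descending order."""
    idcg = dcg_at_k(sorted(scores, reverse=True), k)
    return dcg_at_k(scores, k) / idcg if idcg > 0 else 0.0


# Query 1 from the example is already ideally ranked:
print(round(ndcg_at_k([1.0, 0.5, 0.3, 0.1], 2), 2))  # 1.0
# Query 3 ranks the most relevant document third:
print(round(ndcg_at_k([0.4, 0.2, 1.0, 0.1], 2), 2))  # 0.42
```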

<h2 id="mapk---mean-average-precision-at-k">MAP@k - Mean average precision at k</h2>

<p>Mean Average Precision at k (MAP@k) is a metric used to evaluate the performance of a ranking model, particularly in information retrieval tasks such as search engines. It measures the average precision of the top k items in the ranking, taking into account both the relevance of the items and their order in the ranking.</p>

<p>Precision is a measure of the proportion of relevant items in the ranking. For example, if a ranking contains 4 items and 2 of them are relevant, the precision of the ranking is 0.5.</p>

<p>To calculate MAP@k, the precision of the top k items in the ranking is first calculated for each query. The precision is calculated as the number of relevant items in the top k portion of the ranking divided by k. Then, the average precision is calculated by taking the mean of the precision values across all queries.</p>

<p>The formula for MAP@k is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>MAP@k = 1/Q * ∑ (Precision@k of query q) for q = 1 to Q
</code></pre></div></div>

<p>Where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Q</code> is the number of queries</li>
  <li><code class="language-plaintext highlighter-rouge">Precision@k of query q</code> is the precision of the top k items in the ranking for query q</li>
</ul>

<p>The MAP@k score ranges from 0 to 1, with a higher score indicating a more useful ranking.</p>

<h3 id="example-2">Example</h3>
<p>Let’s say we want to calculate MAP@2 for each query, using a relevance threshold of 0.5 (a document counts as relevant if its score is at least 0.5). First, we need to calculate the precision@2 for each query.</p>

<p>For Query 1:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precision@2 = (number of relevant items in top 2) / 2
= (2 relevant items) / 2
= 1.0
</code></pre></div></div>
<p>For Query 2:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precision@2 = (number of relevant items in top 2) / 2
= (2 relevant items) / 2
= 1.0
</code></pre></div></div>
<p>For Query 3:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precision@2 = (number of relevant items in top 2) / 2
= (0 relevant items) / 2
= 0.0
</code></pre></div></div>
<p>Then, the MAP@2 score is calculated as the mean of the precision@2 values:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>MAP@2 = 1/3 * (1.0 + 1.0 + 0.0)
      = 1/3 * 2.0
      = 0.67
</code></pre></div></div>
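<p>A minimal sketch of the calculation in Python, assuming binary relevance labels derived with the inclusive 0.5 threshold (so both top-2 documents count as relevant for Query 1 and Query 2):</p>

```python
def precision_at_k(labels, k):
    """labels: binary relevance of the ranked results (1 = relevant)."""
    return sum(labels[:k]) / k


def map_at_k(labels_per_query, k):
    """Mean of Precision@k over all queries."""
    return sum(precision_at_k(labels, k)
               for labels in labels_per_query) / len(labels_per_query)


# One label list per query, in ranked order (from the example above):
labels_per_query = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0]]
print(round(map_at_k(labels_per_query, 2), 2))  # (1.0 + 1.0 + 0.0) / 3 = 0.67
```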

<h2 id="precisionk">Precision@k</h2>

<p>Precision@k measures the proportion of the top k items in the ranking that are relevant.</p>

<p>The Precision@k score is calculated by dividing the number of relevant items in the top k portion of the ranking by k. The Precision@k score ranges from 0 to 1, with a higher score indicating a more useful ranking.</p>

<p>The formula for Precision@k is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Precision@k = (number of relevant items in top k) / k
</code></pre></div></div>

<p>For example, suppose we have a corpus containing 10 documents, and we want to calculate the Precision@5 score for a given query. If the top 5 documents in the ranking contain 3 relevant documents, the Precision@5 score would be calculated as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Precision@5 = (number of relevant items in top 5) / 5
            = 3 / 5
            = 0.6
</code></pre></div></div>
<p>In this example, the Precision@5 score is 0.6, which means that 60% of the top 5 documents in the ranking are relevant.</p>

<p>The Precision@k metric is defined even when there are no relevant documents in the corpus, whereas the Recall@k metric is not defined in this case. This makes Precision@k a useful evaluation metric when the number of relevant documents in the corpus is small or when the relevance threshold is set very high.</p>

<h3 id="example-3">Example</h3>

<p>Let’s say we want to calculate Precision@2 for each query, using a relevance threshold of 0.5. First, we need to count the number of relevant items in the top 2 items of the ranking for each query.</p>

<p>For Query 1:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>number of relevant items in top 2 = 2
Precision@2 = (number of relevant items in top 2) / 2
= 2 / 2
= 1.0
</code></pre></div></div>

<p>For Query 2:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>number of relevant items in top 2 = 2
Precision@2 = (number of relevant items in top 2) / 2
= 2 / 2
= 1.0
</code></pre></div></div>

<p>For Query 3:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>number of relevant items in top 2 = 0
Precision@2 = (number of relevant items in top 2) / 2
= 0 / 2
= 0.0
</code></pre></div></div>

<p>In this example, the Precision@2 scores are 1.0, 1.0, and 0.0 for Query 1, Query 2, and Query 3, respectively. This means that all of the top 2 documents in the ranking are relevant for Query 1 and Query 2, while none of the top 2 documents in the ranking are relevant for Query 3.</p>

<h2 id="recallk">Recall@k</h2>

<p>The Recall@k score is calculated by dividing the number of relevant items in the top k portion of the ranking by the total number of relevant items <strong>in the corpus</strong>. The Recall@k score ranges from 0 to 1, with a higher score indicating a more useful ranking.</p>

<p>The formula for Recall@k is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Recall@k = (number of relevant items in top k) / (total number of relevant items)
</code></pre></div></div>

<p>For example, suppose we have a corpus containing 10 documents, and we want to calculate the Recall@5 score for a given query. If the top 5 documents in the ranking contain 3 relevant documents and there are a total of 5 relevant documents in the corpus, the Recall@5 score would be calculated as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Recall@5 = (number of relevant items in top 5) / (total number of relevant items)
         = 3 / 5
         = 0.6
</code></pre></div></div>
<p>In this example, the Recall@5 score is 0.6, which means that 60% of the relevant documents in the corpus are included in the top 5 documents of the ranking.</p>

<h3 id="example-4">Example</h3>
<p>Let’s say we want to calculate Recall@2 for each query, using a relevance threshold of 0.5. First, we need to count the number of relevant items in the top 2 items of the ranking for each query, and the total number of relevant items in the corpus.</p>

<p>For Query 1:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>number of relevant items in top 2 = 2
total number of relevant items = 2
Recall@2 = (number of relevant items in top 2) / (total number of relevant items)
= 2 / 2
= 1.0
</code></pre></div></div>

<p>For Query 2:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>number of relevant items in top 2 = 2
total number of relevant items = 2
Recall@2 = (number of relevant items in top 2) / (total number of relevant items)
= 2 / 2
= 1.0
</code></pre></div></div>
<p>For Query 3:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>number of relevant items in top 2 = 0
total number of relevant items = 1
Recall@2 = (number of relevant items in top 2) / (total number of relevant items)
= 0 / 1
= 0
</code></pre></div></div>

<p>If the total number of relevant documents in the corpus is 0, the Recall@k score is not defined. This can happen when there are no relevant documents in the corpus for a given query, or when the relevance threshold is set too high such that no documents in the corpus meet the threshold.</p>
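<p>A sketch of Recall@k in Python that makes the undefined case explicit by returning <code>None</code> when the query has no relevant documents:</p>

```python
def recall_at_k(labels, k):
    """labels: binary relevance of the ranked results (1 = relevant)."""
    total_relevant = sum(labels)
    if total_relevant == 0:
        return None  # Recall@k is undefined without relevant documents
    return sum(labels[:k]) / total_relevant


# Query 3 from the example: the only relevant document sits at rank 3.
print(recall_at_k([0, 0, 1, 0], 2))  # 0.0
print(recall_at_k([0, 0, 0, 0], 2))  # None
```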

<h2 id="r_capk---capped-recall">R_cap@k - Capped Recall</h2>

<p>The R_cap@k metric is a variant of the Recall@k metric. It measures the proportion of relevant items in the top k items of the ranking, relative to the total number of relevant items in the corpus, but caps the total number of relevant documents to k.</p>

<p>The formula for R_cap@k is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>R_cap@k = (number of relevant items in top k) / min(k, total number of relevant items)
</code></pre></div></div>

<p>Measuring Recall@k can be counterintuitive if a high number of relevant documents (&gt; k) is present within a dataset. For example, consider a hypothetical dataset with 500 relevant documents for a query. Retrieving all relevant documents would produce a maximum R@100 score of 0.2, which is quite low and unintuitive. To avoid this, the recall score (R_cap@k) is capped at k for datasets where the number of relevant documents for a query is greater than k. <sup id="fnref:beir_paper" role="doc-noteref"><a href="#fn:beir_paper" class="footnote" rel="footnote">1</a></sup></p>

<h3 id="example-5">Example:</h3>

<p>Suppose we have a corpus containing 10 documents, and we want to calculate the Recall@5 and R_cap@5 scores for a given query. If the top 10 documents in the ranking are ranked as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Rank 1: Relevant document
Rank 2: Relevant document
Rank 3: Relevant document
Rank 4: Non-relevant document
Rank 5: Non-relevant document
Rank 6: Relevant document
Rank 7: Relevant document
Rank 8: Relevant document
Rank 9: Non-relevant document
Rank 10: Relevant document
</code></pre></div></div>

<p>With a total of 7 relevant documents in the corpus, the Recall@5 and R_cap@5 scores are calculated as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Recall@5 = (number of relevant items in top 5) / (total number of relevant items)
         = 3 / 7 
         = 0.43

R_cap@5 = (number of relevant items in top 5) / min(k, total number of relevant items)
        = 3 / min(5,7)
        = 3 / 5
        = 0.60
</code></pre></div></div>
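<p>The difference between the two scores for this ranking can be reproduced in a few lines of Python:</p>

```python
# 1 = relevant, 0 = non-relevant, in ranked order (the example above).
ranking = [1, 1, 1, 0, 0, 1, 1, 1, 0, 1]
k = 5

hits = sum(ranking[:k])  # relevant items in the top k -> 3
total = sum(ranking)     # relevant items in the corpus -> 7

print(round(hits / total, 2))          # Recall@5 -> 0.43
print(round(hits / min(k, total), 2))  # R_cap@5  -> 0.6
```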

<h2 id="mrrk---mean-reciprocal-rank-at-k">MRR@k - Mean Reciprocal Rank at k</h2>

<p>Mean Reciprocal Rank (MRR@k) measures the average reciprocal rank of the first relevant item in the ranking, where the reciprocal rank of an item is defined as <code class="language-plaintext highlighter-rouge">1/rank</code>.</p>

<p>The MRR@k score is calculated by summing the reciprocal ranks of the first relevant item in the top k items of the ranking for each query, and dividing the sum by the number of queries. The MRR@k score ranges from 0 to 1, with a higher score indicating a more useful ranking.</p>

<p>The formula for MRR@k is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>MRR@k = sum(1/rank of first relevant item in top k) / number of queries
</code></pre></div></div>

<p>For example, suppose we have a corpus containing 10 documents, and we want to calculate the MRR@5 score for a set of queries. If the top 5 documents in the ranking for the first query contain the first relevant document at rank 3, and the top 5 documents in the ranking for the second query contain the first relevant document at rank 1, the MRR@5 score would be calculated as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>MRR@5 = (1/3 + 1/1) / 2
      = (0.33 + 1.00) / 2
      = 1.33 / 2
      = 0.67
</code></pre></div></div>

<p>In this example, the MRR@5 score is 0.67: the first relevant document appears at rank 3 for the first query and at rank 1 for the second, and 0.67 is the mean of their reciprocal ranks.</p>
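<p>This calculation can be sketched in a few lines of Python. This is a minimal illustration with a function name of my own choosing, not code from any particular library:</p>

```python
def mrr_at_k(rankings, k):
    """MRR@k over several queries. Each ranking is a list of booleans
    (True = relevant) in rank order; a query contributes 1/rank of its
    first relevant item within the top k, or 0 if none appears there."""
    total = 0.0
    for ranking in rankings:
        for rank, is_relevant in enumerate(ranking[:k], start=1):
            if is_relevant:
                total += 1.0 / rank
                break
    return total / len(rankings)

# The two-query example: first relevant document at rank 3 and at rank 1.
queries = [
    [False, False, True, False, False],  # first relevant item at rank 3
    [True, False, False, True, False],   # first relevant item at rank 1
]
print(round(mrr_at_k(queries, 5), 2))  # (1/3 + 1/1) / 2 = 0.67
```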

<h3 id="example-6">Example</h3>
<p>Let us again assume the threshold for a document being relevant is a score greater than or equal to 0.5.</p>

<ul>
  <li>For Query 1, the first relevant document (Document 1) is at rank 1, so the reciprocal rank is 1/1 = 1.0.</li>
  <li>For Query 2, the first relevant document (Document 2) is at rank 1, so the reciprocal rank is 1/1 = 1.0.</li>
  <li>For Query 3, the first relevant document (Document 3) is at rank 3, which lies outside the top 2, so the reciprocal rank is 0.</li>
</ul>

<p>The MRR@2 score is then calculated as the mean of the reciprocal ranks:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>MRR@2 = (1.0 + 1.0 + 0.0) / 3
      = 2.0 / 3
      = 0.67
</code></pre></div></div>

<h2 id="references">References</h2>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:beir_paper" role="doc-endnote">
      <p><a href="https://arxiv.org/abs/2104.08663">BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models</a> <a href="#fnref:beir_paper" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="Information Retrieval" /><category term="Evaluation" /><category term="Metrics" /><summary type="html"><![CDATA[A practical guide to NDCG, MAP, Precision, Recall, and MRR—the metrics used to evaluate search systems and neural retrieval models, with worked examples.]]></summary></entry><entry><title type="html">Feedback Loops are the Key Concept to build awesome Data Products</title><link href="https://fabianhertwig.com/blog/awesome-data-products/" rel="alternate" type="text/html" title="Feedback Loops are the Key Concept to build awesome Data Products" /><published>2022-01-04T14:00:00+01:00</published><updated>2022-01-04T14:00:00+01:00</updated><id>https://fabianhertwig.com/blog/awesome-data-products</id><content type="html" xml:base="https://fabianhertwig.com/blog/awesome-data-products/"><![CDATA[<p>A useful product attracts more users, who generate data that can be used to improve the product. That is the concept of the virtuous circle of AI. Tesla uses it to improve the Autopilot, Netflix to show the right movies to each user, and even startups use it to validate their idea or to train robots to sort trash. In this post I will explain the concept and show how these companies implement it.</p>

<h1 id="the-virtuous-circle-of-ai">The Virtuous Circle of AI</h1>

<p><img src="/assets/images/virtuous_cycle.png" alt="the virtuous cycle" /></p>

<p>The first time I heard of the virtuous cycle of AI was in this presentation from Andrew Ng, where he briefly describes it: short enough that you understand it, but too short to grasp its significance. If you like, watch it from 14:40 to 16:40.</p>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/NKpuX_yzdYs?start=870" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>The circle states that a useful product will attract users. These users generate data, and that data can be used to improve the product. That is a positive feedback loop: if you run it for some time, you accumulate such valuable data that your product evolves to be the best of its kind. It becomes very hard for others to reproduce your product or compete with you, because if your product is far ahead, it is hard for them to attract the users who would create the needed data. To explain this further, let us look at some companies that implement the virtuous circle.</p>

<h1 id="the-tesla-data-engine">The Tesla Data Engine</h1>

<p>The company that implements the circle top-notch is Tesla. The most famous example is the Tesla Autopilot<sup id="fnref:tesla_ai" role="doc-noteref"><a href="#fn:tesla_ai" class="footnote" rel="footnote">1</a></sup>, and there are even more circles, for example the data-driven safety program<sup id="fnref:data_driven_safety" role="doc-noteref"><a href="#fn:data_driven_safety" class="footnote" rel="footnote">2</a></sup>. Tesla’s Autopilot is a set of self-driving features that allow the car to steer itself. Right now it needs constant supervision from the driver, but the vision is that the car can drive completely on its own.</p>

<p>Tesla builds electric cars, which is a useful product by itself to many people. To make the autonomous driving features useful from the beginning, Tesla integrated the Mobileye system into their early cars before building their own self-driving system. That system powered lane keeping and traffic-aware cruise control<sup id="fnref:tesla_hw_1" role="doc-noteref"><a href="#fn:tesla_hw_1" class="footnote" rel="footnote">3</a></sup>. So Tesla has a useful product, which attracts users. Because there are millions of Tesla cars driving around, Tesla can effectively use the fleet to collect data about the Autopilot system. For example, they once observed the problem that bikes attached to the back of a car get recognized as bikes traveling along the street, when they should be recognized as part of the car. So they issued commands to the fleet to collect images of bikes on cars. The fleet sends these images back to Tesla, who can then label them correctly and use the resulting dataset to retrain their self-driving Neural Networks, i.e., improve their product. Tesla calls this system the Data Engine and presented it for the first time at the Tesla Autonomy Day, see the video below from 2:05:18 to 2:12:50. If you also want a good explanation of how Neural Networks work, start at 1:52:00.</p>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://www.youtube-nocookie.com/embed/Ucp0TTmvqOE?start=7518" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>Tesla built the data feedback loop very intentionally into the cars. The Autopilot hardware has additional computational capacity, which allows Tesla to run Neural Networks in shadow mode: they run in parallel to the networks in control, but only observe the world and the system that is in control. Tesla also uses shadow mode to deploy triggers on the fleet, for example to send back video sequences where bikes are attached to the car or where the driver had to intervene<sup id="fnref:triggers" role="doc-noteref"><a href="#fn:triggers" class="footnote" rel="footnote">4</a></sup>. Or they deploy the next version of a Neural Network in shadow mode and send back all the predictions that turned out to be wrong, for example when a model predicted that another car would cut into the lane, but it did not.</p>

<h1 id="spotify-netflix-youtube-tiktok-and-other-recommenders">Spotify, Netflix, YouTube, TikTok and other Recommenders</h1>

<p>Streaming services like Spotify, Netflix, YouTube, and TikTok made the virtuous cycle their whole business model, because their main business is not streaming media, but recommending media. If they were not able to recommend the next good song to listen to, or series or video to watch, you would probably move to another streaming service that can. And to recommend to every user what they will probably like, they use data about the preferences of other users.</p>

<p>TikTok takes being a recommender to the extreme. TikTok is a platform where users can upload short videos and remix them with music or other users’ videos. When you open the app, you land on the <em>#ForYou</em> screen and a video that TikTok recommends starts playing. The video plays on repeat until you scroll down to jump to the next one. Double-tapping the screen likes the video. Over time TikTok learns what interests you based on the videos you liked, watched to the end, watched repeatedly, or skipped. From time to time the app throws in a video outside your interests, either to challenge what it knows about you or to get a sense of what is interesting to a broader range of users<sup id="fnref:tiktok_for_you" role="doc-noteref"><a href="#fn:tiktok_for_you" class="footnote" rel="footnote">5</a></sup>. So TikTok is continuously running the virtuous cycle of AI. The app is useful because you can endlessly watch videos that entertain you, and therefore it attracts more users. These users interact with videos, and TikTok can learn which videos fit which interest group and improve its recommendation engine. This improves the product, as users get to see even more entertaining videos, and the loop continues.</p>

<p><img src="/assets/images/netflix.png" alt="Netflix Mainpage" /></p>

<p>When a user opens the Netflix main page and does not find anything intriguing within 90 seconds, she will lose interest and move on to something else<sup id="fnref:netflix_ab_testing" role="doc-noteref"><a href="#fn:netflix_ab_testing" class="footnote" rel="footnote">6</a></sup>. If so, Netflix has failed to deliver: a user wanted to watch something, but could not find anything and left. That is why Netflix personalizes the complete homepage for the user. On the page are rows of grouped videos, for example recommendations based on previously watched movies or genres. Inside the groups, the videos are ranked by how interesting a video might be to the specific user<sup id="fnref:netflix_homepage" role="doc-noteref"><a href="#fn:netflix_homepage" class="footnote" rel="footnote">7</a></sup>. Netflix even adapts the cover image of each video to the user<sup id="fnref:netflix_artwork" role="doc-noteref"><a href="#fn:netflix_artwork" class="footnote" rel="footnote">8</a></sup>. As an image says more than a thousand words, the cover image is the most important evidence for a user to decide whether a movie or show might be interesting to her. Therefore the image should show which of the user’s interests the movie could satisfy: Is there an actor the user likes, an action-loaded chase scene, a romantic relationship, or a mysterious sighting? To select a good image, Netflix sources multiple cover artworks that show different aspects of the movie. A system then learns which of these artworks is a good choice for each user by showing the different artworks to the user base and observing which image/movie/user combination led users to select the movie and watch it. Before Netflix personalized the artworks, they simply showed the same artwork to every user. To make sure that the new system leads to a better user experience, Netflix ran an A/B test. In an A/B test the users are split into groups. One is the control group, which gets to see what the current system produces, the static artworks. Then there are one or more experimental groups, where the users get to see the new system, the personalized artworks. Different metrics are tracked for each group to see if the new system improves the user experience; the metrics could be streaming hours or user retention. Netflix found that personalized artworks are a meaningful improvement and rolled them out to the whole user base.</p>
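<p>The mechanics of such a test can be sketched in a few lines of Python. This is a hypothetical, minimal setup of my own; the group names, user IDs, and numbers are made up for illustration and have nothing to do with Netflix’s actual system:</p>

```python
import hashlib

def assign_group(user_id: str, groups=("control", "personalized")) -> str:
    """Deterministically assign a user to an experiment group by hashing
    the user ID, so the same user always sees the same variant."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % len(groups)
    return groups[bucket]

# Track a metric (e.g. streaming hours) per group, then compare the groups.
streaming_hours = {"control": [], "personalized": []}
for user_id, hours in [("alice", 2.0), ("bob", 3.5), ("carol", 1.5), ("dave", 4.0)]:
    streaming_hours[assign_group(user_id)].append(hours)

for group, hours in streaming_hours.items():
    if hours:
        print(group, sum(hours) / len(hours))
```

In a real experiment, the difference between the group means would also be checked for statistical significance before rolling the change out.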

<!-- 
- Netflix greenlighting house of cards
-->

<h1 id="a-butcher-running-tests-on-customers">A Butcher Running Tests on Customers</h1>

<p>When you want to use the virtuous circle and you already have <strong>users</strong> of your product, then you simply need to become data driven. That means informing your decisions with data, for example which features to develop or remove and in which direction to move. Bernard Marr tells, in his book Big Data in Practice<sup id="fnref:_big_data_in_practice" role="doc-noteref"><a href="#fn:_big_data_in_practice" class="footnote" rel="footnote">9</a></sup>, the wonderful story of a butcher becoming data driven because of the threat of a supermarket chain that opened nearby. They installed cheap small sensors near the display window, the sandwich board, and inside the shop. These sensors can pick up smartphone signals and therefore measure how many people stopped at the window or board and, hopefully, went into the shop. That allowed them to run tests on what to display in their window and what to write on the board, and to measure how that affects customer numbers. <strong>Think about that, a butcher running A/B tests!</strong> They found out that a recipe fitting the current time of year was more effective in attracting customers than a message advertising a cheap price. They also found out that a lot of foot traffic passed their shop in the late evening, when the shop was long closed, because there were two pubs nearby. Opening the shop in the late evening to sell sandwiches to the pub dwellers turned out to be a lucrative additional business.</p>

<h1 id="buffer-testing-the-product-idea">Buffer Testing the Product Idea</h1>

<p>Buffer is a company that allows you to prepare Twitter posts and schedule them to be posted in the future. Even before they had built the <strong>product</strong>, when it was merely a good idea, they started the virtuous cycle. <strong>They created the most minimal viable product you can imagine and tested the value of their idea</strong>.</p>

<p><img src="https://buffer.com/resources/content/images/4dLL/Buffer-MVP.png" alt="Buffer MVP" /></p>

<p><sub>Source: <a href="https://buffer.com/resources/idea-to-paying-customers-in-7-weeks-how-we-did-it/">Idea to Paying Customers in 7 Weeks: How We Did It</a></sub></p>

<p>They created a landing page which described the product in a few lines of text. Next to it was a button <code class="language-plaintext highlighter-rouge">Plans and Pricing</code>. When visitors clicked the button, they landed on the next page, which stated: <em>Hello! You caught us before we are ready. If you’d like us to send you a reminder once we are ready put your email in below</em><sup id="fnref:buffer_idea_to_product" role="doc-noteref"><a href="#fn:buffer_idea_to_product" class="footnote" rel="footnote">10</a></sup>. From the number of people who clicked the <code class="language-plaintext highlighter-rouge">Plans and Pricing</code> button and left their email, they got a pretty good idea of how useful the product would be to users. To also measure the value of their idea, they added a second page after the <code class="language-plaintext highlighter-rouge">Plans and Pricing</code> button that actually showed some plans and pricing. When visitors clicked on one of the plans, they got to the page where they could leave their email. This is probably a better example of <em>Lean Startup</em> than of the virtuous cycle of AI, but I wanted to include it to show with how little you can start collecting data. Here, Buffer did not even have a product or users, but just the idea that there might be a product in the future was already useful to some people. And if visitors click through your landing page, they give you data on which you can base your decisions.</p>

<h1 id="collecting-data-to-be-able-to-train-robots">Collecting Data to be able to Train Robots</h1>

<p>To get back to an example where AI is actually the key component, let us look at AMP Robotics, who started with a unique <strong>dataset</strong>. They build robots that can sort trash to revolutionize recycling. For that, a computer vision system must be able to detect which kind of trash is on a conveyor belt so the robot arm can pick it up and throw it into the right bin.</p>

<!-- Courtesy of embedresponsively.com -->

<div class="responsive-video-container">
    <iframe src="https://player.vimeo.com/video/342840855?dnt=true" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
  </div>

<p>When they started, they set up a small demo conveyor belt at their lab. The CEO went dumpster diving on the weekends to find an assortment of bottles and cans for their dataset<sup id="fnref:amp_robotics_spotify" role="doc-noteref"><a href="#fn:amp_robotics_spotify" class="footnote" rel="footnote">11</a></sup>. Once, the janitor even thought they had thrown a big party and was about to throw away all the trash they had collected; luckily, he asked first. With the small dataset and the lab setup they were able to build a compelling demo and got to talk to recycling site operators. But they knew that the demo only really worked in their lab setup and would struggle with different lighting conditions and trash types. On site, they were allowed to record more video of the actual conveyor belts that transport trash, which they annotated to improve their system. Once they felt ready to put a trash-sorting robot arm on site, they purposefully set it up somewhere it could not cause a lot of harm, as they knew they needed to collect and label more data before their robot could detect the different types of trash accurately enough to be valuable. Their robots are connected to their cloud infrastructure and send back images. These get annotated and added to the dataset, and new computer vision models are trained, which the robots can then download and use. This is how AMP Robotics runs the virtuous cycle of AI and continuously improves the system over time.</p>

<h1 id="conclusion">Conclusion</h1>

<p>The virtuous circle of AI: a useful product attracts more users, who generate data that can be used to improve the product. Tesla uses it with their Data Engine to collect data from the fleet to train new versions of the Autopilot Neural Networks. TikTok uses it to learn which videos are most entertaining to each user. Netflix improves the user experience by showing cover art that explains to users whether a movie is interesting to them. A butcher used it to find out how to attract customers in and around their shop. Buffer started without users, data, or a product, but tested the product idea. And last, the CEO of a robotics company went dumpster diving to collect samples for their dataset. These were just a few examples, but I hope they showed a good range of applications and entry points.</p>

<p>If you want to implement the virtuous circle, think about how you can learn from your users, how you can test product improvements, what your dataset is, and whether you can improve it through user interactions.</p>

<hr />
<h2 id="references">References</h2>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:tesla_ai" role="doc-endnote">
      <p><a href="https://youtu.be/hx7BXih7zx8?t=423">YouTube: Andrej Karpathy - AI for Full-Self Driving at Tesla</a> <a href="#fnref:tesla_ai" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:data_driven_safety" role="doc-endnote">
      <p><a href="https://www.youtube.com/watch?v=9KR2N_Q8ep8">YouTube: Tesla Crash Lab Data-Driven Safety</a> <a href="#fnref:data_driven_safety" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:tesla_hw_1" role="doc-endnote">
      <p><a href="https://en.wikipedia.org/wiki/Tesla_Autopilot#Hardware_1">Wikipedia: Tesla Autopilot Hardware 1</a> <a href="#fnref:tesla_hw_1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:triggers" role="doc-endnote">
      <p><a href="https://youtu.be/g6bOwQdCJrc?t=890">YouTube: [CVPR’21 WAD] Keynote - Andrej Karpathy, Tesla</a> <a href="#fnref:triggers" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:tiktok_for_you" role="doc-endnote">
      <p><a href="https://newsroom.tiktok.com/en-us/how-tiktok-recommends-videos-for-you">How TikTok recommends videos #ForYou</a> <a href="#fnref:tiktok_for_you" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:netflix_ab_testing" role="doc-endnote">
      <p><a href="https://netflixtechblog.com/selecting-the-best-artwork-for-videos-through-a-b-testing-f6155c4595f6">Netflix Tech Blog: Selecting the best artwork for videos through A/B testing</a> <a href="#fnref:netflix_ab_testing" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:netflix_homepage" role="doc-endnote">
      <p><a href="https://netflixtechblog.com/learning-a-personalized-homepage-aa8ec670359a">Netflix Tech Blog: Learning a Personalized Homepage</a> <a href="#fnref:netflix_homepage" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:netflix_artwork" role="doc-endnote">
      <p><a href="https://netflixtechblog.com/artwork-personalization-c589f074ad76">Netflix Tech Blog: Artwork Personalization at Netflix</a> <a href="#fnref:netflix_artwork" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:_big_data_in_practice" role="doc-endnote">
      <p><a href="https://bernardmarr.com/books/">Bernard Marr: Big Data In Practice</a> <a href="#fnref:_big_data_in_practice" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:buffer_idea_to_product" role="doc-endnote">
      <p><a href="https://buffer.com/resources/idea-to-paying-customers-in-7-weeks-how-we-did-it/">Idea to Paying Customers in 7 Weeks: How We Did It</a> <a href="#fnref:buffer_idea_to_product" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:amp_robotics_spotify" role="doc-endnote">
      <p><a href="https://open.spotify.com/episode/2FzwkL7p2EJWomncgkVKVI?si=SZwqxsS_SheUX8O3yeqBRA&amp;t=1502&amp;dl_branch=1">Spotify: The Robot Brains Podcast: AMP Robotics @25:00</a> <a href="#fnref:amp_robotics_spotify" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Fabian Hertwig</name></author><category term="blog" /><category term="Machine Learning" /><category term="Data Engine" /><category term="Data Products" /><summary type="html"><![CDATA[How Tesla, Netflix, TikTok, and even a local butcher use the virtuous circle of AI—where users generate data that improves the product, creating an unstoppable competitive advantage.]]></summary></entry></feed>