AI Safety, Ethics & Society course

Chapter 5: Complex Systems

This chapter argues that AI systems, and the societies they operate within, are complex systems. This makes AI safety a wicked problem: one with no single solution, where interventions can have unintended consequences, and which requires ongoing effort rather than a one-time fix.

AI Safety is a Wicked Problem

Puzzles (like sudoku) have one correct answer with all needed information provided. Problems (like car repair) require investigation but have clear solutions once understood. Wicked problems (like climate change) have no single explanation or solution, arise in complex systems, often involve social elements, and require ongoing management rather than one-time fixes.

Why is AI safety a wicked problem? Because AI systems are complex systems.

Simple vs. Complex Systems

A simple system like a mechanical clock has parts that interact predictably and linearly. You can understand the whole by analyzing individual components and precisely predict behavior.

A complex system like an ecosystem (e.g. a forest) consists of many species (plants, animals, microbes) interacting with each other and their environment. These interactions are nonlinear: small changes, like the introduction or removal of one species, can lead to large, unpredictable effects on the whole system. Emergent behaviors arise which cannot be understood by looking at any single species in isolation.
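To make the nonlinearity concrete, here is a minimal sketch of the classic Lotka-Volterra predator-prey model (not taken from the course itself; all parameter values are illustrative):

```python
# Lotka-Volterra predator-prey dynamics, integrated with a simple
# Euler scheme. The prey * pred interaction terms are what make the
# system nonlinear. All parameter values are illustrative.

def simulate(prey, pred, steps=5000, dt=0.01,
             growth=1.0, predation=0.5, conversion=0.2, death=0.6):
    for _ in range(steps):
        d_prey = growth * prey - predation * prey * pred
        d_pred = conversion * prey * pred - death * pred
        prey += d_prey * dt
        pred += d_pred * dt
    return round(prey, 3), round(pred, 3)

# A small change in the starting predator population leads to a
# clearly different state later on.
print(simulate(prey=2.0, pred=1.0))
print(simulate(prey=2.0, pred=1.1))
```

The specific numbers don't matter; the point is that the interaction terms couple the two populations, so the effect of a change is not proportional to its size.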

Characteristics of Complex Systems

So what properties make a system complex?

Emergence. System-level behaviors arise that you can’t predict from the individual parts. For example, large models can develop abilities they were never explicitly trained for.
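A classic toy illustration of emergence (a sketch, not course material) is Conway’s Game of Life: each cell follows the same trivial local rule, yet a “glider” pattern emerges that travels across the grid, even though nothing in the rule mentions movement:

```python
# Conway's Game of Life. Each cell lives or dies based only on its
# eight neighbours, yet a "glider" emerges: a shape that moves.

from collections import Counter

def step(cells):
    """cells is a set of (x, y) live coordinates; returns the next generation."""
    neighbours = Counter(
        (x + dx, y + dy)
        for x, y in cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # birth on exactly 3 neighbours, survival on 2 or 3
    return {c for c, n in neighbours.items() if n == 3 or (n == 2 and c in cells)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
cells = set(glider)
for _ in range(4):
    cells = step(cells)
# after 4 steps the glider reappears, shifted diagonally by (1, 1)
print(cells == {(x + 1, y + 1) for x, y in glider})  # True
```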

Nonlinearity. Small changes cause disproportionately large effects. Tiny changes in AI training can completely alter system behavior.
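A standard one-line demonstration of nonlinearity (illustrative, not from the course) is the logistic map, where nudging a single parameter flips the long-run behavior from a stable fixed point to chaos:

```python
# The logistic map x -> r * x * (1 - x). Small changes in the
# parameter r change the long-run behaviour qualitatively.

def long_run(r, x=0.5, warmup=500, keep=4):
    for _ in range(warmup):
        x = r * x * (1 - x)
    out = []
    for _ in range(keep):
        x = r * x * (1 - x)
        out.append(round(x, 4))
    return out

print(long_run(2.9))  # settles to a single fixed point (~0.6552)
print(long_run(3.2))  # oscillates between two values
print(long_run(3.9))  # chaotic: never settles into a pattern
```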

Feedback loops. Components influence each other in cycles, creating ongoing effects. AI systems create such loops: they learn from user interactions, which changes their behavior, which in turn changes how users interact with them.
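A hypothetical sketch of such a loop (everything here, from the topic names to the probabilities, is invented for illustration): a recommender shows topics in proportion to estimated interest, exposure makes clicks more likely, and clicks raise the estimates:

```python
# A toy recommender feedback loop. Exposure drives clicks, clicks
# drive the interest estimates, and the estimates drive exposure.
# All topics and numbers are invented for illustration.

import random

random.seed(0)
interest = {"news": 0.34, "sports": 0.33, "music": 0.33}

for _ in range(1000):
    topics = list(interest)
    shown = random.choices(topics, weights=[interest[t] for t in topics])[0]
    # users click more on topics the system has already promoted
    if random.random() < 0.5 + 0.5 * interest[shown]:
        interest[shown] += 0.05  # reinforce what was clicked
    total = sum(interest.values())
    interest = {t: v / total for t, v in interest.items()}

print(interest)  # the near-even start typically drifts toward one topic
```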

Adaptation. Systems change based on experience. AI systems continuously learn and adjust rather than staying fixed.
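As a minimal sketch of a system that adapts (the reward probabilities are made up), here is an epsilon-greedy bandit whose behavior shifts as experience accumulates:

```python
# An epsilon-greedy bandit: behaviour is not fixed in advance but
# shifts with experience. The true reward rates are invented.

import random

random.seed(1)
true_rate = {"A": 0.3, "B": 0.7}       # hidden from the agent
estimate = {"A": 0.0, "B": 0.0}
pulls = {"A": 0, "B": 0}

for _ in range(1000):
    if random.random() < 0.1:           # explore occasionally
        arm = random.choice(list(estimate))
    else:                               # otherwise exploit the best guess
        arm = max(estimate, key=estimate.get)
    reward = 1.0 if random.random() < true_rate[arm] else 0.0
    pulls[arm] += 1
    estimate[arm] += (reward - estimate[arm]) / pulls[arm]  # running mean

print(estimate)  # approaches the true rates
print(pulls)     # the agent has learned to favour arm B
```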

Interconnectedness. Components are linked, creating dependencies. In AI, different parts like data, algorithms, hardware, and users are all connected, so problems in one area can ripple through the entire system.
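A small sketch of how a problem ripples through connected components (the component names and dependency edges here are hypothetical):

```python
# A toy dependency graph for an AI pipeline (component names are
# hypothetical). A failure in one component ripples to everything
# that depends on it, directly or indirectly.

from collections import deque

# edges point from a component to the components that depend on it
depends_on_me = {
    "data":       ["training", "evaluation"],
    "training":   ["model"],
    "model":      ["serving", "evaluation"],
    "hardware":   ["training", "serving"],
    "serving":    ["users"],
    "evaluation": [],
    "users":      [],
}

def affected_by(failure):
    """Breadth-first traversal: everything downstream of the failure."""
    seen, queue = {failure}, deque([failure])
    while queue:
        node = queue.popleft()
        for dependent in depends_on_me.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen - {failure}

print(affected_by("data"))  # a data problem reaches all the way to users
```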

Sensitivity to initial conditions. Small starting differences lead to vastly different outcomes. Minor training variations produce completely different AI behaviors.
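Using the same logistic map as above in its chaotic regime (again just an illustration): two starting points that differ by one part in a billion soon become completely uncorrelated:

```python
# Two runs of the chaotic logistic map (r = 3.9) whose starting
# points differ by 1e-9. The gap grows roughly exponentially until
# the trajectories are effectively unrelated.

def f(x, r=3.9):
    return r * x * (1 - x)

a, b = 0.5, 0.5 + 1e-9
for step in range(1, 61):
    a, b = f(a), f(b)
    if step % 10 == 0:
        print(f"step {step}: |a - b| = {abs(a - b):.1e}")
```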

Self-organization. Systems develop patterns without central control. During training, AI systems develop their own internal representations and ways of organizing information without being told how.
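A compact sketch of self-organization (illustrative, not course material) is an elementary cellular automaton: every cell applies the same local rule with no central controller, yet intricate global structure grows from a single live cell:

```python
# Rule 30, an elementary cellular automaton. Every cell applies the
# same rule to itself and its two neighbours, with no central
# controller, yet intricate global structure appears.

RULE = 30  # the update table, encoded as a byte in Wolfram's convention

def step(row):
    n = len(row)
    return [
        (RULE >> (row[(i - 1) % n] * 4 + row[i] * 2 + row[(i + 1) % n])) & 1
        for i in range(n)
    ]

row = [0] * 31
row[15] = 1                   # a single live cell in the middle
for _ in range(15):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```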

But it’s not just the technical AI system that is complex. The organizations developing AI and the policy-making bodies regulating it are themselves complex systems.

This means AI safety cannot be reduced to a technical problem: even if we solve the technical challenges, the complex social systems surrounding AI can still create risks. The entire social context around AI resists simple solutions.

What this means for AI safety

Traditional analysis fails because it assumes predictable, decomposable systems. Complex systems break these assumptions through nonlinear interactions and emergent properties.

So what does this mean?

1) Trial and error is necessary. We can’t anticipate all AI behaviors just by thinking about them. Some of the most important safety variables will likely be discovered by accident through experimentation.

2) Even if we specify an AI’s goals perfectly, it may not pursue them in practice. AI systems might break goals down into subgoals that become ends in themselves; these subgoals could then be pursued at the expense of the original goal, leading to misalignment.

3) Scaling changes everything. Safe small systems aren’t guaranteed to remain safe when scaled up due to emergent properties.

4) Incremental development is best. We’re unlikely to build large, safe AI systems from scratch. Starting with safe smaller systems and gradually building up is more likely to succeed, though it’s not a guarantee.

5) Human oversight has limits. Humans are unreliable overseers, and AI processes may happen too quickly for human oversight to be practical.

In short, AI safety is a wicked problem: it has no single solution, interventions can have unintended consequences, and it requires ongoing effort rather than a one-time fix.