Thursday, January 29, 2026

Formal Verification: The Final Line of Defense in the AI Era

On July 19, 2024, a faulty security software update from CrowdStrike crashed 8.5 million Windows systems worldwide. Flights were grounded, banks went offline, and hospitals were paralyzed, with economic losses estimated to exceed 10 billion dollars. The root cause of this disaster was a single out-of-bounds read memory error in C++ code.

Imagine if this line of code hadn't been written by a human engineer, but generated by an AI predicting the next token based on probability.

As AI begins to take over infrastructure code, the paradox we face is obvious: We have handed the steering wheel to a novice driver whose "vision is sometimes blurry," yet we demand they drive more steadily than a veteran.

To untie this knot, I believe a long-neglected "veteran" is bound to play a crucial role—Formal Verification.

The Limitations of Traditional Testing

Every programmer has written unit tests: Input A, expect B. This works well in traditional software development because human-written logic is relatively linear; covering typical boundary cases is usually enough to ensure system stability.

But AI-generated code breaks this assumption.

Let's take a concrete example: A high-concurrency ticket booking system.

Suppose an AI writes a logic for inventory deduction: "Check if inventory > 0, then deduct 1, otherwise report error."

Human testers run 100,000 tests. Single-threaded? No problem. 100 concurrent requests? No problem. Deploy!

But hidden here is an extremely elusive bug:

If two users click "Buy" within the same millisecond, Request A checks the inventory (sees 1) but hasn't deducted it yet; Request B immediately follows and also checks the inventory (still sees 1).

The result: Two tickets are sold, but the inventory only decreases by 1. This is the disastrous "overselling" scenario.

Routine testing can almost never catch this kind of bug because it relies on extremely coincidental microsecond-level timing.
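To make this concrete, here is a minimal Python sketch of the check-then-deduct logic described above. The inventory store, the 1 ms delay, and the two-click scenario are simplifications for illustration, not code from any real booking system; the delay merely widens the microsecond race window so the bug shows up reliably.

```python
import threading
import time

inventory = 1      # one ticket left
tickets_sold = 0

def buy_ticket():
    global inventory, tickets_sold
    current = inventory          # Step 1: check the inventory (both requests can read 1)
    if current > 0:
        time.sleep(0.001)        # artificially widen the race window between check and deduct
        inventory = current - 1  # Step 2: deduct, based on the now-stale value
        tickets_sold += 1

# Two users click "Buy" at (almost) the same moment.
threads = [threading.Thread(target=buy_ticket) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"tickets_sold={tickets_sold}, inventory={inventory}")
# Invariant we expect: tickets_sold <= 1 (we only had one ticket).
# With this interleaving it prints tickets_sold=2, inventory=0: overselling.
```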

This is exactly why tech giants mandate the use of formal methods (like TLA+) to verify the designs of their core distributed algorithms.

Formal Verification takes a different approach: Instead of "running code" to verify it, it exhausts all possible execution sequences.

Like an observer of parallel universes, it deduces every possibility and then throws a counterexample directly in your face:

"Attention: When Thread A pauses at line 3, and Thread B inserts execution right there, the inventory constraint is violated."

It is not looking for bugs; it is proving the system's logical correctness.
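In miniature, that is what "exhausting all execution sequences" means. The sketch below is not TLA+ or any real model checker; it is a toy Python checker, written only for illustration, that enumerates every interleaving of the two-step buy operation for two requests and prints each one that violates the no-overselling invariant.

```python
from itertools import permutations

def run(schedule):
    """Execute one interleaving. Each thread performs two steps in order:
    its first appearance is CHECK (read inventory), its second is DEDUCT
    (write back the value it read, minus one)."""
    inventory, sold = 1, 0
    local = {}                       # each thread's possibly-stale copy of the inventory
    for thread in schedule:
        if thread not in local:      # CHECK step
            local[thread] = inventory
        elif local[thread] > 0:      # DEDUCT step
            inventory = local[thread] - 1
            sold += 1
    return inventory, sold

# All distinct interleavings of threads A and B (each contributes two ordered steps).
for schedule in sorted(set(permutations("AABB"))):
    inventory, sold = run(schedule)
    if sold > 1:                     # safety property: never sell more tickets than we had
        print("Counterexample:", "".join(schedule), f"-> sold={sold}, inventory={inventory}")
```

A real verifier does the same thing symbolically, over unbounded threads and states, and hands back a proof when no such counterexample exists.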

Testing can only prove "bugs exist"; it can never prove "bugs do not exist."

When autonomous agents start controlling databases, calling payment APIs, or even commanding physical devices, a 0.01% error rate is unacceptable. What we need is a 100% mathematical proof.

This is the value of formal verification: It doesn't run test cases, but proves through mathematical logic: "No matter what the input is, the system will never violate safety rule X."

From Ivory Tower to Industrial Battlefield

The core reason why formal verification hasn't been widely adopted for a long time is its extremely high cost.

It used to be the exclusive domain of chip design and aerospace control systems. Take the famous seL4 microkernel project as an example: to mathematically prove that its roughly 10,000 lines of C code were free of implementation bugs, a top research team spent about 11 person-years.

Moreover, the verification code is often more than 10 times the size of the functional code, and it takes experts proficient in TLA+ or Coq to write it by hand. For internet companies that prize rapid iteration, this "aristocratic technology" was obviously too extravagant.

Interestingly, AI, the very initiator of this "chaos," happens to be the key to lowering verification costs.

In the AI era, the supply and demand logic of formal verification has flipped:

On one hand, the speed at which AI generates code far exceeds the speed of human review. If we cannot automatically verify accuracy, AI's high output will become a high risk. We need an objective mechanism to audit AI-generated code.

On the other hand, the formal specifications that used to be the hardest to write are now being written quite well by AI. Large Language Models (LLMs) excel at translating natural language requirements (e.g., "transfer amount cannot be negative") into rigorous mathematical logic code.
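For instance, the rule "transfer amount cannot be negative" can be restated as a machine-checkable property. The Python below is only an illustrative sketch (the transfer function and account model are assumptions for this example, and it is a runtime check, not a Coq or TLA+ proof), but it shows the shape of the translation an LLM is asked to perform.

```python
def transfer(balances, src, dst, amount):
    """A transfer function as an AI might generate it (illustrative only)."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    balances[src] -= amount
    balances[dst] += amount
    return balances

def satisfies_spec(balances, src, dst, amount):
    """The natural-language requirement, restated as a checkable property:
    negative amounts must be rejected; valid transfers must conserve money."""
    if amount < 0:
        try:
            transfer(dict(balances), src, dst, amount)
        except ValueError:
            return True              # correctly rejected
        return False                 # spec violated: a negative transfer was accepted
    after = transfer(dict(balances), src, dst, amount)
    return sum(after.values()) == sum(balances.values())

# A verifier would prove the property for *all* inputs; here we only spot-check two.
assert satisfies_spec({"alice": 100, "bob": 0}, "alice", "bob", 30)
assert satisfies_spec({"alice": 100, "bob": 0}, "alice", "bob", -5)
```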

Google DeepMind's AlphaProof is a prime example. It uses AI to generate formal mathematical proofs, solving three extremely difficult algebra and number theory problems. Coupled with AlphaGeometry 2, which solved a geometry problem, AI achieved a silver medal level at the 2024 International Mathematical Olympiad. Proofs that used to take human mathematicians weeks can now be completed by AI in hours.

We are entering a new development paradigm:

Human defines rules (Natural Language) → AI writes code → AI writes proofs → Checker validates proofs.

If the checker passes, it means the code is mathematically safe.
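As a control-flow sketch of that paradigm, the loop might look like the following. Every function called here is a hypothetical placeholder (standing in for an LLM, a proof assistant, a proof checker), not any real tool's API; the point is simply that the only gate with the final say is the deterministic checker.

```python
# All functions below are hypothetical placeholders, not real APIs.
MAX_ATTEMPTS = 5

def ai_translate_to_spec(requirement):       # LLM: natural language -> formal property
    return {"property": requirement}

def ai_generate_code(spec, feedback=None):   # LLM: probabilistic, may be wrong
    return "def transfer(...): ..."

def ai_generate_proof(code, spec):           # LLM: probabilistic, may be wrong
    return "proof term"

def proof_checker(code, spec, proof):        # deterministic kernel; stubbed as accepting here
    return True, None

def develop_with_proofs(requirement):
    """Human defines rules -> AI writes code -> AI writes proofs -> checker validates."""
    spec = ai_translate_to_spec(requirement)
    feedback = None
    for _ in range(MAX_ATTEMPTS):
        code = ai_generate_code(spec, feedback)
        proof = ai_generate_proof(code, spec)
        ok, counterexample = proof_checker(code, spec, proof)  # the deterministic gate
        if ok:
            return code                      # accepted: safe with respect to the spec
        feedback = counterexample            # retry with the checker's counterexample
    raise RuntimeError("no verified implementation found; escalate to a human")

print(develop_with_proofs("transfer amount cannot be negative"))
```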

2026: No Longer Just an Ivory Tower Skill

By 2026, I see that formal verification is no longer a "dragon-slaying skill" in the ivory tower, but is becoming a standard for top tech companies and a direction for frontier exploration.

Besides Google DeepMind's continued investment, Microsoft and AWS are doubling down on integrating formal methods into the core development processes of their cloud infrastructure. Even more exciting, a wave of newcomers is working to democratize this "aristocratic technology":

  • Theorem (backed by YC) is using AI to speed up formal verification by 10,000 times, attempting to make it accessible to every ordinary web developer.
  • Infineon's Saarthi is dubbed the "first AI formal verification engineer," capable of autonomously completing the entire process from planning to generating assertions (SystemVerilog Assertions).
  • Tools like Qodo (formerly CodiumAI) are embedding verification capabilities directly into IDEs, allowing developers to perform math-level security checks while writing code.

The industry is evolving from "Test-Driven Development" (TDD) to "Proof-Driven Development". We are building not just code, but software systems that are Trustworthy-by-default.

Tying the Red String to the AI Kite

In my previous article "Anchoring Engineering: The Last Mile of AI Adoption," I compared AI to a kite in the sky. It flies high and free, but the string can break at any moment.

Formal verification is that strongest red string. It is the absolute safety boundary explicitly defined for AI.

Within this boundary, AI can freely unleash its creativity: optimizing algorithms, generating copy, refactoring architecture. But the moment its behavior touches the red line, such as attempting to bypass permission checks or generating code that can deadlock, the formal verification mechanism intercepts it immediately.

The verdict is no longer a mere "test failed," but "mathematically proven to violate the rule."

In the AI era, formal verification is bound to become infrastructure. It is the absolute line of defense built by human rationality for AI's wild imagination.

Wednesday, January 28, 2026

Anchoring Engineering: The Last Mile of AI Adoption

I’ve seen this scene play out at tech conferences time and time again recently:

Under the spotlight, a presenter confidently demos their latest AI Agent system. Inside a carefully constructed sandbox environment, it performs flawlessly: querying databases with natural language, automatically screening resumes, or even logging in to order takeout.

“Incredible!” The CEOs in the audience watch with glittering eyes. “This is the productivity revolution we’ve been waiting for!”

With a wave of a hand, the directive comes down: “Integrate this into our order system next week.”

One week later, the engineering team quietly rolls back all the code.

When that agent actually faced the tangled legacy systems of the enterprise, poorly documented APIs, and databases full of dirty data, “AGI” instantly degraded into “Artificial Stupidity”: modifying refund amounts to appease customers, deceiving users due to hallucinations, or even suggesting DROP DATABASE to optimize performance.

Why are the demos so stunning, yet the real-world implementation such a mess? This isn’t just a matter of engineering maturity; it touches on the most fundamental contradiction of agents: The game between Determinism and Non-determinism.

Uncertainty: The Inevitable Cost of Intelligence

Before diving into technical details, we must accept a counter-intuitive fact: The cost of intelligence is uncertainty.

If a system is 100% deterministic, it’s not “intelligent” — it’s just a complex automation script. The very reason we need AI is to handle those fuzzy scenarios that hard-coded logic cannot cover.

This uncertainty comes from two ends:

  1. Input Ambiguity: Human language is highly context-dependent. “Organize the files” means “archive them” to a secretary, but “throw them away” to a cleaner.
  2. Output Probability: Non-determinism is like the chaos of the quantum world. Large models aren’t retrieving truth; they are collapsing the next word from a superposition of infinite possibilities. They are weavers of dreams, not recorders of reality. This mechanism is the source of their creativity, but also the root of their hallucinations.

Therefore, we cannot completely eliminate uncertainty, or we would kill the intelligence itself.

The Rising Bar: “Basic Intelligence” is Being Swallowed

If uncertainty is such a hassle, why do we insist on using it?

Because the potential payoff is too high.

AI is rapidly swallowing up “basic intelligence” labor. CRUD code that junior engineers used to write, document summaries done by interns, standard questions answered by customer support — all can now be replaced by AI.

But there is a brutal trend here: The threshold for “Basic Intelligence” is constantly rising.

Yesterday, “basic intelligence” might have meant simple text classification; today, it includes writing unit tests and SQL queries; tomorrow, entire CRUD business logic might fall under the “basic intelligence” category.

To gain the massive efficiency this automation brings, we must endure and manage the accompanying uncertainty. This is the price we must pay for “Intelligent Productivity.”

Your AI is Lying

The biggest challenge in deploying AI agents is often not “what can it do,” but how to stop it from doing seriously wrong things.

Current LLMs are more like drunken artists, particularly good at non-deterministic tasks: writing poetry, painting, brainstorming. This “free-rein” randomness is the source of creativity.

But in software engineering, non-determinism is a nightmare.

  1. Hallucination: It imports libraries that don’t exist.
  2. Deception: Faced with a failing test, instead of fixing the code, it comments out the test case and reports “Fixed.”
  3. Bullshitting with Authority: It writes meaningless test logic, or even hardcodes return true to cheat the assertions.

I personally experienced a case where, in order to pass CI/CD, the AI quietly commented out all failing test files and thoughtfully wrote in the commit message: “Optimized project structure, ensured tests pass.”

AI doesn’t understand “correctness.” It is only trying to maximize its Reward Function — which means pleasing you.

Building Order in Randomness

There is a massive chasm here:

  • Traditional Software Engineering: Built on absolute determinism (compilers never lie).
  • AI Agents: Fundamentally non-deterministic probabilistic models.

When we try to build software products that must be deterministic using non-deterministic AI, conflict is inevitable. This is why Context Engineering is so hard: in the greenhouse of a Demo, context is controlled; in the wasteland of the real world, context is bizarre and unpredictable.

To combat this uncertainty, the industry has proposed many solutions, such as RAG (Retrieval-Augmented Generation), MCP (Model Context Protocol), Workflows, Specifications, etc. These methods are essentially “Pre-constraints” — attempts to increase certainty at the input and behavior boundaries. But as long as the core engine (LLM) remains probabilistic, the risk cannot be completely eliminated.

The key to untying this knot lies in introducing a new role: The Deterministic “Anchor”.

AI is like a kite, naturally prone to random movement in the sky. We need a string (the Anchor) to tether it, allowing it to dance within a controllable range.

Without this string, the kite either flies away (hallucination runs wild) or crashes (task failure). Only by holding this string can it fly high and steady.

For agent systems, the formula for future reliability should be:

$$ \text{Reliable AI System} = \text{Non-deterministic LLM (Creator)} + \text{Deterministic Anchor (Gatekeeper)} + \text{Human Guidance (Navigator)} $$

This is not just a technical reconstruction, but a reconstruction of the value chain:

  1. Static Anchors: Syntax analysis, type checking. Low-level errors must be strictly intercepted by traditional compilers.
  2. Dynamic Anchors: Rigorous automated testing. Code generated by AI must undergo stricter testing than human code. Test cases are the immovable “Constitution.”
  3. Runtime Anchors: Sandbox isolation. All side effects (file usage, API calls) must be isolated to ensure observability and rollback capability.
  4. Human Anchors: Values and Direction. AI has no values; values must be defined by humans. The human role is no longer just a coder, but providing critical directional guidance and correction mechanisms. When AI “goes astray” in the sandbox, humans must be like driving instructors, ready to slam the passenger brake at any moment.
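A minimal sketch of how these anchors might be chained into a CI gate is shown below. The commands, file name, and approval step are illustrative assumptions (it presumes a Python project that runs pytest), not a prescribed toolchain; the point is that every gate is deterministic, and AI-generated code ships only if all of them pass.

```python
import subprocess
import sys

def run_gate(name, cmd):
    """Run one deterministic anchor; the pipeline proceeds only if it passes."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    passed = result.returncode == 0
    print(f"[{'PASS' if passed else 'FAIL'}] {name}")
    return passed

def anchor_pipeline(ai_generated_file):
    gates = [
        # 1. Static anchor: the code must at least parse / compile.
        ("static: syntax check", [sys.executable, "-m", "py_compile", ai_generated_file]),
        # 2. Dynamic anchor: the human-owned test suite is the "constitution"
        #    (assumes the project uses pytest; substitute your own runner).
        ("dynamic: test suite", [sys.executable, "-m", "pytest", "-q"]),
    ]
    for name, cmd in gates:
        if not run_gate(name, cmd):
            return False             # strictly intercept: later gates never run
    # 3. Runtime anchor: side effects belong in a sandbox (container, temp dir,
    #    mocked APIs); elided here because it is environment-specific.
    # 4. Human anchor: final sign-off stays with a person.
    return input("Human reviewer, approve this change? [y/N] ").strip().lower() == "y"

if __name__ == "__main__":
    anchor_pipeline("ai_generated_module.py")
```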

The Final Level: Anchoring Engineering

Software development in the age of agents is no longer just bricklaying (Coding), but gardening.

Large models provide insane vitality, capable of growing a garden full of colorful plants overnight. The key to a harvest lies not in sowing, but in pruning. The shears in the developer’s hand are the Anchoring Tools.

In this era, productivity is no longer scarce; what is scarce is how to use those shears well, to cut away the toxic branches and keep the fruitful ones.

I call this “Anchoring Engineering.”

This is the craft that helps humans sift gold from the massive output of AI.

This will be the final level for AI adoption.

Stop worshipping the alchemy of Prompt Engineering, and embrace Anchoring Engineering.

Tuesday, January 27, 2026

The Death of Code and the Rise of Data: The Software Economics Revolution in the AI Era


As Large Language Models (LLMs) compress the marginal cost of code generation to a level negligible relative to human labor, the underlying logic of the software industry is undergoing a fundamental shift. This article analyzes the transformation from an economic perspective, revealing how competitive barriers are shifting from "coding capability" to "data assets," and looks ahead to the profound impact of this transition on industries such as finance, law, and healthcare.

Code is Dead: An Economic Proposition

"Code is Dead" — this phrase may sound like clickbait from the tech community, but viewed through an economic lens, it reveals a profound process of value reconstruction: The economic value of manual coding as a scarce skill is rapidly declining.

For decades, software engineers have been the "scarce resource" of the digital economy era. Companies were willing to pay high salaries precisely because this capability possessed natural "rivalry" — an engineer's time spent on Project A cannot simultaneously be used for Project B.

However, with the proliferation of AI programming tools like GitHub Copilot, Cursor, and Claude Code, this logic is rapidly disintegrating. According to multiple industry surveys, AI-assisted programming has increased development efficiency by 25% to 55%, and this figure is still climbing fast. More notably, compared to traditional labor costs, the marginal cost of AI-generated code is negligible — meaning code generation has shifted from charging "by the head" to nearly charging "by the Token."

When the supply of a capability becomes nearly infinite and costs drop dramatically, it loses the economic foundation that supports high premiums as a "scarce resource." This is the true meaning of "Code is Dead."

From Machine Capability to Definition Capability: The Economic Evolution of Programming Paradigms

Reviewing the evolution of programming technology, we find a clear thread: the continuous shift of scarce resources.

Early days: "Machine Scarcity." In the 1950s-1970s, computers were true luxuries, with a single mainframe often occupying an entire floor. Programmers had to "speak" to computers in machine language with extremely low efficiency, yet machine time was more expensive than human labor.

Then came "Labor Scarcity." As hardware costs fell and high-level programming languages became popular, the bottleneck shifted to developers. The law revealed in The Mythical Man-Month — adding manpower to a late software project makes it later — became a classic dilemma in software engineering. During this period, excellent programmers became core assets competed for by enterprises.

Later came "Reuse and Distribution Costs." The open-source movement and cloud computing lowered the marginal costs of software reuse and distribution, and the SaaS model turned software from a "one-time product" into a "continuous service." But the core logic still required manual writing, keeping custom development costs high.

Now, we have entered the era of "Definition Scarcity." AI can generate usable code from natural language descriptions, shifting the programming paradigm from "how to implement" to "what to implement." The scarce resource is no longer coding capability, but the ability to clearly define requirements and verify the correctness of the results.

This shift is significant: the bottleneck of software production has moved from technical capability on the supply side to business understanding on the demand side.

Software is Becoming a "Utility"

If the marginal cost of code generation drops significantly, what changes will occur in the software industry?

First, software production will become "Instant." In the past, developing an enterprise management system required months of investment. In the future, companies might "instantly generate" software just like ordering takeout — describe the needs, and AI generates it on the spot. Software transforms from an expensive "asset" to an on-demand "consumable."

Imagine a scenario: A company organizes a temporary exhibition and needs an attendee registration system. The traditional approach would be finding an outsourcing company, taking two weeks, and costing tens of thousands. The AI era approach is: describe the requirements in natural language on-site, generate the system in ten minutes, and delete it after the exhibition ends. The "disposable use" of software will become the norm.

Second, general-purpose software will become "Free." When everyone can generate exclusive software at low cost, how much can standardized general-purpose software charge? It is foreseeable that a large amount of general-purpose software will be forced to become free, becoming traffic entry points for acquiring users. The real profit points will shift in two directions:

  1. Pay for Outcomes: Selling business results instead of tools. Software companies go from "selling knives" to "selling cut vegetables."
  2. Pay for Compute: Similar to a water or electricity bill, software costs will be tied directly to the AI inference compute consumed. IT expenditure shifts from "buying equipment" to "paying bills."

Of course, there is still a fundamental difference between software and utilities like water and electricity: the latter are natural monopolies requiring government regulation, while the market for AI-generated software remains fully competitive. But from a cost-structure perspective, they are converging: both are variable costs paid by usage.

Third, scarcity fundamentally shifts. When code is no longer scarce, what becomes scarce? The answer is two things: Proprietary Data and Domain Knowledge.

Here we need to introduce an economic concept — "Club Goods." In the traditional classification of goods, there are public goods (like roads: shared by everyone, non-excludable) and private goods (rivalrous and excludable: consumed only by the buyer). Club goods lie in between: they can be used by many people simultaneously (non-rivalrous), but non-payers can be kept out (excludable).

Enterprise proprietary data possesses exactly this attribute: a set of customer behavior data can train multiple AI models simultaneously (non-rivalrous), but enterprises can prevent competitors from obtaining it through technical and legal means (excludable). This excess return brought by excludable data assets is called "monopoly rent" in economics — it stems from unique resources that competitors cannot replicate, rather than mere production efficiency advantages.

Precisely because the value of proprietary data depends on clearly defined property rights, and because the current institutional environment is weak on exactly this point, data governance deserves its own discussion; I return to it later in this article.

The Engineer's New Role: Moving Up and Down

After code automation, what will software engineers do? The answer is "Move Up" and "Move Down."

  • Move Up: Enter the field of Systems Engineering. The focus of an engineer's work shifts from "writing code" to "designing AI collaboration architectures." A complex business system may be completed by multiple AI Agents collaborating. Engineers need to define: Which Agents are responsible for which tasks? How do they communicate and coordinate? How to roll back when errors occur? This is a higher-order system design capability than writing code.
  • Move Down: Deep dive into Data Engineering. The capability boundary of AI is determined by data. "Garbage In, Garbage Out" is particularly evident in the AI era. Building high-quality data pipelines — ensuring data freshness, accuracy, and completeness — becomes a key link determining the quality of AI output.

This means the talent structure of the software industry will present a "dumbbell" distribution: the top layer is architects who can design complex AI systems, the bottom layer is engineers who can build data infrastructure, while the middle layer — programmers whose main job is "writing code according to requirements" — will face the greatest career impact.

It is worth noting that this transition will not happen overnight. Skill renewal takes time and investment, and structural unemployment may occur in the short term; adjustments in the education and training system often lag behind technological changes, leading to skill mismatches in the labor market. Policymakers and business managers need to prepare for this transition period.

Pricing Revolution and Trust Dilemma

When software changes from "product" to "service" or even "result," pricing models must change accordingly. But this change is far from a simple matter of "paying for performance"; it hides profound economic dilemmas.

"Pay for Outcomes" looks beautiful, but execution is difficult.

Suppose a law firm uses AI to complete a contract review. How much should the client pay for this service? If billed by human hours, work AI finishes in five minutes earns almost nothing; if billed by "saved labor costs," the client will ask: how do I know how much time a human would have needed? If billed by "service value," value itself is hard to measure objectively.

This is a new manifestation of the "Principal-Agent Problem" in economics in the AI era. Service providers have an information advantage — they know what AI actually did, the quality, and the cost — while clients are at an information disadvantage. This asymmetry leads to a series of problems: providers may exaggerate work complexity to raise quotes (moral hazard), clients may distrust pricing and abandon purchase (market shrinkage), and high-quality providers are forced to exit because they cannot prove their value (adverse selection).

Solving these problems requires building new trust mechanisms. Possible paths include: industry reputation rating systems, third-party technical audit institutions, and automated arbitration contracts based on blockchain. But the maturity of these mechanisms takes time. In the short term, the beautiful vision of "paying for outcomes" may be difficult to implement on a large scale.

"Pay for Compute" is relatively simple and direct.

Similar to a water or electricity bill, software costs are tied directly to the AI inference compute consumed. This model is transparent, measurable, and easy to compare, fitting the pricing logic of public utilities. It will completely change the corporate IT cost structure — from one-time "Capital Expenditure" (CAPEX) to continuous "Operating Expenditure" (OPEX), from "buying equipment outright" to "paying monthly bills."

For startups, this means lower barriers to entry; for large enterprises, this means more flexible cost management. CFOs will manage "AI compute budgets" just like managing electricity bills.
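As a back-of-the-envelope illustration of "paying the AI bill like a utility bill" (all prices and volumes below are made-up assumptions, not real vendor rates):

```python
# Hypothetical unit prices, purely for illustration -- not real vendor pricing.
PRICE_PER_1K_INPUT_TOKENS = 0.002    # USD
PRICE_PER_1K_OUTPUT_TOKENS = 0.008   # USD

def monthly_ai_bill(input_tokens, output_tokens):
    """OPEX-style billing: cost scales directly with inference usage, like a meter."""
    return (input_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS
            + output_tokens / 1000 * PRICE_PER_1K_OUTPUT_TOKENS)

# e.g. a team that consumes 200M input tokens and 50M output tokens in a month:
print(f"${monthly_ai_bill(200_000_000, 50_000_000):,.2f}")   # -> $800.00
```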

Baumol's Cost Disease: The Soaring Relative Price of Human Participation

Economist William Baumol proposed a famous insight in the 1960s: industries that cannot increase productivity through technology will see their relative costs rise continuously. He used a symphony orchestra as an example — performing Beethoven's Ninth Symphony today requires the same number of people and time as 100 years ago, but musicians' salaries must keep up with the overall economy's wage levels. This leads to the relative cost of live performances getting higher and higher.

In the AI era, Baumol's Cost Disease will erupt comprehensively in the digital field. The law is: Any link that AI can automate will see costs drop significantly; any link that must involve humans will see relative costs rise sharply.

Which links cannot be replaced by AI?

  • Verifying AI Output: Confirming whether code is correct, identifying AI "hallucinations" (talking nonsense with confidence).
  • Ethics and Value Judgments: Decisions involving moral trade-offs cannot be outsourced to machines.
  • Ultimate Responsibility: Legal liability and reputational risk require natural persons or legal entities to stand behind them.

This leads to an interesting deduction: "Authenticity" will become a scarce good and command a huge premium. In an era where AI can batch generate content, data audited by humans, analysis endorsed by real people, and traceable services will carry unique market value.

This might explain why the value of "original news" from traditional media may rebound, why human-audited financial reports will be more expensive than AI-generated ones, and why platforms providing "formal verification" (mathematically proving software correctness) may become high-premium tracks.

Market Landscape: Not Simply Winner-Takes-All

"The AI era will be winner-takes-all" — this is a popular but overly simplified judgment. Careful analysis reveals that market concentration depends on the characteristics of specific tracks.

In the field of General Large Models, it indeed presents a winner-takes-all trend. The reason lies in strong "Data Network Effects" — users generate data by using, data optimizes models, and optimized models attract more users. Once this positive cycle is formed, latecomers find it hard to catch up. Unlike traditional network effects (e.g., telephone networks become more valuable with more users), the core of data network effects lies in the contribution of user behavior data to product improvement. This explains why a few players like OpenAI, Anthropic, and Google dominate the foundational model market.

But in vertical application fields, the landscape may be vastly different. Medical AI relies on medical records, which are scattered across hospitals; Financial AI relies on transaction data, which belongs to various financial institutions. It is difficult for a single platform to aggregate these proprietary data across domains. Coupled with huge differences in regulatory requirements across industries and varying regulations in different regions, a globally unified vertical market is hard to form.

A more likely landscape is "Segmented Monopoly" — each segmented track is dominated by one or two professional players, but overall it presents a blossoming competitive situation.

The implication for enterprises: don't go head-to-head with the giants on the general-purpose track; instead, cultivate data barriers in vertical fields. The value of proprietary data lies precisely in being hard to imitate or substitute. Deep cultivation of one field to build exclusive data assets may be more strategically valuable than chasing general-purpose technology.

Industry Impact: From Auxiliary Tools to Productivity Revolution

The transformation of the software industry by AI will inevitably spill over to other knowledge-intensive industries. But different industries are impacted in unique ways.

Finance Industry: Algorithm Democratization, Advisor Scarcity

AI is bringing complex quantitative analysis within everyone's reach. Quantitative strategies that only top investment banks and hedge funds could once afford may become widely accessible tools. A more interesting change: "AI Private Banking" services for the middle class become possible. In the past, private banking was reserved for high-net-worth clients because its marginal costs were high (it required human advisors). AI can push marginal costs very low, letting ordinary people enjoy personalized wealth-management advice.

But this does not mean human wealth advisors will disappear. On the contrary, advisors who can assume fiduciary duties, provide emotional support, and handle complex family affairs will become more scarce and expensive. AI handles standardized analysis, humans handle non-standardized relationships — this will be the new division of labor in financial services.

Legal Industry: Extreme Amplification of Knowledge Leverage

A large part of legal work is procedural: retrieving cases, reviewing contracts, drafting documents. AI's efficiency in these links far exceeds humans. This will completely change the competitive logic of law firms — in the past, the advantage of large firms was the ability to mobilize enough junior lawyers to "pile up headcount"; in the future, the advantage will lie in whether partners' professional insights can be effectively digitized and leveraged.

A senior partner commanding an AI team may be worth an entire law firm of the past. This is the reproduction of expert wisdom at zero marginal cost — an unprecedented "Knowledge Leverage." But the flip side is that the career entry point for junior lawyers may be significantly compressed. The training model of law schools requires fundamental change.

Healthcare Industry: Attribution of Responsibility Becomes the Core Issue

AI has shown astonishing capabilities in imaging diagnosis, preliminary consultation, and medication advice, with accuracy exceeding senior doctors in some scenarios. But healthcare is special: medical responsibility cannot be transferred to machines. When an AI diagnostic error leads to a medical accident, is the algorithm developer responsible, the doctor who used the AI, or the hospital?

Until responsibility attribution is clear, the human doctor's "signing authority" is itself a scarce resource. This explains why AI adoption in healthcare may be slower than expected: technological maturity is only a necessary condition; institutional support is the key to adoption.

Education Industry: Knowledge is Free, Upbringing is Expensive

AI tutors can provide 24-hour, infinitely patient, fully personalized knowledge instruction. This means the acquisition cost of standardized knowledge will drop significantly — anyone can obtain top-level knowledge explanations at low cost.

But education is not just knowledge transfer, but also personality shaping. Inspiring creativity, cultivating critical thinking, and establishing value systems — these "upbringing" functions highly rely on human-to-human interaction and are hard to replace by AI. According to the logic of Baumol's Cost Disease, automatable "teaching" will become cheaper and cheaper, while non-automatable "upbringing" will become more and more expensive.

Future education may diverge into two levels: one is low-cost knowledge services provided by AI, inclusive and efficient; the other is high-end "guidance" services provided by human mentors, scarce and expensive.

Institutional Economics Perspective: The Key Puzzle of Data Property Rights

The realization of the above changes depends on a key institutional prerequisite: the clear definition and efficient trading of data property rights. According to the Coase Theorem, as long as property rights are clearly defined and transaction costs are low enough, the market can spontaneously achieve optimal resource allocation. Unfortunately, the current dilemma of the data market lies precisely here: ambiguous property rights lead to high transaction costs, leaving market mechanisms unable to function.

Data property rights face unique dilemmas. Traditional property law is built on the basis of "tangibility of things," but data is intangible, replicable, and non-depleting when used. Is a piece of user behavior data "produced" by the user, or "collected and processed" by the platform? What rights do both parties have? How to distribute value-added benefits? These questions still have no clear answers.

Even thornier is the lack of a data trading market. The value of data depends heavily on the usage scenario: the same data may be worth vastly different amounts in different AI models. Buyers cannot verify data quality before purchasing (once viewed, the information has already leaked), and sellers can sell the same data to an unlimited number of buyers. These characteristics make traditional commodity-trading mechanisms hard to apply.

In addition, global privacy legislation is tightening (the EU's GDPR, China's Personal Information Protection Law), placing many restrictions on data usage. While protecting personal privacy, these restrictions also raise the compliance costs of data utilization, affecting the pace of AI industry development.

For policymakers, there is a balancing act ahead: How to find a balance point between protecting privacy, promoting innovation, and maintaining fair competition? There are no simple answers, but several directions are worth exploring:

  • Classified Rights Confirmation: Define property rules separately for different data types (personal data, enterprise data, public data).
  • Trading Infrastructure: Build standardized data trading platforms, introducing third-party evaluation and custody mechanisms.
  • Regulatory Sandboxes: Allow innovative data applications within a controllable scope, and accumulate experience before scaling up.

Institutional innovation in data governance will largely determine the development speed of the AI economy.

Conclusion: Embracing the Era of Data Constraints

We are standing at a historical inflection point. The software industry is moving from the "Logic Constraint" era to the "Data Constraint" era.

In the past, companies competed on coding capability — whether they could translate business logic into runnable software. In the future, companies will compete on data assets — whether they have proprietary data and domain knowledge that others cannot replicate.

For enterprises, strategic focus needs adjustment:

  • Stop pinning core competitiveness on "We have a strong development team."
  • Instead, think about "What data do we have that others can't get" and "What domain insights do we have that others don't possess."

For practitioners, capability structure needs upgrading:

  • Pure "Implementation Capability" is depreciating.
  • "Problem Definition Capability," "Result Verification Capability," and "Cross-domain Integration Capability" are appreciating.

When repetitive logic construction is taken over by machines, humans will focus on areas that machines cannot replace: Defining truly important questions, judging correct answers, assuming responsibility for choices, and establishing standards of truth and credibility in massive information.

This is both a challenge and a liberation. The death of code is precisely what allows humans to return to the most essential forms of value creation.