1{ 2 "id": "https://mort.io/blog/whither-ai/", 3 "title": "Whither AI?", 4 "link": "https://mort.io/blog/whither-ai/", 5 "updated": "2025-07-05T00:00:00", 6 "published": "2025-07-05T00:00:00", 7 "summary": "<p>I am hardly the first person to comment<a href=\"https://mort.io/blog/whither-ai/#1\">1</a> on this – I am given to understand\nAI has been a topic of some interest to many for a few years now. I’m sure I’ve\nseen, and possibly even <a href=\"https://mastodon.me.uk/@mort\">re-tooted</a> things about\nit in fact. I’m afraid I just don’t keep up.</p>\n<div>1\n<p>Ok fine. I admit it. This is a rant.</p>\n</div>\n<p>But recent experiences reviewing for a couple of systems/networking venues has\nled me to feel I need to ask: <strong>WHY</strong>? More pointedly, why does the following\nseem like good motivation for a research paper?</p>\n<ol>\n<li>There is a complex and important task that currently requires considerable\nexpertise to carry out because it is important to be precise and get it\nright.</li>\n<li>The task in question can be described imprecisely using natural language by\nnon-experts.</li>\n<li>AI (inevitably, some large-language model) can take that natural language\ndescription and, after training, produce some output that is stochastically\nlike unto what an expert might produce given the same underlying problem,\nhaving brought to bear their expertise.</li>\n<li>Thus we build an AI that can take the non-expert’s imprecise description and\nshow that sometimes the output it produces is not so wrong as to fail some\n<em>ad hoc</em> tests of utility that we introduce.</li>\n</ol>\n<p>Based on things I’ve recently reviewed “not so wrong” above means “error rate of\nno more than 25—30% when taking expertly generated natural language prompts as\ninput”. Which is to say, probably not the sorts of input prompt that a\nnon-expert might produce.</p>\n<p>Network configuration and management is the domain I’ve seen this argument made\nin most recently. Which seems quite strange to me because I always thought that\na 25% error rate in configuring, e.g., your enterprise network security\nperimeter would be bad. But apparently not if it’s done by an AI.</p>\n<p>More generally, why do we want to build tools that allow untrained experts to do\na job when mistakes are high impact, it requires a trained expert to detect\nthose mistakes, and those tools <em>by design</em> only produce statistically valid\noutput? An error rate of once in a blue moon is categorically worse than a zero\nerror rate if the error involved can leave your entire digital estate open to\ncompromise.</p>\n<p>If the big issue here is that experts sometimes make typos when editing the\nconfiguration files, maybe building some domain-specific languages or better\nuser interfaces or verification techniques or other tooling would be a better\nway to help them not do that than replacing them with tools that <strong>by design</strong>\nare only ever probably about right.</p>\n<p>So please stop justifying your AI application research by saying simply that it\nallows non-experts to carry out expert work! 
I’m much more likely to be\nconvinced by uses of AI that make experts <em>more productive</em> – though don’t get\nme started on how to measure productivity because I don’t know except via means\nwhich are expensive and time consuming, and it really seems that very few people\ncan be bothered doing that.</p>", 8 "content": "<p>I am hardly the first person to comment<a href=\"https://mort.io/blog/whither-ai/#1\">1</a> on this – I am given to understand\nAI has been a topic of some interest to many for a few years now. I’m sure I’ve\nseen, and possibly even <a href=\"https://mastodon.me.uk/@mort\">re-tooted</a> things about\nit in fact. I’m afraid I just don’t keep up.</p>\n<div>1\n<p>Ok fine. I admit it. This is a rant.</p>\n</div>\n<p>But recent experiences reviewing for a couple of systems/networking venues has\nled me to feel I need to ask: <strong>WHY</strong>? More pointedly, why does the following\nseem like good motivation for a research paper?</p>\n<ol>\n<li>There is a complex and important task that currently requires considerable\nexpertise to carry out because it is important to be precise and get it\nright.</li>\n<li>The task in question can be described imprecisely using natural language by\nnon-experts.</li>\n<li>AI (inevitably, some large-language model) can take that natural language\ndescription and, after training, produce some output that is stochastically\nlike unto what an expert might produce given the same underlying problem,\nhaving brought to bear their expertise.</li>\n<li>Thus we build an AI that can take the non-expert’s imprecise description and\nshow that sometimes the output it produces is not so wrong as to fail some\n<em>ad hoc</em> tests of utility that we introduce.</li>\n</ol>\n<p>Based on things I’ve recently reviewed “not so wrong” above means “error rate of\nno more than 25—30% when taking expertly generated natural language prompts as\ninput”. Which is to say, probably not the sorts of input prompt that a\nnon-expert might produce.</p>\n<p>Network configuration and management is the domain I’ve seen this argument made\nin most recently. Which seems quite strange to me because I always thought that\na 25% error rate in configuring, e.g., your enterprise network security\nperimeter would be bad. But apparently not if it’s done by an AI.</p>\n<p>More generally, why do we want to build tools that allow untrained experts to do\na job when mistakes are high impact, it requires a trained expert to detect\nthose mistakes, and those tools <em>by design</em> only produce statistically valid\noutput? An error rate of once in a blue moon is categorically worse than a zero\nerror rate if the error involved can leave your entire digital estate open to\ncompromise.</p>\n<p>If the big issue here is that experts sometimes make typos when editing the\nconfiguration files, maybe building some domain-specific languages or better\nuser interfaces or verification techniques or other tooling would be a better\nway to help them not do that than replacing them with tools that <strong>by design</strong>\nare only ever probably about right.</p>\n<p>So please stop justifying your AI application research by saying simply that it\nallows non-experts to carry out expert work! 
I’m much more likely to be\nconvinced by uses of AI that make experts <em>more productive</em> – though don’t get\nme started on how to measure productivity because I don’t know except via means\nwhich are expensive and time consuming, and it really seems that very few people\ncan be bothered doing that.</p>", 9 "content_type": "html", 10 "author": { 11 "name": "Unknown", 12 "email": null, 13 "uri": null 14 }, 15 "categories": [], 16 "source": "https://mort.io/atom.xml" 17}