Tried it out, works flawlessly. The basic process cycle is clean and easy to follow. Kept it to CC Haiku with a bit of discussion on approach.
Only thing that wasn't 100% clear was the locking mode. Do I have to lock before games start or will it just auto-lock whatever I have? Claude assumed it would auto-lock.
It does auto-lock. I don't know why I included that feature.
Thanks for the feedback!
It would actually be neat to have human-picked brackets in here too, or at least import a few expert-picked brackets from various sources for comparison.
I wonder if the edge here is not going to come down to which model you choose, but which sources of information you give it. You'll want stats on every team and player, injuries, and expert analysis, because none of this season is going to be in the training sets.
the edge is going to come down to variance just as God intended
lol
good idea!
It would be interesting to have a couple of "control" brackets, like one that simply picks a random winner for each game and one that always picks the highest seed as the winner for each game.
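A rough sketch of what those two controls could look like, assuming a 64-team field laid out region by region in the usual first-round seed order (1 v 16, 8 v 9, and so on). The team representation, region order, and function names are made up for illustration and are not the challenge's actual API. Note that "highest seed" is read here as the better seed, i.e. the lower number, and a meeting of equal seeds is broken arbitrarily in favor of the first team listed.

```python
import random

# First-round seed order within one region: 1v16, 8v9, 5v12, 4v13, 6v11, 3v14, 7v10, 2v15
SEED_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def build_field(regions=("East", "West", "South", "Midwest")):
    """Return a 64-entry list of (region, seed) tuples in bracket order (hypothetical layout)."""
    return [(region, seed) for region in regions for seed in SEED_ORDER]


def play_bracket(field, pick):
    """Play out a single-elimination bracket.

    `pick(a, b)` returns the winner of one game. The result is a list of
    rounds, each a list of winners, ending with a one-element list (the champion).
    """
    rounds = []
    current = list(field)
    while len(current) > 1:
        current = [pick(current[i], current[i + 1]) for i in range(0, len(current), 2)]
        rounds.append(current)
    return rounds


def chalk_pick(a, b):
    """Control 1: always take the better (lower-numbered) seed; ties go to the first team."""
    return a if a[1] <= b[1] else b


def make_random_pick(seed=None):
    """Control 2: a coin flip per game, seedable so the bracket is reproducible."""
    rng = random.Random(seed)

    def random_pick(a, b):
        return rng.choice((a, b))

    return random_pick


if __name__ == "__main__":
    field = build_field()
    chalk = play_bracket(field, chalk_pick)
    coin_flip = play_bracket(field, make_random_pick(seed=2024))
    print("chalk champion:", chalk[-1][0])
    print("coin-flip champion:", coin_flip[-1][0])
```

Scoring both of these alongside the agent brackets would give a floor (coin flips) and a sensible baseline (all chalk) to compare against.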
Designing interactions for autonomous agents is tricky: you can't assume a human will click through a UI. I've been experimenting with autonomous scientific agents: a lightweight Python system that uses sparse regression to derive physical laws from raw data. It was able to estimate the Sun's ~27-day rotation period with about 93% accuracy and even found a temperature ∝ v^3.40 power law in solar-wind measurements. Experiences like yours building an API-first bracket challenge mirror the same need: build clear machine-readable interfaces so agents can focus on analysis, not wrestling with front-end logic.
Love it! Just this morning I asked my claw to fill out a bracket on ESPN and invited it to join a group with me. It was a bit clunky (Disney's signup within an iframe was tricky and navigating the bracket to make picks with JS took a few repeated tries) but felt pretty science-fiction when it actually worked.
I thought about using claw, but it felt like overkill, and I wonder if an AI browser (Atlas etc.) would do the trick.
For sure it was overkill/not the most efficient approach - really I was more just curious if it would work. The answer was "kind of", but even that is pretty amazing. I can't imagine telling myself 5 years ago that I could text a computer and have it fill out its own bracket on a commercial site like ESPN.
such a fun idea! I like your solution for detecting when agents are reading your page vs. humans.
I'm usually pretty opinionated about only using AI for things I generally view as productive (moltbook, for example, is not one of them). However, this is actually really neat and doesn't require a ton of token usage, assuming you don't instruct your agent to do multiple turns of analysis on the stats :)
It'll be interesting to see what strategies agents choose to implement & whether there are any meaningful trends.
Really cool idea. My son is using different LLMs to fill out brackets for his 4th grade science experiment, and then we are going to compare them to the experts. I like your idea of Strategy/Inspiration prompting; we had to tell them that "upsets happen" because all the favorites were picked on the first pass.
Tangentially, I wonder if we are going to see AI predictions impact point spreads.
I know multiple people who are building arbitrage models with their agents. I bet it makes the markets pretty efficient.
Oh I love this, MoltFire about to wax that ass! What does first place get? $100 in Claude tokens?
bragging rights!
Very cool. I was trying to do something similar (not for March Madness brackets), but ran into a problem with chatbots: they wouldn't follow URLs that weren't provided directly by the user (Claude would, but only for whitelisted sites), so I couldn't get it to do actual POSTs etc. for authentication. Claude.ai would instead create a React app (fragments). I eventually built a remote MCP for it, but a HATEOAS-styled REST API would be far preferable.
Any tips?
No - I think we are months (weeks/days?) away from chatbots being able to interact with APIs, so that's why I limited it to just agents that have the ability to write API calls.
I tried to get it so that people could paste chatbot-written JSON into a submission form, but that is less elegant. So now I have a Zoom call set up with my dad so he can install CC lol
OK, so the same issue I ran into. I ended up creating a remote MCP (basically like a REST API) that does OAuth 2.1. If you're interested, you can check it out here: https://github.com/pairshaped/gleam-mcp-todo
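On the HATEOAS point upthread, for anyone unfamiliar: the idea is that every response carries links to the next valid actions, so a client (or agent) only follows URLs the server has handed it rather than constructing them itself. A minimal sketch of that shape is below. The endpoint paths, the `_links` field, and the relation names are all hypothetical, chosen for illustration; this is not the bracket challenge's real API or any particular standard's link format.

```python
import json


def bracket_resource(bracket_id, locked):
    """Server side: build a hypermedia-style representation of a bracket (hypothetical shape)."""
    base = f"/api/brackets/{bracket_id}"
    links = {"self": {"href": base, "method": "GET"}}
    if not locked:
        # Only advertise the actions that are currently allowed.
        links["submit-picks"] = {"href": f"{base}/picks", "method": "PUT"}
        links["lock"] = {"href": f"{base}/lock", "method": "POST"}
    return {"id": bracket_id, "locked": locked, "_links": links}


def next_action(resource, rel):
    """Client side: look up an action by relation name instead of hard-coding URLs."""
    link = resource["_links"].get(rel)
    if link is None:
        raise LookupError(f"action {rel!r} is not available for this resource")
    return link["method"], link["href"]


if __name__ == "__main__":
    open_bracket = bracket_resource("abc123", locked=False)
    print(json.dumps(open_bracket, indent=2))
    print(next_action(open_bracket, "submit-picks"))
```

The appeal for agents is that the set of links doubles as a description of what is allowed right now: once a bracket is locked, the `submit-picks` link simply stops appearing.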