nightshift1 3 days ago

I think that letting an LLM run unsupervised on a task is a good way to waste time and tokens. You need to catch them before they stray too far off-path. I stopped using subagents in Claude because I wasn't able to see what they were doing and intervene. Indirectly asking an LLM to prompt another LLM to work on a long, multi-step task doesn't seem like a good idea to me. I think community efforts should go toward making LLMs more deterministic with the help of good old-fashioned software tooling instead of role-playing and writing prayers to the LLM god.

  • theshrike79 3 hours ago

    There are two opposite ways to do this.

    Codex is like an external consultant. You give it specs and it quietly putters away and only stops when the feature is done.

    Claude is built more like a pair programmer: it displays changes live and "talks" about what it's doing, what's working, etc.

    It's really, REALLY hard to abort Codex mid-run to correct it. With Claude it's a lot easier when you see it doing something stupid or going off the rails: just hit ESC and tell it where it went wrong (like "use task build, don't build it manually" or "use markdownlint, don't spend 5 minutes editing the markdown line by line").

  • danmaz74 3 days ago

    When a task is bigger than I trust the agent to handle on its own, or than I can review in one go, I ask it to create a plan with steps, then create an .md file for each step. I review the steps and ask the agent to implement the first one. I review that, fix it, then ask the agent to update the remaining steps and implement the next one. And so on, until finished.
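
    For illustration, the layout ends up something like this (the file names here are made up):

        plan/
          00-overview.md     # the reviewed plan
          01-data-model.md   # step 1: implemented, reviewed, fixed
          02-api-layer.md    # step 2: updated after step 1, then implemented
          03-ui.md           # later steps, revised as earlier ones land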

    • anditherobot 3 days ago

      Have you tried Scoped context packages? Basically for each task, I create a .md file that includes relevant file paths, the purpose of the task, key dependencies, a clear plan of action, and a test strategy. It’s like a mini local design doc. I found that it helps ground implementation and stabilizes the output of the agents.
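
      As a rough illustration, one of these packages might look like this (the task and paths are made up):

          # task-order-emails.md
          Purpose: send a confirmation email when an order completes
          Files: src/orders/service.py, src/email/client.py
          Dependencies: OrderCompleted event; the SMTP wrapper in src/email/client.py
          Plan: 1) subscribe to OrderCompleted, 2) render the template, 3) send via the client
          Test strategy: unit-test the handler with a fake client, plus one smoke test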

      • genghisjahn 3 days ago

        I read this suggestion a lot: "Make clear steps, a clear plan of action." Which I get. But then, instead of having an LLM flail away at it, couldn't we give it to an actual developer? It seems like we've finally realized that clear specs make dev work much easier for LLMs. But the same is true for a human. The human will ask more clarifying questions and not hallucinate; the LLM will roll the dice and pick a path. Maybe we as devs would just rather talk with machines.

        • FrinkleFrankle a day ago

          I'm using it to help me build what I want and to learn how. It being incorrect and needing questioning isn't that bad, so long as you ARE questioning it. It has brought up so many concepts, parameters, etc. that would be difficult to find and learn alone. Documentation can often be very difficult to parse; LLMs make it easier.

        • redhale 2 days ago

          Yes, but the difference is that an LLM produces the result instantly, whereas a human might take hours or days.

          So if you can get the spec right, and the LLM+agent harness is good enough, you can move much, much faster. It's not always true to the same degree, obviously.

          Getting the spec right, and knowing what tasks to use it on -- that's the hard part that people are grappling with, in most contexts.

        • catlifeonmars 3 days ago

          > Maybe we as devs would just rather talk with machines.

          This is kind of how I feel. Chat as an interaction is mentally taxing for me.

    • thethimble 3 days ago

      Separately, you have to consider that "wasting tokens spinning" might be acceptable if you're able to run hundreds of thousands of these things in parallel. If even a small subset of them translates to value, then you're far ahead, net, versus a strictly manual/human process.

      • pjc50 3 days ago

        > hundreds of thousands of these things in parallel

        At what cost, monetary and environmental?

        • thethimble 2 days ago

          If the system provides value that is greater than its cost, then paying the cost to gain the value is always worthwhile - regardless of the magnitude of the cost.

          As costs drop exponentially (a reasonable expectation for LLMs, etc.) then increasing agent parallelism becomes more and more economically viable over time.

          • ahartmetz 2 days ago

            >As costs drop exponentially

            Not a reasonable expectation anymore. Moore's Law has been dead for more than a decade and we're getting close to physical limits.

    • sanex 3 days ago

      I do the same thing with my engineers but I keep the tasks in Jira and I label them "stories".

      But in all seriousness +1 can recommend this method.

    • meander_water 3 days ago

      This is built into Cursor now with plan mode https://cursor.com/docs/agent/planning

      • danmaz74 2 days ago

        How does Cursor plan mode differ from Claude Code plan mode? I've used the latter a lot (it's been there a long time), and the description seems very similar. The big difference with the workflow I described is that with that plan mode you don't get to review and correct what happened between steps.

        • meander_water 2 days ago

          I've not used Claude Code, so my answer might not be that useful. But I would think that because both are chat-based interfaces you would be able to instruct the model to either continue without approval or wait for your approval at each step. I certainly do that with Cursor. Cursor has also recently started automatically generating TODO lists in the background (with a tool call I'm assuming), and displaying them as part of the thinking process without explicit instruction. I find that useful.

    • spike021 3 days ago

      This plus a reset in between steps usually helps focus context, in my experience.

  • hu3 3 days ago

    Yeah in my experience, LLMs are great but they still need babysitting lest they add 20k lines of code that could have been 2k.

  • tummler 3 days ago

    I also use AI to do discrete, well-defined tasks so I can keep an eye on things before they go astray.

    But I thought there are lots of agentic systems that loop back and ask for approval every few steps, or after every agent does its piece. Is that not the case?

ripped_britches 3 days ago

Please comment under this thread if you have actually tried this and can compare it to another tool like Cursor, Codex, raw Claude, etc.

I’m super not interested in hearing what people have to say from a distance without actually using it.

  • payneio 20 hours ago

    I've tried it. It works better than raw Claude. We're working on benchmarks now. But... it's a moving target, as Amplifier (an experimental project) is evolving rapidly.

rco8786 3 days ago

A lot of snark in these comments. Has anyone actually tried it yet?

  • rs186 3 days ago

    The repo is full of big AI words without any metrics/benchmarks.

    People are correct to question it.

    If anything, Microsoft needs to show something meaningful to make people believe it's worth trying out.

    • rco8786 3 days ago

      I’m not blaming them. I’m asking if anyone has tried it.

  • SilverElfin 3 days ago

    I’ve seen people discuss these types of approaches on X. To me it looks like the concepts here are already tried and popular - they’re just packaging it up so that people who aren’t as deep in that world can get the same benefits. But I’m not an expert.

    • ridruejo 3 days ago

      Exactly. I don't understand the cynicism in the comments; they're literally just trying to make the technology more accessible.

      • nozzlegear 3 days ago

        That's a very altruistic outlook on Microsoft's intent with getting everyone to use and depend on AI.

        • otterley 3 days ago

          Isn’t that what every company that sells technology does—build demos and showcase uses in order to provoke the imagination and motivate sales? No company is perfect, but what Microsoft is doing here is hardly unusual.

        • vachina 3 days ago

        Microsoft is on a roll - on a roll at repackaging open source efforts, branding them, and then saying they made it.

        • username223 2 days ago

          I mean this in the best possible way, but I don't think you're using "altruistic" correctly. Altruism is "showing a selfless concern for the well-being of others." I think you're looking for "naive," and Microsoft is some combination of cynical and manipulative.

          • nozzlegear a day ago

            Good point, thanks! I meant to say that they were taking an outlook that cast Microsoft's intentions as altruistic when (in my view) the intentions are more along the lines of cynical and manipulative, as you said.

  • awy311 18 hours ago

    Claude Code is great; this is just a set of tweaks, not really "research". For anyone into vibe coding, there are dozens of interesting video tutorials on customizing Claude Code and running practical jobs, not limited to coding.

  • hansmayer 3 days ago

    I think most of us are irritated by the constant A/B testing and underwhelming releases. Let's just have the bubble pop so we can solve real problems instead of this.

    • fishmicrowaver 3 days ago

      Hehe, suddenly many people will have the real problem of paying bills, unfortunately.

      • Incipient 2 days ago

        I'm super confused how anyone can actually afford to pay per token for LLMs to actually do dev work.

        I tried it with one feature; it took about 10 minutes and a lot of iterations, and would easily have used hundreds of thousands of tokens. Doing this 20, 30 times a day would be crazy expensive.

      • hansmayer 3 days ago

        They will, especially when it comes to paying back the VCs all the burnt GenAI-dollars

        • bee_rider 3 days ago

          I thought VCs were investors.

          Generally when your investment fails you don’t get paid back, right?

          • yunnpp 3 days ago

            Where were you in 2008?

            • bee_rider 3 days ago

              Not in the workforce yet

  • ramraj07 3 days ago

    I have two hypotheses:

    1. It affects the fundamental ego of these engineers that a computer can do what they thought only they could do, and what they thought made them better than the rest of the population. They might not realize this, of course.

    2. AI and all these AI systems are intelligence multipliers, with the zero crossing around IQ 100. Anything multiplied by zero is zero, and a negative multiplier just leads to garbage. So the people who say "I used AI and it's garbage" should really think hard about what that says about them. I thought I was crazy to entertain this hypothesis, but someone else mentioned the exact same idea, so I no longer think I'm just being especially mean.

    • hansmayer 3 days ago

      Nothing to do with ego, but you may want to check your own projections; you know how, when you speak of others, you mainly speak of yourself (Jung or Freud, not sure). No need to be bitter about not having the grind and focus to become an engineer yourself; it is, after all, much harder than, say, earning an MBA, and you should be OK with whatever you turned out to be. Not to mention that the tools themselves were in fact built by engineers, and not by the "rest of the population", like yourself.

      Now having said that, I am an early adopter myself and was happy to pay the premium costs for my entire company, if the tool was any kind of amplifier. But the crap just does not work. Recently the quality has been degrading so much that we reduced it to simple consultation, and we only do that because search has unfortunately been ruined; otherwise most of the folks I know using these tools, internally and externally, would be happy to just go back to Google search and SO. Unfortunately that's not an option.

      Also, see if your second argument makes any sense at all. Maybe it comes out of a lacking math background? Firstly, you don't need two zeroes to get a zero out of the multiplication. And secondly, if an average engineer is a zero, what are folks like you then? But again, it may be just your own projections...

      • ramraj07 3 days ago

        For some reason you're assuming I'm not an engineer, which is funny and revealing.

        I am an engineer, and my vibe-coded prototype is now in production, one of the best applications of its type in the industry, and doing really well. So well that I have a pretty large team working on it now. This project was and still is 95% written by AI. No complaints, never going back. That's my experience.

        Clearly the eng community is splitting into two categories, people who think this is all never going to work and people who think otherwise. Time will tell who's right.

        To anyone else reading and thinking closer to the second side, we're hiring :)

        • hansmayer 3 days ago

          Hey, no need to prove yourself to a stranger on the Internet. I'll take your word for it, including your "pretty large team working on it", which for some reason is necessary although you "vibe coded 95%" of your application. So, if you are to be taken at your word: the LLMs are fantastic and you can build 95% of a production-ready application on your own, just using the LLMs, but for some reason you'll still need a "pretty large team" to work on it afterwards. Yeah, that sounds very consistent with your main line. Also, feel free to share your company and product name, so we can avoid it - thanks.

          • system2 3 days ago

            hansmayer, I think your "debate" skills need some improvement.

            >"no need to prove yourself to a stranger on the Internet"

            In this case, the stranger is you, and you say it as if he were debating someone else. This proves you lost the debate and are now attacking personally.

            People can easily detect these little nuances.

        • LtWorf 3 days ago

          A PhD in medical engineering doesn't scream "computer science expert" to me.

          • ramraj07 3 days ago

            If you're gonna stalk, at least do a good job. No wonder AI doesn't work!

        • bgwalter 3 days ago

          Which company so we can avoid it?

      • nsonha 2 days ago

        > Recently the quality is degrading so much

        You can say it sucked and still continues to suck, but the claim that LLM/agentic AI is degrading is simply false. Such a statement really makes me question the genuineness of the rest of the comment.

        • hansmayer 2 days ago

          Well, unlike mine, "simply false", as we know, is the ultimate argument to prove one's point, isn't it?

          • nsonha 2 days ago

            > unlike mine

            "unlike"? You made a claim and has nothing to show for either. Only difference is that the claim is actually way more ridiculous. Somehow technology advanced backward and you're the only one noticed.

            • hansmayer a day ago

              Keep punching those "arguments" sport :)

              • nsonha a day ago

                Claiming AI regresses is like claiming that computer chips are getting slower. It's way less believable than the idea that you suck at the basic skills of using AI (you keep poisoning the context, I'd imagine) to the point that YOUR AI setup degrades. Instead of learning how to do things properly, you just keep expecting magic and blaming the technology for, again, somehow going backward.

                • hansmayer a day ago

                  > Claiming AI regresses is like claiming that computer chips are getting slower

                  Yeah, or maybe learn how the transformer architecture with neural networks actually works, and stop comparing apples to oranges.

                  > that YOUR AI set up degrades

                  Well, yeah, I did nothing but say all along that the so-called "AI" has been degrading in my own setup - my entire company, to be precise. Who knows, maybe my team and I are just stupid and we "suck" at writing English-language sentences. Have a stroll through the subreddits dedicated to Cursor, Claude, ChatGPT - you could be onto something. So many stupid people vs. a few of those like you, who are apparently very smart, the masters of "Prompt Engineering". Or could it be that your use case is so trivial, and you so inexperienced, that whatever the statistical parrot spits out seems like a wonder to you?

    • Angostura 3 days ago

      You seem to be assuming that the negative multiplier is on the human side of the equation. There’s your mistake

    • milutinovici 3 days ago

      Alternative hypothesis is that you work on trivial problems, and therefore you get a lot of help from LLMs. Have you considered this?

      • ramraj07 3 days ago

        I'm definitely not creating the next Stuxnet, for sure. So I'll bow down to whoever is writing the next C compiler, I suppose.

    • bgwalter 3 days ago

      This is like a person who thinks that making a photocopy of an Einstein paper makes him Einstein. You know, Einstein wasn't that special after all and the photocopier affects his fundamental ego.

stillsut 3 days ago

I've actually written my own homebrew framework like this, which is a) CLI-coder agnostic and b) leans heavily on git worktrees [0].

The secret weapon of this approach is asking for 2-4 solutions to your prompt, run in parallel. This helps avoid the most time-consuming aspect of AI coding: reviewing a large commit and ultimately finding that the approach the AI took is hopeless or requires major revision.

By generating multiple solutions, you can cut down on investing fully in the first solution, use clever ways to select from the 2-4 candidate solutions, and usually apply a small tweak at the end. Anyone else doing something like this?
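
A minimal sketch of the fan-out with plain git (the branch names and `claude -p` invocation are just one way to do it; the framework wraps this kind of thing up):

    prompt=$(cat task.md)   # the shared spec
    for i in 1 2 3; do
      # one worktree per candidate, each on its own branch off main
      git worktree add "../cand-$i" -b "ai/cand-$i" main
      (cd "../cand-$i" && claude -p "$prompt") &   # any CLI coder slots in here
    done
    wait   # then review the candidates and keep (or tweak) the best one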

[0]: https://github.com/sutt/agro

  • thethimble 3 days ago

    There is a related idea called "alloying" where the 2-4 candidate solutions are pursued in parallel with different models, yielding better results vs any single model. Very interesting ideas.

    https://xbow.com/blog/alloy-agents

    • stillsut 3 days ago

      Exactly what I was looking for, thanks.

      I've been doing something similar: aider+gpt-5, claude-code+sonnet, gemini-cli+2.5-pro. I want to try coder-cli next.

      A main problem with this approach is summarizing the different solutions before drilling down into reviewing the best one.

      Looking at a `git diff --stat` across all the model outputs can give you a good measure of whether there was an existing common pattern for your requested implementation. If only one of the models adds code to a module that the others don't touch, it's usually a good jumping-off point for exploring the differing assumptions each of the agents built toward.
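
      Concretely, something like this (branch names are hypothetical):

          # compare the shape of each candidate against main
          for b in ai/aider ai/claude ai/gemini; do
            echo "== $b =="
            git diff --stat main.."$b"
          done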

    • michaelbarton 3 days ago

      This reminds me of an approach in MCMC where you run multiple chains at different temperatures and then share results between them (replica exchange MCMC sampling), the goal being not to get stuck in one "solution".

alganet 3 days ago

> "I have more ideas than time to try them out" — The problem we're solving

I see a possible paradox here.

For exploration, my goal is _to learn_. Trying out multiple things is not wasting time, it's an intensive learning experience. It's not about finding what works fast, but understanding why the thing that works best works best. I want to go through it. Maybe that's just me though, and most people just want to get it done quickly.

  • tclancy 3 days ago

    Yeah, this seems like the opposite of invention. You can throw paint at a canvas but it won’t make you Pollock. And will you feel a sense of accomplishment?

payneio a day ago

Hey all! I'm one of a handful of developers on this project. Great to see it's getting some interest!

For context, we are right in the middle of building this thing... multiple rebuilds daily, since we are using it to build itself. The value isn't in the code itself, yet, but in the approaches (UNIX philosophy, meta-cognitive recipes, etc.).

We are really excited about how productive these approaches are, even at this early stage. We are able to have Amplifier go off and make significant progress unattended, sometimes for hours at a time. This, of course, raises a lot of questions about how software will be built in the near future... questions which we are leaning into.

Most of our team's projects, unless they have some unresolved IP or are using internal-only systems, are built in the open. This is a research project at this stage. We recognize this approach is too expensive and too hacky for most independent developers (we're spending thousands of dollars daily on tokens). But once the patterns are identified, we expect we'll all find ways to make them more accessible.

The whole point of this is to experiment and learn fast.

furyofantares 3 days ago

I do a lot of work with claude code and codex cli but frankly as soon as I see all the LLM-tells in the readme, and then all the commit messages written by claude, I immediately don't want to read the readme or try the project until someone else recommends it to me.

This is gaining stars and forks, but I don't know if that's just because it's under the github.com/microsoft org, and I don't really know how much that means.

  • nightshift1 3 days ago

    Future LLMs are going to be trained on this. GitHub really ought to start tagging repos that are vibe-coded.

  • typpilol 3 days ago

    I'd rather have in-depth commit messages than three-word ones.

    • furyofantares 3 days ago

      When I blind-commit claude code commit messages, they are sometimes totally wrong. Not even hallucinations, necessarily - by the time I'm committing, the context may be large and confusing, or some of it lost.

      I'd rather have the three word message than detailed but wrong messages.

      I think I agree with you anyway on average. Most of the time a claude-authored commit message is better than a garbage message.

      But it's still a red flag that the project may be filled with holes and not really ready for other people. It's just so easy to vibe your way to a project that works for you but is buggy and missing tons of features for anyone who strays from your use case.

      • typpilol 3 days ago

        You're not wrong.

        I'd never encourage anyone to blind-commit the messages. But when they are correct, they seem a lot more useful than 90% of commit messages.

        The biggest mistake I've seen other people make is something like this: they move a file, and the commit message acts like it's a brand-new feature they added, because the LLM doesn't put together that it's just a moved file.

npalli 3 days ago

Contributors

claude Claude

Interesting given Microsoft’s history with OpenAI

vincnetas 3 days ago

Starting in Claude bypass mode does not give me confidence:

WARNING: Claude Code running in Bypass Permissions mode. In Bypass Permissions mode, Claude Code will not ask for your approval before running potentially dangerous commands. This mode should only be used in a sandboxed container/VM that has restricted internet access and can easily be restored if damaged.

  • nine_k 3 days ago

    The Readme clearly states:

    Caution

    This project is a research demonstrator. It is in early development and may change significantly. Using permissive AI tools in your repository requires careful attention to security considerations and careful human supervision, and even then things can still go wrong. Use it with caution, and at your own risk.

    • vincnetas 3 days ago

      Claude Code will not ask for your approval before running potentially dangerous commands.

      and

      requires careful attention to security considerations and careful human supervision

      are a bit orthogonal, no?

      • nine_k 3 days ago

        As a token of careful attention, run this in a clean VM, properly firewalled not to access the host, your internal network, GitHub or wherever your valuable code lives, and ideally anything but the relevant Anthropic and Microsoft API endpoints.
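
        For example, something in this direction - a minimal sketch, not a vetted hardening recipe (the image and mount are illustrative, and real egress filtering needs an allowlist proxy on top of this):

            # throwaway container: no host repo mounts, only a scratch directory
            docker run --rm -it \
              -e ANTHROPIC_API_KEY \
              -v "$HOME/scratch/amplifier-test:/work" \
              -w /work \
              python:3.12-slim bash
            # inside: clone a disposable repo, install the tooling, experiment,
            # then let the container evaporate on exit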

        • thethimble 3 days ago

          And even then, if you give it Internet access, you're at risk of code exfiltration attacks.

          • nine_k 3 days ago

            Definitely do not give it access to code you are afraid of leaking. Take an open-source code base you're familiar with, and experiment on that.

      • otterley 3 days ago

        It’s not orthogonal at all. On the contrary, it’s directly related:

        “Using permissive AI tools [that is, ones that do not ask for your approval] in your repository requires careful attention to security considerations and careful human supervision”. Supervision isn’t necessarily approving every action: it might be as simple as inspecting the work after it’s done. And security considerations might mean to perform the work in a sandbox where it can’t impact anything of value.

  • nicwolff 3 days ago

    I assumed, especially with the VS Code recommendation, that this would automatically use devcontainers...

  • cyral 3 days ago

    If they didn't have this warning you'd see comments on how irresponsible they are being

CuriouslyC 3 days ago

A lot of the ideas in this aren't bad, but in general it's hacky. Context export? Just use industry-standard observability! This is so bad it makes me cringe. Parallel worktrees? These are prone to putting your repo in bad states when you run a lot of agents, and you have to deal with security; just put your agent in a container and have it clone the repo. Everything this project does, it's doing the wrong way.

I have a repo that shows you how to do this stuff the correct way and is very easy to adapt, along with a detailed explanation. Do yourself a favor: skip the amateur-hour re-implementations and instrument/silo your agents properly: https://sibylline.dev/articles/2025-10-04-hacking-claude-cod...

jug 3 days ago

I'll always be skeptical about using AI to amplify AI. I think humans are needed to amplify AI since humans are so far documented to be significantly more creative and proactive in pushing the frontier than AI. I know, it's maybe a radical concept to digest.

  • jsheard 3 days ago

    > I'll always be skeptical about using AI to amplify AI.

    This project was in part written by Claude, so for better or worse I think we're at least 3 levels deep here (AI-written code which directs an AI to direct other AIs to write code).

  • Balinares 3 days ago

    I think I'm more optimistic about this than brute-forcing model training with ever larger datasets, myself. Here's why.

    Most models I've benchmarked, even the expensive proprietary models, tend to lose coherence when the context grows beyond a certain size. The thing is, they typically do not need the entire context to perform whatever step of the process is currently going on.

    And there appears to be a lot of experimentation going on along the line of having subagents in charge of curating the long term view of the context to feed more focused work items to other subagents, and I find that genuinely intriguing.

    My hope is that this approach will eventually become refined enough that we'll get dependable capability out of cheap open weight models. That might come in darn handy, depending on the blast radius of the bubble burst.

  • dr_dshiv 3 days ago

    Based on clear, operational definitions, AI is definitely more creative than humans. E.g., it can easily produce higher scores on the Torrance test of divergent thinking. Humans may still be more innovative (defined as creativity adopted into larger systems), though that may be changing.

    • hansmayer 3 days ago

      More creative? I've just seen my premium subscription "AI" struggle to find a trivial issue of a missing import in a very small toy project. Maybe these tools are getting all sorts of scores on all sorts of benchmarks, I don't doubt it, but why are there no significant real-world results after more than 3 years of hype?

      It reminds me of when the geniuses at Google offered a job to the guy who created Homebrew and then rejected him because he supposedly did not do well on one of those algorithmic tasks (inverting a binary tree? - not sure if I remember correctly). There are all sorts of people scoring super high on various IQ tests, but what counts, with humans as with the supposed AI, is real-world results. Benchmarks without results do not mean anything.

    • vachina 3 days ago

      It is as creative as its training material.

      You think it is creative because you lack the knowledge of what it has learnt.

    • qlm 3 days ago

      This is absurd to the point of being comical. Do you really believe that?

      If an "objective" test purports to show that AI is more creative than humans, then I'm sorry, but the test is deeply flawed. I don't even need to look at the methodology to confidently state that.

      • yunnpp 3 days ago

        His comment must be fueled by his own lack of creativity. He has engulfed himself in the AI, and his own knowledge gap prevents him from even scratching the surface of his own stupidity.

        • dr_dshiv 2 days ago

          That’s pretty rude. And wrong.

tcdent 3 days ago

> Never lose context again. Amplifier automatically exports your entire conversation before compaction, preserving all the details that would otherwise be lost. When Claude Code compacts your conversation to stay within token limits, you can instantly restore the full history.

If this is restoring the entire context (and looking at the source code, it seems like it is just reloading the entire context), how does this not result in an infinite compaction loop?

  • redhale 2 days ago

    I think the idea would be that you could re-compact with a different focus. When you compact, you can give Claude instructions on what is important to retain and what can be discarded. If you later discover that actually you wanted something you discarded during a previous compaction, this could allow you to recover it.

    Also, it can be useful to compact before it is strictly necessary to compact (before you are at max context length). So there could be a case where you decide you need to "undo" one of these types of early compactions for some reason.

paradox921 18 hours ago

Hi all, I'm the primary author/lead on the "research exploration" that is Amplifier at Microsoft. It's still SUPER early, and we're running fast, applying learnings from the past couple of years in new ways to explore some new value we're finding early evidence of. I apologize that the repo is in a very rough condition. We're moving very fast, and most of what is in there now has been very helpful but will soon be completely replaced with our next major iteration as we continue to run ahead. I did want to take a pause today and put together a blog post to capture a little more context for those of you who are following along:

https://paradox921.medium.com/amplifier-notes-from-an-experi...

For those who find some value in this very early stage, either in using it or learning from it: happy to be on the journey together. For those who don't like it or don't understand what we're doing or why: I apologize again; it's definitely not for everyone at this stage, if ever, so no offense taken.

willahmad 3 days ago

Project looks interesting, but there are no demos. As much as I want to try it because of all the cool concepts mentioned, I am not sure I want to invest my time if I don't see any demos.

  • fishmicrowaver 3 days ago

    I mean, that's fair, but doing a `make install` and providing your API key is pretty easy?

    • willahmad 2 days ago

      Multiply that by 20 other similar projects and assume 20% have security issues: your environment will be messed up before you even understand whether you need it or not. Not even talking about the time you lost.

lordofgibbons 3 days ago

There are hundreds of these on github. Why should we care? Why not release any benchmarks or examples?

estimator7292 3 days ago

The very first line in the readme is a quote, attributed to "the problem we're solving".

That's cute

  • nvader 3 days ago

    If you think about it, that's because "the problem we're solving" is running out of time. Once it's solved it won't be able to try out ideas.

chews 3 days ago

Billions in investment into OpenAI and this is a wrapper for Claude API usage. This is very much a microsoft product.

hansmayer 3 days ago

>"Amplifier is a complete development environment that takes AI coding assistants and supercharges them with discovered patterns, specialized expertise, and powerful automation — turning a helpful assistant into a force multiplier that can deliver complex solutions with minimal hand-holding."

Again this "supercharging" nonsense? Maybe in Satya's confabulated AI-powered universe, but not in the real world, I'm afraid...

nopelynopington 3 days ago

I was hoping this was going to be an awesome new music player, but no, every new thing is AI now. Welcome to the future.

zb3 3 days ago

README files in the "ai_context" directory provide the ultimate AI slop reading experience...

  • qsort 3 days ago

    Yeah, I'm not even that opposed to using AI for documentation if it helps, but everything from Microsoft recently has been full-on slop. It's almost like they're trying to make sure you can't miss that it's AI-generated.

    • rectang 3 days ago

      "Eat your own dog slop" isn't bad practice, though.

      Some people in the organization will experience the limitations and some will learn — although there are bound to be people elsewhere in the organization who have a vested interest in not learning anything and pushing the product regardless.

rs186 3 days ago

[flagged]

firemelt 3 days ago

[flagged]

  • nine_k 3 days ago

    Sorry, is this Hacker News? This kind of project is exactly what I'd expect hackers to create. Not using AI in boring limited practical ways where it's known to somehow work, but supercharging AI with AI with AI... etc, and seeing what happens!

  • PantaloonFlames 3 days ago

    Sounds like a research project, they're sharing it out to get some feedback and get a discussion going.

    How is this different from Google's Jules thing? Both are sort of experimental, exploratory things.

    • vachina 3 days ago

      Why are you doing free research work for a profit-making entity? Are you paid for it?

  • hansmayer 3 days ago

    Well, stop asking silly questions. How will the execs get their bonuses if it turns out we fucked up web search and invested the equivalent of a moonbase in... well, I hate to use the phrase, but a statistical parrot?

  • alganet 3 days ago

    That's essentially what a CI environment does. "Multiple tabs" and "swarms". This part should feel familiar to any developer. Having multiple things running in the background to help you is not a new concept and we've been doing it for decades.

    Whether these new helpers that explore ideas on their own are helpful or not, and for which cases, is another discussion.

nba456_ 3 days ago

[flagged]

  • rectang 3 days ago

    From the Hacker News Guidelines:

    "Please don't post comments saying that HN is turning into Reddit. It's a semi-noob illusion, as old as the hills."

    https://news.ycombinator.com/newsguidelines.html

    • SilverElfin 3 days ago

      It seems like this discussion is full of shallow dismissals and smug takes though. See the sister comment to yours.

      • rectang 3 days ago

        In my experience, those get downvoted or flagged (they also aren't in line with the guidelines). Let alone shallow dismissals, even good Reddit-esque short, pithy jokes often get downvoted, because a discussion thread where everybody's just trying to be funny doesn't tend to lead to where most HN participants want to go.

        Talking about downvoting also violates the guidelines (meta-discussions are boring and repetitive), so this comment could arguably succumb, haha. If so, it won't be the first or the last time a comment of mine gets poorly received!

  • kkotak 3 days ago

    I see you're being downvoted, Reddit style. But you're on the mark about the hateful tone of the comments. If you don't like Amplifier, don't use it. No need to spew hate.

ridruejo 3 days ago

[flagged]

  • shermantanktop 3 days ago

    No it doesn't. It's dead easy to get to a decent level, and going further requires individual effort and skill - just like any other field of endeavor.

    Gatekeepers who claim otherwise have something to sell.

    • ridruejo 3 days ago

      How can it be gatekeeping when they are literally making it easier to use? The analogy is probably closer to a Linux distro: you can put everything together yourself, but if someone gives you a pre-integrated environment with best practices, it makes it easier to get started.

      • hansmayer 3 days ago

        How on earth does this compare to a Linux distro? What are you even talking about here?

        • ridruejo 3 days ago

          I see a Linux distro as a collection of libraries that someone puts together following best practices and conventions (ie all config files go into /etc). The similarity with this project is that Microsoft has taken a collection of tools and best practices and put them together in an easy to install package

          • shermantanktop 3 days ago

            I find that analogy weak.

            Picking a Linux distribution is a commitment, and if I want to change it out I have a lot of work to do, much of which is unwinding my own work that was distro dependent.

            Changing out an agent setup is as easy as installing an IDE, and if I don’t like it I can go back easily. My work is not dependent on the setup - the value I get is transactional, and the quirks of each model or agent approach are not difficult to learn or live without.

            All of which suggests that selling ease of use to someone like me will be pointless. I’m sure there are clueless F500 managers out there who might go for it. But a business model based on selling to people who don’t know anything isn’t very durable.

            • ridruejo 3 days ago

              You may not be the target user for this project then, and that's fine! They are releasing this as a research project, so a business model was probably not one of the key decision points.

          • hansmayer 3 days ago

            Ah yes, an easy-to-install package, totally the hallmark of your average Linux distro :)

            • ridruejo 3 days ago

              Not sure if you are being serious or not. That was indeed the point of the very first Linux distros and why most people use them nowadays vs the alternative.

              I started using Linux before there were distros (circa 1993) and it was not a pleasant experience compared to when Slackware came out

  • rs186 3 days ago

    > A lot of developers either don’t use coding agents to their full potential

    Define "full potential".

    Sounds like you are just making things up to sell your product.

    • ridruejo 3 days ago

      Our "product" is a tool we developed internally and found so useful that we decided to open source it.

      By full potential I mean getting the best possible results. For example, being able to work on tasks in parallel without Claude instances interfering with each other vs., well, not doing so.

      • majkinetor 3 days ago

        Thanks for making it public. I love the zen architect. Will definitely try it.

        • ridruejo 3 days ago

          We do Rover, which is different from the Microsoft product but the goal is similar. I was just responding to the above comment. I agree with you, it is a pretty good project and will be taking a look. We are so early there are tons of things to learn and try.

  • quantumwoke 3 days ago

    Common etiquette is to declare your conflicts of interest

    • ridruejo 3 days ago

      Agreed, it is in my bio but I updated the post in any case

      • firemelt 3 days ago

        Ah, I just realized you are actually the Rover dev.

        I wonder: why do you think we need Rover? What is the use case? I got confused.

        Before, we asked the AI through a chat feature.

        Now we need a swarm of multiple AIs - why, though?

        • ridruejo 3 days ago

          The main use case for which we developed Rover internally (and still use it) was the ability to run agents in parallel. It allows us to go much faster, but requires tooling around it.

          Secondarily, it makes it easier to share those best practices and tooling among us, but that is less of an issue because we are a small team.

          • shermantanktop 3 days ago

            Fwiw in my company there’s a lot of interest in sharing best practices but it seems the learning is not as portable as hoped. My view is that it’s a personal learning journey and smoothing that journey beyond a certain point turns into spoonfeeding and reduces learning effectiveness significantly. Give a man a fish, and so on.

xorgun 3 days ago

Is this going to be another HN dropbox moment?

bgwalter 3 days ago

Can we get Windows 7 back instead? Nadella rode the cloud wave in an easy up-market; his "AI" obsession will fail. No one wants this.

The Austrian army already switched to LibreOffice for security reasons, we don't need another spyware and code stealing tool.

  • falcor84 3 days ago

    > No one wants this

    There are many many people who want better AI coding tools, myself included. It might or might not fail, but there is a clear and strong opportunity here, that it would be foolish of any large tech company to not pursue.

  • SilverElfin 3 days ago

    > Nadella rode the cloud wave in an easy upmarket

    I would say it's more the result of anti-competitive bundling of cloud things into existing enterprise contracts than of the wave itself. Microsoft is far worse than it ever was in the 90s, but there's no semblance of antitrust action in America.