I'd like others' input on this: increasingly, I see Cursor, Jetbrains, etc. moving towards a model of having you manage many agents working on different tasks simultaneously. But in real, production codebases, I've found that even a single agent is faster at generating code than I am at evaluating its fitness and providing design guidance. Adding more agents working on different things would not speed anything up. But perhaps I am just much slower or a poorer multi-tasker than most. Do others find these features more useful?
I really would like an answer to this.
My CTO is currently working on the ability to run several dockerised versions of the codebase in parallel for this kind of flow.
I’m here wondering how anyone could work on several tasks at once at a speed where they can read, review and iterate the output of one LLM in the time it takes for another LLM to spit an answer for a different task.
Like, are we just asking things as fast as possible and hoping for a good solution unchecked? Are others able to context switch on every prompt without a reduction in quality? Why are people tackling the problem of prompting at scale as if the bottleneck was token output rather than human reading and reasoning?
If this was a random vibecoding influencer I’d get it, but I see professionals trying this workflow and it makes me wonder what I’m missing.
I was going to say that this is how genetic algorithms work, but there is still too much human in the loop.
Maybe code husbandry?
Code Husbandry is a good term for what I've been thinking about how to implement. I hope you don't mind if I steal it. Think automated "mini agents", each with a defined set of tools and tasks, responding to specific triggers.
Imagine one agent just does docstrings - on commit, build an AST, branch, write/update comments accordingly, push and create a merge request with a standard report template.
Each of these mini-agents has a defined scope and operates in its own environment, and can be customized/trained as such. They just run continuously on the codebase based on their rules and triggers.
The idea is that all these changes bubble up to the developer for approval, just maybe after a few rounds of LLM iteration. The hope is that small models can be leveraged to a higher quality of output and operate in an asynchronous manner.
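To make that concrete, here's a minimal sketch of what such a docstring mini-agent could look like, triggered from a post-commit hook or CI job. The write_docstrings_with_llm and open_merge_request helpers are hypothetical stand-ins for whatever model call and code-host API you'd actually wire in:

    import ast
    import pathlib
    import subprocess


    def write_docstrings_with_llm(path: str, names: list[str]) -> None:
        """Hypothetical stand-in for the model call that edits the file in place."""
        raise NotImplementedError


    def open_merge_request(branch: str) -> None:
        """Hypothetical stand-in for the code-host API call that opens the MR."""
        raise NotImplementedError


    def functions_missing_docstrings(path: pathlib.Path) -> list[str]:
        """Parse a Python file and return the names of functions with no docstring."""
        tree = ast.parse(path.read_text())
        return [
            node.name
            for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
            and ast.get_docstring(node) is None
        ]


    def run_docstring_agent(repo_root: str) -> None:
        repo = pathlib.Path(repo_root)
        todo = {
            f: names
            for f in repo.rglob("*.py")
            if (names := functions_missing_docstrings(f))
        }
        if not todo:
            return
        # Work on a dedicated branch so the change arrives as a reviewable MR.
        subprocess.run(["git", "-C", repo_root, "switch", "-c", "agent/docstrings"], check=True)
        for path, names in todo.items():
            write_docstrings_with_llm(str(path), names)
        subprocess.run(
            ["git", "-C", repo_root, "commit", "-am", "docs: add missing docstrings"],
            check=True,
        )
        open_merge_request(branch="agent/docstrings")

The point is that the scope is narrow enough (one concern, one branch, one report template) that a small model plus a few iteration rounds can plausibly produce something worth a human's quick approval.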
Hmm, I haven’t managed to make it work yet, and I’ve tried. The best I can manage is three completely separate projects, and they all get only divided attention (which is often good enough these days).
Do you feel you get a faster/better end result than focusing on a single task at a time?
I can’t help but feel it’s like texting and driving, where people are overvaluing their ability to function with reduced focus. But obviously I have zero data to back that up.
My assumption lately is that this workflow is literally just “it works, so merge”. Running multiple in parallel does not allow for inspection of the code, just for testing functional requirements at the end.
I usually run one agent at a time in an interactive, pair-programming way. Occasionally (like once a week) I have some task where it makes sense to have one agent run for a long time. Then I'll create a separate jj workspace (equivalent of git worktree) and let it run.
I would probably never run a second agent unless I expected the task to take at least two hours; any more than that and the cost of multitasking for my brain is greater than any benefit, even when there are things that I could theoretically run in parallel, like several hypotheses for fixing a bug.
IIRC Thorsten Ball (Writing an Interpreter in Go, lead engineer on Amp) also said something similar in a podcast – he's a single-tasker, despite some of his coworkers preferring fleets of agents.
Same.
I've recently described how I vibe-coded a tool to run this single background agent in a docker container in a jj workspace[0] while I work with my foreground agent but... my reviewing throughput is usually saturated by a single agent already, and I barely ever run the second one.
New tools keep coming up for running fleets of agents, and I see no reason to switch from my single-threaded Claude Code.
What I would like to see instead are efforts to make the review step faster. The Amp folks had an interesting preview article on this recently[1]. This is the direction I want tools to be exploring if they want to win me over: help me solve the review bottleneck.
[0]: https://news.ycombinator.com/item?id=45970668
[1]: https://ampcode.com/news/review
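The basic recipe, if you want to try it, is little more than: create a jj workspace, then run the agent non-interactively in a container scoped to that workspace. A minimal Python sketch, where the image name and the claude -p invocation are placeholders for whatever agent CLI and environment you actually run:

    import pathlib
    import subprocess
    import sys


    def run_background_agent(repo: str, workspace: str, prompt: str) -> None:
        # `jj workspace add` creates an extra working copy of the same repo,
        # much like `git worktree add`, so the background agent never touches
        # the files the foreground agent is editing.
        subprocess.run(["jj", "workspace", "add", workspace], cwd=repo, check=True)
        ws = pathlib.Path(workspace).resolve()  # docker -v wants an absolute path
        subprocess.run(
            [
                "docker", "run", "--rm",
                "-v", f"{ws}:/workspace",
                "-w", "/workspace",
                "my-agent-image",        # assumption: an image with the agent CLI baked in
                "claude", "-p", prompt,  # assumption: Claude Code's non-interactive mode
            ],
            check=True,
        )


    if __name__ == "__main__":
        # e.g. python bg_agent.py ../repo-bg "fix the flaky test in tests/test_sync.py"
        run_background_agent(repo=".", workspace=sys.argv[1], prompt=sys.argv[2])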
Rather than having multiple agents running inside of one IDE window, I structure my codebase in a way that is somewhat siloed to facilitate development by multiple agents. This is an obvious and common pattern when you have a front-end and a back-end. Super easy to just open up those directories of the repository in separate environments and have them work in their own siloed space.
Then I take it a step further and create core libraries that are structured like standalone packages and are architected like third-party libraries with their own documentation and public API, which gives clear boundaries of responsibility.
Then the only somewhat manual step is to copy/paste the agent's notes on the changes it made so that dependent systems can integrate them.
I find this to be way more sustainable than spawning multiple agents on a single codebase and then having to rectify merge conflicts between them as each task is completed; it's not unlike traditional software development where a branch that needs review contains some general functionality that would be beneficial to another branch and then you're left either cherry-picking a commit, sharing it between PRs, or lumping your PRs together.
Depending on the project I might have 6-10 IDE sessions. Each agent has its own history then and anything to do with running test harnesses or CLI interactions gets managed on that instance as well.
I think this is the UX challenge of this era: how to design software that helps a human sustain attention across a distributed set of tasks without information loss or cognitive decline. I agree that for any larger piece of work with significant scope, the overhead of ingesting the context into your brain offsets the time savings that multitasking promises.
My take is that as these things get better, we will eventually be able to infer and quantify signals that provide high-confidence scores, letting us conduct reviews with a shorter decision path. This is akin to how compilers, parsers, and linters can give you some level of safety without strong guarantees but are often "good enough" to pass a smell test.
No... I've found the opposite: using the fastest model for the smallest pieces is useful, and anything where I have to wait two minutes for a wrong answer is just in the way.
There's pretty much no way anyone context switching that fast is paying a lick of attention. They may be having fun, like scrolling TikTok or playing a video game, just piling on stimuli, but I don't believe they're getting anything done. It's plausible they're smarter than me; it is not plausible they have a totally different kind of brain chemistry.
The parallel agent model is better for when you know the high level task you want to accomplish but the coding might take a long time. You can split it up in your head “we need to add this api to the api spec” “we need to add this thing to the controller layer” etc. and then you use parallel agents to edit just the specific files you’re working on.
So instead of interactively making one agent do a large task you make small agents do the coding while you focus on the design.
My context window is small. It's hard enough keeping track of one timeline, I just don't see the appeal in running multiple agents. I can't really keep up.
For some things it's helpful, like having one agent plan changes / get exact file paths, another agent implement the changes, another agent review the PR, etc. The context window being small is the point, I think. Chaining agents lets you break up the work, and also give different agents different toolsets so they aren't all pulling a ton of MCPs / Claude Skills into context at once.
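Illustratively, the chain can be as simple as a list of stages, each with its own small toolset and a prompt templated on earlier outputs. run_agent and the tool names below are hypothetical placeholders for whatever agent runtime and tools you actually use:

    from dataclasses import dataclass


    def run_agent(tools: list[str], prompt: str) -> str:
        """Hypothetical stand-in for invoking an agent runtime with a restricted toolset."""
        raise NotImplementedError


    @dataclass
    class Stage:
        name: str
        tools: list[str]  # keep each stage's toolset small so its context stays small
        prompt: str


    PIPELINE = [
        Stage("plan", ["read_file", "grep"],
              "List the exact files and edits needed for: {task}"),
        Stage("implement", ["read_file", "edit_file", "run_tests"],
              "Apply this plan and report the resulting diff:\n{plan}"),
        Stage("review", ["read_file", "git_diff"],
              "Review this change for bugs and style issues:\n{implement}"),
    ]


    def run_pipeline(task: str) -> str:
        context = {"task": task}
        for stage in PIPELINE:
            context[stage.name] = run_agent(stage.tools, stage.prompt.format(**context))
        return context["review"]

Each stage only ever sees its own prompt plus the previous stage's output, which is the whole appeal: no single context has to hold the plan, the diff, and every tool definition at once.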
I'm with you. The industry has pivoted from building tools that help you code to selling the fantasy that you won't have to. They don't care about the reality of the review bottleneck; they care about shipping features that look like 'the future' to sell more seats.
I have to agree; currently it doesn't look that innovative. I would rather have parallel agents working on the same task, orchestrated in some way to get the best result possible, perhaps using IntelliJ for code insights, validation, refactoring, debugging, etc.
Completely agree. The review burden and context switching I need to do from even having two running at once is too much, and using one is already pretty good (except when it’s not).
Even with the best agent in plan mode, there can be communication problems, style mismatches, untested code, incorrect assumptions and code that is not DRY.
I prefer to use a single agent without pauses and catch errors in real time.
Multiple agent people must be using pauses, switching between agents and checking every result.
I think the problem is that current AI models are slow to generate tokens, so the obvious solution is 'parallelism'. If they could poop out pages of code instantly, nobody would think about parallel agents.
I wish we'd get a model that's not necessarily intelligent but at least competent at following instructions and very fast.
I overwhelmingly prefer the workflow where I have an idea for a change and the AI implements it (or pushes back, or does it in an unexpected way) - that way I still have a general idea of what's going on with the code.
Right. A computer can make more code than a human can review. So, forget about the universe where you ever review code. You have to shift to almost a QA person and ignore all code and just validate the output. When it is suggested that you as a programmer will disappear, this is what they mean.
>You have to shift to almost a QA person and ignore all code and just validate the output.
The obvious answer to this is that it is not feasible to retry each past validation for each new change, which is why we have testing in the first place. Then you’re back at square one because your test writing ability limits your output.
Unless you plan on also vibe-coding the tests and treating the whole job as a black box, in which case we might as well just head for the bunkers.
"... treating the whole job as a black box"
Yes, that is exactly what I mean. You ask the Wizard of Oz for something, and you hear some sounds behind the curtain, and you get something back. Validate that, and if necessary, ask Oz to try again.
"The obvious answer to this is that it is not feasible to retry each past validation for each new change"
It is reasonably feasible because the jobs of Product Development and QA have always existed; developers just sat in the middle. Now we remove the developer and move them over to the role of combined Product + QA, and all Product + QA was ever able to validate was developer output (which, as far as they were ever concerned, was an actual black box, since they don't know how to program).
The developer disappears when they are made to disappear or decide to disappear. If the developer begins articulating ideas in language like a product developer, and then validates like a QA engineer, then the developer has "decided" to disappear. Other developers will be told to disappear.
The existential threat to the developer is not when the company mandate comes down that you are to be a "Prompt Engineer" now; it is when the mandate comes down that you need to be a Product Designer now (as in, you are mandated not to write a single. line. of. code.). In which case vast swaths of developers will not cut it on a pure talent level.
You haven’t addressed the original question. The point is not whether the QA understands the codebase, but whether the QA understands its own test system.
If yes, the QA is manual-ish (taking manual to mean not automated by AI) and we're still bottlenecked, so speeding up the engineer bought us nothing.
If no, because QA is also AI, then you have a product with no human eyes on it being tested by another system with no human eyes on it. So effectively nobody knows what it does.
If you think LLMs are anywhere near that level of trust I don’t know what you’re smoking. They’re still doing things like “fixing” tests by removing relevant non passing cases every day.
JetBrains should stop building stupid AI shit and fix their IDEs. 2025 versions are bordering on unusable.
Issues I observed, mostly using GoLand:
- syntax errors displaying persistently even after being fixed (frequently; until restarted; not seen very recently)
- files/file tree not detecting changes to files on disk (frequent; until restarted; not seen very recently)
- cursor teleporting to specific place on the screen when ctrl is pressed (occasionally; until restarted)
- and most recently: it not accepting any mouse/keyboard input (occasionally; until killed)
Have you made a bug report?
Not iterating on AI is almost certainly suicidal.
Looks like we know where they got the domain from.
https://news.ycombinator.com/item?id=44043231
Not to rain on their parade, but I do find it at least a little bit funny that Kotlin Multiplatform is JetBrains's own project and the app is Mac only, lol...
It's a preview, isn't it? The page says Windows, Mac, Linux.
Yeah. Windows, Linux, and Web are listed under "What's Coming"
market research shows that 100% of the people interested in this style of development are mac users
The litmus test for the utility of this kind of thing is whether JetBrains prefers to use Air to develop Air, i.e., whether it's self-hosting.
I really like this initiative. I think the biggest value here isn't the multiple sessions or worktrees but an interoperable protocol between these coding agents through a new UX. A sort of parent-process orchestrator of the many agents is something I want. Are there other tools that do that today, e.g., running Claude, Codex, and Gemini all together and sharing data with one another?
Something like Shrimp is useful for at least coordinating different Claude subagents.
Seems this is the product JetBrains mentioned in their sunsetting announcement for the short-lived CodeCanvas product.
https://blog.jetbrains.com/codecanvas/2025/10/jetbrains-is-s...
Looks similar to Omnispect
https://omnispect.dev
So cool to see IMGUI in use!
Finally a step in the right direction. This brings the best of both worlds: the lightweight feel of Fleet and agents battle-tested with Junie/IntelliJ.
Congrats to the team. Can’t wait to try it.
I've been drifting from jetbrains to zed lately, but this is making it difficult. I can probably do something similar in zed, but I don't know how.
I think it’s just multiple git worktrees and multiple Zed windows.
Seems like the best competitor to Conductor at the moment. They did a great job.
So do we actually get to edit any of the AI code additions or changes or is this just "PR merge hell mode" in Project Manager Simulator? Yes, I could flip over to my editor, but that kind of misses the whole point of the 'I' in "IDE".
I'm team JetBrains4Life when it comes to IDEs, but their AI offerings have been a pretty mixed bag of mixed messages. And this one requires a separate subscription at that when I'm already paying for their own AI product.
Not to be overly negative but I’m kinda disappointed with this and I have been a JetBrains shill for many years.
I already use this workflow myself, just multiple terminals with Claude in different directories. There are like 100 of these “Claude with worktrees in parallel” UIs now; I would have expected some of the usual JetBrains value-adds, like deep debugger integration or a fancy test-runner view. The only one I see called out is Local History. I don't see any deep diff or find-in-files integration to compare or search across the agent worktrees, and I don't see the JetBrains commit, shelf, etc. Git integration that we like.
I do like the Cursor-like highlight-and-add-to-context feature and the kanban-board view of agent statuses, but this is nothing new. I would have expected at least that JetBrains would provide some fancier UI for choosing which directories or scopes are auto-approved for edits, or other fine-grained auto-approve permissions for the agent.
In summary, it looks like just another parallel Claude UI rather than a JetBrains take on it. It also seems to be a separate IDE rather than built on the IntelliJ platform, so they probably won't turn it into a plugin in the future either.
I've just spent the day reading and reviewing the absolute slop that comes out of these things :'(
Can't wait for this AI shit to be over so they can get back to their bread & butter...great dev tools.
> their bread & butter...great dev tools.
A cursor style "tab" model, but trained on jetbrains IDEs with full access to their internals, refactoring tools and so on would be interesting to see.
They have that now. Not as great as cursor tab, but nothing is.
Umm, it ain't ever gonna be over, it is a new era.
We need to adapt to new ways of thinking and working with new tooling. It is a learning curve of sorts. What we want is to solve problems; the new tooling enables us to solve problems better by freeing up our thinking, reducing blockers and toil, and giving us more time to think about higher-level problems.
I remember this same sentiment towards AI when I was growing up, but towards cell phones...
> I remember this same sentiment towards AI when I was growing up, but towards cell phones...
Sure. But the same for NFTs.
We'll see which one this winds up being.
The value of an NFT is the speculation that a bigger fool than you is in the market (and if you’re average, there is).
The value of AI coding is that it can eliminate some of the labor of programming, which is the overwhelming majority of cost.
These value propositions are nothing alike.
> The value of an NFT is the speculation that a bigger fool than you is in the market (and if you’re average, there is).
This describes OpenAI’s valuation pretty well.
> What we want is to solve problems
speak for yourself, i want to understand everything and be elbow deep in the code
I will empathize with you there. I totally want to understand everything too. I LOVE being elbow deep in code for hours on end, especially late nights, so, much, FUN!!!
It's just that now I don't have to do that to actually build something meaningful; my ability to build is increased by some factor, and it is only increasing.
And coding LLMs have become a great teacher for me; I learn much faster. When I do want to dig deeper into the code, I can ask very nuanced questions about what certain code is doing or how it works, and it does a fairly good job of explaining it, similar to how a real person would if I were in meatspace at an office, an opportunity I don't get anymore in this remote life.
If you were sincere in your attempt to "empathize with [them] there", your prose screams the opposite. I point this out, as anecdotally, it was quite distracting from the rest of your point and makes me think you are not doing much to meet the other perspective.
Now, to directly push on your perspective: I'm not sure why you conclude that you don't have an opportunity for feedback given you've moved to a remote office culture. I am giving you a form of feedback in this instance. Yes, it is at my whim and not guaranteed if our interests don't align, but that is a cost of collaboration. It is a bit grim to see a "coding LLM" ushered in as a proper replacement here, when you are doing no more than bootstrapping introspection. This isn't to detract from the value you've found in the tool; I only question why you've written off the collaboration element of unique human experiences interlocking on common ground.
Capital has other ideas, it wants “problems” “solved” faster and faster.
bookmarking this to laugh at it in 2030
> Can't wait for this <new technology> shit to be over
Said the assembly senior specialist when first confronted with this newfangled fortran compiler shit.
LLMs are nothing more than fancy weather forecasting models…they still get things wrong a lot.
The Fortran compiler worked though.
Ooof, I forgot to cancel my JetBrains All Products license when I switched to VS Code. I better go do that now before it renews. Not because of AI, but it also doesn't help.