embedding-shape

It's an interesting idea, but I feel like it's missing arguably the most important thing: the context of the change itself. When I review a change, it's almost never just about the code changes themselves, but about reviewing them in the context of what was initially asked and how they relate to that.

Your solution here seems to exclusively surface the "what" of changes, but it's impossible for me to know if a change is right unless I also see the "how", first and/or together with the change itself. So the same problem remains: instead of reviewing in git/GitHub/Gerrit and separately figuring out the documents/resources that lay out the task itself, I still have to switch and confirm things between the two.

Peritract

> more and more engineers are merging changes that they don't really understand

You cannot solve this problem by adding more AI on top. If lack of understanding is the problem, moving people even further away will only worsen the situation.

dawnerd

If I'm reviewing AI code, I don't want AI summaries. I want to be able to read the code and understand what it does. If I can't do that, the code the AI produced isn't very good. In theory, your AI changes should come in smaller chunks, just like a real developer's would.

superfrank

Maybe I'm missing something obvious, but if I were going to have my team use this, I'd want someone to answer the following question:

If AI is good enough to explain what the change is and call out what to focus on in the review, then why isn't AI good enough to just do the review itself?

I understand that the goal of this is to ensure there's still a human in the review cycle, but the problem I see is that suggestions will quickly turn into todo lists. Devs will read the summary, look at the "what to focus on" section, and stop reviewing code outside of the things called out there. If that's true, it means customers need to be able to trust that the AI has enough context to generate accurate summaries and suggestions. And if the AI can generate accurate summaries and suggestions, then why can't we trust it to just do the review itself?

I'm not saying that to shit on the product, because I do get the logic behind it, but I think that's a question you should have a prepared answer for since I feel like I can't be the only one thinking that.

jFriedensreich

Looks kind of neat, like a devon.ai review / ReviewStack crossover. But as I tell every one of the dozens of projects trying to make a commercial review tool: I would rather spend a week vibe-copying this than onboard a tool I have to pay for and be at the mercy of whoever made it. It's just over for selling SaaS tools like this. For agents I also need this to run locally, not on someone's cloud. It's just a matter of time until someone does it.

gracealwan

Totally different part of the reviewing experience, but I would love to see PR comments (or any revisions, really) be automatically synced back into the context coding agents have about a codebase or engineer. There's no reason nowadays for an engineer or a team of engineers to make the same code quality mistake twice. We manually maintain our agents.md with codebase conventions, etc., but it'd be great not to have to do that.

tasuki

> Stage automatically analyzes the diff, clusters related changes, and generates chapters.

Isn't that what commits are for? I see no reason for adding this as an afterthought. If the committers (whether human or LLM) are well-behaved, this info is already available in the PR.

christiastoria

Pretty neat, sick of trying to digest 100 Devin comments at once!

high_priest

No pricing page; you've lost my interest. It doesn't matter that there is an obscured quote on the front page. Be up front about the costs.

baldai

I was actually thinking about a similar idea recently. I'm someone who started coding post-LLMs and has a basic technical understanding. I know what loops, variables, APIs, backends, bla bla are. I've learned a bunch more since then, but I'm not capable of making decisions based on a git diff alone. And I want to be. I want to because I think increasing my skills is still super important, even in the AI era. The models are getting better, but are still limited by their core design; for now it does not seem like they will replace humans.

So getting assistance in the review, in making the decisions and giving me more clarity feels interesting.

Maybe it's people like me, who became involved in coding after LLMs, who might be your niche.

One thing I don't understand is the UI/UX. Is this visible only on Git itself, or can I get it working in Codex?

forthwall

Interesting app. I'm seeing a weird bug on the homepage: when I tab between the chapters, it lags a bit and then doesn't actually proceed to the next chapter until I press again.

tfrancisl

Why is this a service and not an open source project? It doesn't seem to do much other than organize your commits within a PR (which could be run once on a dev machine and shipped with the code, then displayed separately) and build a dashboard for PRs that's not too far off from what GitHub already offers, but could also be represented as fairly small structured data and displayed separately.

konovalov-nk

This is mostly solved just by writing proper commit messages: https://blog.br11k.dev/2026-03-23-code-review-bottleneck-par...

The much more interesting part is how exactly you map Context/Why/Verify to a product spec / acceptance criteria.

And I already posted how to do this. SCIP indexes go from product spec -> ACs -> E2E tests -> Evidence Artifacts -> Review (approve/reject, reason) -> if all green, then we make a commit that has #context + #why + #verify (I believe this just points to the E2E specs that belong to this AC).

Here's full schema: https://tinyurl.com/4p43v2t2 (-> https://mermaid.ai/live/edit)

What I'm trying to visualize is exactly where the cognitive bottleneck happens. So far I've identified three edges:

1. Spec <-> AC (User can shorten URL -> which ACs make this happen?)

2. AC <-> Plan (POST /urls/new must create a new DB record and respond with 200) -> how exactly must this code look?

3. Plan/Execute/Verify -> given this E2E test, how can I verify that the test is doing what the AC assumes?

The cognitive bottleneck is when we transform artifacts:

- Real world requirements (users want to use a browser) -> Spec (what exactly matters?)

- Spec -> AC (which exact scenarios are we supporting?)

And you can see that at every step we are "compressing" something ambiguous into something deterministic. That's exactly what is going on in an engineer's head. And so the tooling I'm going to release soon is targeted exactly at eliminating the parts we spend the most time on: "figuring out how this file connects to the spec I have in my head, the one I built from poorly described commit messages, outdated documents, Slack threads from 2016, and that guy who seemingly knew everything before he left the company".
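For illustration, here is a minimal sketch of that spec -> AC -> E2E -> commit-trailer chain. All names (Spec, AcceptanceCriterion, the trailer keys) are assumptions for the sketch, not from any released tool:

```python
# Hypothetical sketch of the traceability chain described above.
# A commit tied to one AC carries #context / #why / #verify trailers,
# where #verify points at the E2E specs that belong to that AC.
from dataclasses import dataclass, field


@dataclass
class AcceptanceCriterion:
    ac_id: str
    description: str
    e2e_tests: list = field(default_factory=list)  # E2E spec file paths


@dataclass
class Spec:
    title: str
    acs: list = field(default_factory=list)


def commit_trailers(spec: Spec, ac: AcceptanceCriterion) -> dict:
    """Build the #context/#why/#verify trailers for a commit tied to one AC."""
    return {
        "context": spec.title,
        "why": f"{ac.ac_id}: {ac.description}",
        "verify": ", ".join(ac.e2e_tests),
    }


spec = Spec("User can shorten URL", acs=[
    AcceptanceCriterion(
        "AC-1",
        "POST /urls/new creates a DB record and responds with 200",
        e2e_tests=["e2e/shorten_url.spec.ts"],
    ),
])
print(commit_trailers(spec, spec.acs[0]))
```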

phyzix5761

This is a really cool idea but where's the moat? What's stopping someone from replicating the functionality?

namanyayg

Looks amazing. I've been trying different stacking PR tools and Graphite and this looks to be the most human-centric so far. I'll have a shot at using this within our team soon. Congrats on the launch!

whywhywhywhy

The idea of a workplace where people can't be bothered to read what the AI is coding, but someone else is expected to read it and understand whether it's good or slop, just doesn't really add up.

I personally see the value of code review, but I promise you the most vocal vibe coders I work with don't at all, and honestly, even to me it feels like something that could just be automated.

The age of someone gatekeeping the codebase and pushing their personal coding-style foibles on the rest of the team via reviews doesn't feel like something that will exist anymore if your CEO is big on vibe coding.

ryanjso

I like the chapters thing; a lot of the PRs I review should really be like 5 PRs, so it's nice to have them auto-split like that.

Do you see a world where it splits them up on the git level?

sscarduzio

We have the same problem, and I came up with this:

https://sscarduzio.github.io/pr-war-stories/

Basically it's distilling knowledge from PR reviews back into Bugbot fine-tuning and CLAUDE.md.

So the automatic review catches more, and the code assistant produces more aligned code.

SkyPuncher

Hmm. All of the examples simply describe what the code is doing. I need a tool that explains the intent and context behind a change.

electrum

Does Stage work for PRs that have multiple commits? These could be considered "stacked diffs", but in the same PR.

kylestlb

I assume GitLab/GitHub will add this sort of feature to their products within the next few months.

lisayang888

Really like this idea. But at what point do you think it's valuable to have this chapters breakdown versus splitting things up into multiple PRs?

malcolmgreaves

Y’all are a bit nuts if you want 50% more per month than Claude Pro for this.

syngrog66

easier: don't do vibe coding or allow AI bots

better: break up the codebase into areas over which certain engs "own" code reviews; divvy up the burden

best: hire the best folks, mentor them

keybored

“Putting the cuisine back in food”

Looks inside.

Now that we are all eating Soylent, it can get a little bland sometimes. That's why we are releasing our international, curated spice package for your Soylent...

sebakubisz

Can reviewers adjust the chapter splits manually if they disagree with how it grouped the PR, or are the chapters fixed once generated?

te_chris

I've built this into a CLI TUI. It passes the whole diff to Claude Code with a schema and gets a structured narrative back out. Works really well for understanding.

Reconstituting messy things is exactly where LLMs can help.
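For illustration, a minimal sketch of such a diff -> structured-narrative flow. The schema fields and the stubbed model reply are assumptions, not the actual tool; a real version would shell out to the model CLI with the schema attached:

```python
# Hypothetical sketch: ask a model for a structured narrative of a diff,
# then parse and validate the JSON reply against the fields we rely on.
import json

NARRATIVE_SCHEMA = {
    "type": "object",
    "required": ["summary", "chapters"],
    "properties": {
        "summary": {"type": "string"},
        "chapters": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["title", "files", "narrative"],
            },
        },
    },
}


def parse_narrative(raw: str) -> dict:
    """Parse the model's JSON reply and check the top-level required fields."""
    data = json.loads(raw)
    for key in NARRATIVE_SCHEMA["required"]:
        if key not in data:
            raise ValueError(f"missing field: {key}")
    return data


# Stand-in for a real model reply to `git diff` + NARRATIVE_SCHEMA:
raw_reply = json.dumps({
    "summary": "Adds retry logic to the HTTP client.",
    "chapters": [
        {"title": "Retry policy", "files": ["http/retry.py"],
         "narrative": "Introduces exponential backoff with a cap."},
    ],
})
print(parse_narrative(raw_reply)["summary"])
```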
