Blade Components
A third axis, like the dispatcher: not a per-model trait but the Blade render path. The traits make models faster; this makes the views every page renders faster — byte-for-byte the same HTML.
The benchmark
The stress test is rendering 1,000 anonymous components — the standard way to isolate Blade's per-component overhead, which runs about 14 ms for that many on a typical laptop. The question Grease asks of it: can that number come down reliably?
"Reliably" is the whole game: it's easy to shave a render by changing what it emits; the bar here is the same as everywhere else in Grease — byte-identical output, asserted before any number is timed. And the honest framing up front: this isn't a Blade overhaul. It's a pile of marginal, byte-identical wins — the same shape as the rest of the package — each one a thing core would wave off in isolation, that compounds once they're all on the page.
Three singletons, three surfaces
The provider swaps three bound singletons for greased, behaviour-identical drop-ins (each built via a fromBase() reflection-clone that carries the existing instance's full state, so the swap is transparent):
blade.compiler→Grease\View\Compiler— the compile-time emit and a couple of per-render lookups the compiler owns.view→Grease\View\Factory— the runtime render bookkeeping the view factory drives (@foreach's$loop,@yield's content stitching).view.finder→Grease\View\FileViewFinder— view resolution: when a freshgrease:view-cacheindex exists, a view name resolves to its file via an array hit instead of a filesystem stat-walk (the provider also registers that command).
What it does
The wins ride on those three singletons. Each replaces work every page repeats — work that doesn't depend on your data — with a tighter path that emits the identical bytes.
On the compiler (blade.compiler):
@propsresolution. Vanilla compiles@props([...])to a block that, on each render, rebuilds a flat name list, scans it within_array(a linear walk per attribute), evaluates the declaration array twice, and snapshots the whole scope withget_defined_vars()to clean up. Grease compiles it to one memoizedProps::mergeAttributes()call (the name set is built once per@propssite, not per render) followed by a tight$$key = $valuebind loop.$attributes->merge([...]). Nearly every component template calls it. Vanilla runs it through the Collection pipeline —new Collection,partition(),mapWithKeys(),->merge(),->all(), roughly five allocations a render. Grease does the identical partition + append in two plainforeachloops, no Collections. It's a subclass of the framework's ownComponentAttributeBag(so everyinstanceofcheck still holds), andmerge()returnsnew static, so the fast path stays live down any chain.- Greased bag for class components, too. The greased bag reaches
@propscomponents for free (viaProps::mergeAttributes), but plain class and no-@propscomponents build a vanilla bag throughComponent::newAttributeBag()— so their template merge took the slow Collection path. The catch: the opening boilerplate hands the template its bag (viadata()insidestartComponent) beforewithAttributespopulates it, so you can't reclass it after. The fix is one emitted line — Grease overrides thecompileClassComponentOpeningemit to seed$component->attributes ??= new \Grease\View\ComponentAttributeBag([])beforestartComponent, whichdata()'s?:then adopts. Byte-identical (an empty seed equals vanilla's lazynewAttributeBag(); the??=preserves a constructor-set bag). getCompiledPath()memoization. The compiled-file path is a purehash('xxh128', …)of the view path — and it's recomputed on every view render (CompilerEngine::get), which on a real page is a tree of renders (every component, slot,@include,@eachis one). The framework already memoizes this lookup's siblings (normalizeName,getEngineFromPath) but missed this one. Grease memoizes it keyed by path. It scales with render count, so it helps realistic pages most — the opposite of the@props/mergedilution below.
On the view factory (view):
@foreach's$loopbookkeeping. Blade emits full$loopstate for every@foreach, even when the body never touches$loop. The machinery (ManagesLoops—addLoop/incrementLoopIndices/getLastLoop) is ~35% of a loop-heavy render:incrementLoopIndicesarray_merges the 10-key state array every iteration, and the others reach the stack top throughArr::last(closure-default overhead). Grease updates the state in place by reference (no merge) and uses a direct stack index. Byte-identical — same loop-state shape, andgetLastLoopkeeps its fresh(object)cast every call (a micro proved reusing one object is both unsafe and ~10% slower, so there's no tension).@yield/yieldContent.@yield('content')hands the whole page body toyieldContent, which vanilla scans three times withstr_replace(@@parent→ marker, strip the placeholder, marker →@parent) — 29% of a layout render's self-time. The net is one substitution over three non-overlapping markers, and neither pass re-scans its own output, so Grease collapses all three into onepreg_replace_callbackover the alternation. Byte-identical, proven against a verbatim three-pass oracle.@push/@prependstack assembly. A component (or a loop row) that pushes its assets to a@stackrunsstopPush()/stopPrepend()once per@endpush/@endprepend, and vanilla wraps each pop intap(array_pop(…), fn ($last) => …)— allocating a fresh bound closure every call (the profile puts the two stacktapclosures at ~13% of a push-heavy render's self-time) for a return value the compiled directive discards. Grease inlines it: the same pop, the sameextendPush($last, ob_get_clean()), the same returned section name — no closure.
Byte-identical, by test
Same bar as the model tiers: the rendered HTML is identical to the byte. The macro (benchmarks/blade.php) now has nine parity-gated variants — page-simple, page-foreach, page-rich, page-rich-foreach, page-app, page-table, page-layout, page-stacks, page-full — each rendered through the stock compiler and the greased one in two booted apps with separate compiled-view caches, and each asserts the HTML matches before it times anything. The variety is the point: the right fixture surfaces the lever (page-table is what surfaced the @foreach cost; page-layout surfaced @yield). Execution-level tests pin every piece down — the @props emit resolves the same prop variables and leaves the same attributes (down to the inaccessible kebab-alias local vanilla also creates); the greased merge() matches across class/style appends, escaping, AppendableAttributeValue, and ordering; the loop tier matches across countable / single / non-countable-generator / nested-with-parent; and yieldContent is checked against its three-pass oracle across plain, marker, adjacency, and pathological inputs.
What it's worth
On the parity-gated macro variants (Linux, benchmarks/docker, output asserted identical):
| Variant | What it stresses | Δ |
|---|---|---|
| simple | initials + one attribute merge | −30.0% |
| rich | 5 props, a @php block, conditionals, slots | −22.7% |
| app page | class components, slots, @include/@each, a view composer | −16.2% |
| data table | nested @foreach, heavy $loop use | −27.8% |
| layout | @extends / @section / @yield / @push | −15.5% |
| asset stacks | @push/@prepend per row into a @stack | −11.1% |
| full page | extends a layout, 5 sections, a 100-row @foreach table, components | −11.5% |
The @foreach variants (page-foreach, page-rich-foreach) render the avatar benchmark in the realistic loop form, and land on the same ~−32% / ~−24% as their @for counterparts — confirming the tiers compose with zero regression.
The full page is the most honest single number: a standard page that @extends a primary layout, fills head/styles/scripts/footer/content (with a @parent override), renders a 100-row @foreach table, and drops in a few components — every tier firing at once. It lands at −11.5%, lower than any single-axis variant, and that's the truth of the regime: on a realistic page the greasable framework slice is small because genuine work dominates. A profile (Excimer) puts ~53% of the render in the compiled template bodies (your markup) and ~24% in e()/htmlspecialchars (escaping ~500 table cells) — both genuine, both off-limits. The remaining ~23% is spread thin across the engine and Factory, and every greasable piece of it is already greased. So the single-axis numbers above are what each tier is worth where it dominates; −11.5% is what they compound to where they don't.
The regime insight — which fixture wins on which tier
The wins split by what the loop body costs, and that's worth knowing when you reason about your own pages. Loop / $loop greasing pays when the body is cheap — data-table cells, lists, menus — where the per-iteration machinery (~0.1 µs) is a real fraction of each pass; on a component loop the body (~15 µs an iteration) dwarfs it, so component greasing (@props + merge) is what moves the needle there instead. Neither tier regresses the other — they just light up on different shapes. So a component-dense page leans on the compiler tier and a table-dense page leans on the factory tier, and a page with both gets both.
Not a halving, and worth saying plainly. The rest of a render is the compiled template executing (real work — your markup, the Str calls) plus the per-component resolution machinery. The one big remaining slice — the $__componentOriginal save/restore boilerplate the ComponentTagCompiler emits around every <x-…> — we did take a hard look at, and measured it a dead end: its expensive statements (resolve/data/renderComponent) are user-class or genuine assembly, the save/restore guards are load-bearing for nesting and loop correctness, and the one removable redundancy (a doubled isset … instanceof predicate) regressed ~1% when hoisted. We've measured several other levers and rejected them too — replacing extract() with a bind loop (a regression: extract is a C builtin, ~2× faster than userland), caching is_file() (a macOS profiling artifact that's ~3% on Linux, and the wrong thing on principle), a gatherData Renderable scan (~0.1% safe, with order and side-effect parity walls). Those dead ends are recorded openly in the repo's NOTES, so we don't recircle them.
So: two years on, what we'll stand behind is reliably a tenth to two-fifths off, byte-for-byte, depending on the page's shape — and the harness is right there if you want to chase the rest.
Opt in
It is not auto-discovered — register the provider explicitly:
// bootstrap/providers.php, or the providers array in config/app.php
Grease\View\GreaseViewServiceProvider::class,It extends three bound singletons — blade.compiler, view, and (when a fresh grease:view-cache index exists) view.finder — swapping each for its greased counterpart via fromBase(), so every directive, component, condition, and shared value already registered (or registered afterwards) is carried over. Register it early so the compiler is greased before any view compiles; existing views recompile on their next change, and a php artisan view:clear forces it immediately. Output stays byte-identical.
Profiling Blade honestly
If you go chasing the rest, profile with a sampling profiler (the repo ships an Excimer harness, benchmarks/blade_excimer.php), not Xdebug — Xdebug disables JIT and mis-attributes internal-op cost, which sent us down a dead end (it ranked extract at ~14% of a render when it's ~0.6%). See Benchmarks for the method.