HBM's headline bandwidth is meaningless without somewhere to route it. A stack of DRAM can expose a very wide interface, but the GPU it feeds is a separate die, and ordinary package substrates cannot carry that many connections at the required pitch. The answer is a silicon interposer.
Micron's grant US10840229B2, "Graphics processing unit and high bandwidth memory integration using integrated interface and silicon interposer", claims that integration directly. It issued on November 17, 2020, to Micron Technology, Inc., under CPC classes H01L 25/18 (multi-chip assemblies), H01L 24/16 (flip-chip bumps), and H01L 23/481 (through-substrate vias).
Why silicon and not an organic substrate? Silicon can be patterned with the same lithography that builds chips, so an interposer can carry wires at micron-scale pitch, dense enough to fan out an HBM stack's full interface to the GPU. That density is the whole point: it is what converts a wide stack into real GB/s at the processor.
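To make the density argument concrete, here is a rough sketch in Python. Every number in it is an illustrative assumption, not a figure from the patent: a 5 mm routing edge, a 2 um trace pitch standing in for silicon, a 40 um pitch standing in for a coarser substrate, and 1024 data signals (the data width of an HBM2-class stack). The function names are hypothetical. A real interface also needs command, address, clock, and power connections, so treat the counts as a lower bound on what must be routed.

```python
import math


def traces_per_layer(edge_mm: float, pitch_um: float) -> int:
    """Parallel traces that fit across one routing layer along an edge.

    edge_mm  -- usable edge length in millimetres (assumed value)
    pitch_um -- centre-to-centre trace pitch in micrometres (assumed value)
    """
    if edge_mm <= 0 or pitch_um <= 0:
        raise ValueError("edge_mm and pitch_um must be positive")
    return math.floor(edge_mm * 1000.0 / pitch_um)


def layers_needed(signals: int, edge_mm: float, pitch_um: float) -> int:
    """Routing layers required to escape `signals` wires across the edge."""
    per_layer = traces_per_layer(edge_mm, pitch_um)
    if per_layer == 0:
        raise ValueError("pitch is too coarse to fit a single trace")
    return math.ceil(signals / per_layer)


# Illustrative assumptions only.
EDGE_MM = 5.0
DATA_SIGNALS = 1024      # data width of an HBM2-class stack
FINE_PITCH_UM = 2.0      # stand-in for lithographically patterned silicon
COARSE_PITCH_UM = 40.0   # stand-in for a coarser package substrate

fine_layers = layers_needed(DATA_SIGNALS, EDGE_MM, FINE_PITCH_UM)
coarse_layers = layers_needed(DATA_SIGNALS, EDGE_MM, COARSE_PITCH_UM)
print(f"fine pitch:   {traces_per_layer(EDGE_MM, FINE_PITCH_UM)} traces/layer, {fine_layers} layer(s)")
print(f"coarse pitch: {traces_per_layer(EDGE_MM, COARSE_PITCH_UM)} traces/layer, {coarse_layers} layer(s)")
```

Under these assumed numbers the fine pitch fits the whole data bus in a single layer, while the coarse pitch needs nine. The exact figures matter less than the scaling: trace count per layer is inversely proportional to pitch.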
It is telling that Micron, a memory maker, filed on the GPU-plus-interposer system and not just the DRAM die. The 2020 record shows memory vendors reaching past the chip into the package, claiming the integration layer because that is where their product's value is actually delivered.
This also previews the bottleneck everyone now talks about. The reason advanced-packaging capacity (interposers and the assembly of GPU-plus-HBM modules) gates the AI buildout is right here: the interposer is a manufactured silicon part with its own yield and capacity limits, distinct from the logic and memory it joins.
For readers doing the bandwidth math, the rule is simple: count the data wires the interposer can carry and multiply by the per-wire signaling rate. That product caps how much of the stacked memory's bandwidth the GPU can ever see. Micron's 2020 grant is a claim on the part that sets that cap.
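The cap can be written as a short calculation. The sketch below is a minimal illustration, not anything taken from the patent: the function names are hypothetical, and the example inputs (a 1024-bit stack interface at 2.0 Gb/s per wire, typical of HBM2-class parts) are assumptions chosen for round numbers.

```python
def peak_bandwidth_gbs(data_wires: int, gbps_per_wire: float) -> float:
    """Peak bandwidth in GB/s for a parallel bus.

    data_wires    -- number of data wires actually routed
    gbps_per_wire -- signaling rate per wire in Gb/s
    """
    if data_wires < 0 or gbps_per_wire < 0:
        raise ValueError("inputs must be non-negative")
    return data_wires * gbps_per_wire / 8.0  # 8 bits per byte


def visible_bandwidth_gbs(stack_width_bits: int,
                          routable_wires: int,
                          gbps_per_wire: float) -> float:
    """Bandwidth the GPU can see: limited by whichever is narrower,
    the stack's interface or the wires the interposer can carry."""
    usable = min(stack_width_bits, routable_wires)
    return peak_bandwidth_gbs(usable, gbps_per_wire)


# Assumed example values (HBM2-class stack).
STACK_WIDTH_BITS = 1024
RATE_GBPS = 2.0

full = visible_bandwidth_gbs(STACK_WIDTH_BITS, 4096, RATE_GBPS)   # interposer not limiting
halved = visible_bandwidth_gbs(STACK_WIDTH_BITS, 512, RATE_GBPS)  # interposer routes half
print(f"full interface routed: {full} GB/s")    # 256.0 GB/s
print(f"half interface routed: {halved} GB/s")  # 128.0 GB/s
```

The `min` is the point of the example: wires the interposer cannot route are bandwidth the stack has but the GPU never receives.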