Senior Embedded Performance Engineer
Bristol or London, 3 days in the office, 2 days WFH
At Fractile, we’re building what we believe will be the world’s fastest AI inference chip from the ground up. We’re balanced across hardware and software engineering, and HW/SW co-design is real here. We move fast, and we help each other move fast. We care about each other, the software we ship, and the people who rely on it.
On the device, close to the metal, we write the runtime software that orchestrates work across the chip and runs performance-critical ML kernels. This is where performance gets real and the wins compound. Your work directly influences trade-offs for the silicon, system deployment, and the compiler.
You’ll drive the first accelerator compute runs: evaluating performance on silicon, running early benchmarks, and feeding results back into the hardware and software roadmap.
What you’ll do
Write and optimise performance-critical ML kernels in C, with assembly where it matters (RISC-V and our own ISA)
Build the low-level control paths that feed those kernels, including scheduling, synchronisation, and data movement
Write targeted validation workloads and microbenchmarks to keep simulation and hardware behaviour aligned and performance measurable
Profile, benchmark, and track regressions so performance improvements are real and repeatable
Work closely with simulation, hardware, ML, compiler, firmware, and runtime engineers in a tight loop, turning profiling data into architecture feedback and real performance wins
What we’re looking for
Proven experience in deeply embedded software
Strong performance instincts. You can reason about low-level architecture, memory behaviour, and where the cycles are spent
Excellent C, and a pragmatic approach to building high-quality, maintainable low-level code
Comfortable writing and debugging optimised assembly (RISC-V ideal)
Collaborative and high-ownership. You communicate clearly, move fast, and enjoy working through hard problems with others
A degree in Computer Science, Electronic Engineering, Maths, Physics, or a related field, and 3+ years of industry experience
Nice to have
Experience with GPUs or dedicated ML accelerators
Rust and/or Python experience
Experience with simulators (functional or performance) and writing validation or benchmarking workloads
Familiarity with modern ML inference workloads
If you want to build the software that turns cutting-edge hardware capability into real throughput and low latency, come build it with us.