Half-precision Inference Doubles On-Device Inference Performance

Posted by Marat Dukhan and Frank Barchard, Software Engineers

CPUs deliver the widest reach for ML inference and remain the default target for TensorFlow Lite. Consequently, improving CPU inference performance is a top priority, and we are excited to announce that we doubled floating-point inference performance in TensorFlow Lite's XNNPACK…
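For context, here is a minimal sketch of how half-precision inference can be opted into through the XNNPACK delegate's C++ API. It assumes the delegate's `TFLITE_XNNPACK_DELEGATE_FLAG_FORCE_FP16` flag and a model file named `model.tflite`; neither is taken from the excerpt above.

```c++
#include <memory>

#include "tensorflow/lite/delegates/xnnpack/xnnpack_delegate.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/interpreter_builder.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model_builder.h"

int main() {
  // Load a FP32 TFLite model (hypothetical file name for illustration).
  auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);

  // Configure the XNNPACK delegate, requesting FP16 inference where
  // supported (assumption: the FORCE_FP16 flag enables this path).
  TfLiteXNNPackDelegateOptions options = TfLiteXNNPackDelegateOptionsDefault();
  options.flags |= TFLITE_XNNPACK_DELEGATE_FLAG_FORCE_FP16;
  TfLiteDelegate* delegate = TfLiteXNNPackDelegateCreate(&options);

  // Hand the graph to the delegate and run inference as usual.
  interpreter->ModifyGraphWithDelegate(delegate);
  interpreter->AllocateTensors();
  interpreter->Invoke();

  // The delegate must outlive the interpreter; release it afterwards.
  interpreter.reset();
  TfLiteXNNPackDelegateDelete(delegate);
  return 0;
}
```

Inputs and outputs stay FP32 at the interpreter boundary; the reduced-precision arithmetic happens inside the delegated subgraphs.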
