TL;DR

Manticore has rebuilt its ONNX processing path, resulting in a 14× speed boost for generating embeddings. This development improves efficiency for AI workloads and highlights ongoing optimization efforts.

Manticore has implemented a major update to its ONNX processing pipeline, resulting in a 14-fold increase in embedding generation speed. This enhancement is confirmed by the company and is expected to significantly improve performance for large-scale AI applications using Manticore’s platform.

The update involved a complete redesign of the ONNX path within Manticore’s architecture. According to Manticore, the new implementation reduces latency and increases throughput, enabling faster embedding computations. The company claims that this improvement will benefit applications in search, recommendation systems, and other AI-driven services that rely heavily on embedding generation. Technical sources indicate that the overhaul included optimizing data flow, reducing bottlenecks, and leveraging more efficient hardware utilization. Manticore has not disclosed specific technical metrics beyond the 14× speed increase but emphasizes that the change is a core part of their ongoing performance enhancement strategy.

At a glance

When: announced March 2024

The development: Manticore has announced a significant overhaul of its ONNX integration, achieving a 14× increase in embedding generation speed.

Impact on AI Performance and Scalability

This development matters because it directly addresses the computational bottlenecks faced by AI applications that rely on embedding generation. A 14× speed increase means that large datasets can be processed more quickly, reducing latency and operational costs. For users, this translates into faster search results, more responsive recommendation engines, and improved scalability for AI services. It also demonstrates Manticore’s commitment to optimizing its platform to stay competitive in the rapidly evolving AI hardware and software landscape. Industry analysts suggest that such performance improvements could set new standards for open-source AI tools and accelerate adoption in enterprise settings.
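To make the 14× figure concrete, here is a small back-of-the-envelope calculation. Only the 14× factor comes from the announcement. The baseline throughput and corpus size are hypothetical values chosen for illustration, not Manticore measurements, and the sketch assumes the speedup holds uniformly across the whole job.

```python
def embedding_time_seconds(num_docs: int, docs_per_second: float) -> float:
    """Wall-clock time to embed num_docs at a steady throughput."""
    if docs_per_second <= 0:
        raise ValueError("docs_per_second must be positive")
    return num_docs / docs_per_second


SPEEDUP = 14.0                    # factor reported in the announcement
BASELINE_DOCS_PER_SECOND = 100.0  # hypothetical baseline, not a measured value
NUM_DOCS = 1_000_000              # hypothetical corpus size

before = embedding_time_seconds(NUM_DOCS, BASELINE_DOCS_PER_SECOND)
after = embedding_time_seconds(NUM_DOCS, BASELINE_DOCS_PER_SECOND * SPEEDUP)

# Report both durations in convenient units.
print(f"before: {before / 3600:.2f} h, after: {after / 60:.1f} min")
```

Under these assumed numbers, a job that took close to three hours would finish in roughly twelve minutes. Real gains depend on the model, batch size, and hardware.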

Background on Manticore and ONNX Optimization Efforts

Manticore is an open-source search and data analysis platform that integrates machine learning models, including embeddings, to enhance data retrieval and processing. Prior to this update, Manticore relied on ONNX (Open Neural Network Exchange) for model interoperability, but its performance was limited by the efficiency of its processing pipeline. Over the past year, there have been multiple industry-wide efforts to optimize ONNX-based workflows, driven by the growing demand for real-time AI inference. Manticore’s recent update aligns with these trends, aiming to improve throughput and reduce latency in embedding computations, which are critical for many AI applications. The company has previously announced incremental improvements, but the latest overhaul marks a significant leap forward.

“The redesign of our ONNX path has allowed us to achieve a 14× increase in embedding speed, marking a major milestone for our platform.”

— Manticore Engineering Team

Technical Details and Broader Compatibility Still Unclear

While Manticore has confirmed the 14× speed increase, detailed technical explanations of how the optimization was achieved remain limited. It is also unclear whether this improvement applies universally across all models and hardware configurations or is specific to certain setups. Additionally, the long-term stability and compatibility of the new ONNX pipeline are still to be evaluated through broader testing.
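Until official benchmarks are published, readers can check the claim on their own models and hardware with a simple throughput measurement. The sketch below is a generic timing harness, not Manticore's benchmark method. `measure_throughput` and `fake_embed` are hypothetical names, and `fake_embed` is a placeholder that should be swapped for a call into the system under test (run once against the old version and once against the new one).

```python
import time
from typing import Callable, List, Sequence


def measure_throughput(
    embed: Callable[[Sequence[str]], List[List[float]]],
    texts: Sequence[str],
    batch_size: int = 32,
    repeats: int = 3,
) -> float:
    """Return texts embedded per second, using the fastest of `repeats` runs."""
    if batch_size <= 0 or repeats <= 0:
        raise ValueError("batch_size and repeats must be positive")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Feed the texts to the embedder one batch at a time.
        for i in range(0, len(texts), batch_size):
            embed(texts[i : i + batch_size])
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very fast stand-ins.
    return len(texts) / max(best, 1e-9)


def fake_embed(batch: Sequence[str]) -> List[List[float]]:
    """Hypothetical stand-in: replace with a call into the system under test."""
    return [[float(len(text)), float(sum(map(ord, text)) % 97)] for text in batch]


sample_texts = [f"document number {n}" for n in range(1000)]
docs_per_second = measure_throughput(fake_embed, sample_texts)
print(f"{docs_per_second:.0f} docs/s")
```

Dividing the throughput measured after the update by the throughput measured before it gives the speedup for that specific model and machine, which may differ from the headline 14×.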

Next Steps Include Broader Testing and Community Adoption

Manticore plans to release detailed technical documentation and benchmarks in the coming weeks to validate and demonstrate the performance gains. The company also intends to gather feedback from early adopters and integrate further optimizations. Industry observers expect that this breakthrough could encourage other open-source projects to pursue similar enhancements, fostering a wave of performance improvements across AI tools.

Key Questions

What specific changes did Manticore make to achieve the speed increase?

Manticore has redesigned its ONNX processing pipeline, optimizing data flow, reducing bottlenecks, and leveraging hardware more efficiently. However, detailed technical specifics have not yet been publicly disclosed.

Does this speed improvement apply to all models and hardware?

It is not yet clear if the 14× speed boost applies universally across different models and hardware configurations. Further testing and detailed documentation are expected to clarify this.

How will this impact users of Manticore?

Users can expect faster embedding generation, leading to lower latency and operational costs, especially in large-scale AI applications like search and recommendation systems.

When will detailed benchmarks and technical explanations be available?

Manticore has announced plans to release further technical documentation and benchmarks in the coming weeks, which will provide more clarity on the performance gains.

Could this lead to broader industry changes?

Yes, if the performance improvements are validated and adopted widely, it could set new standards for open-source AI tools and influence future optimization efforts across the industry.

Source: hn
