Cambridge Unveils Quantum Natural Language Processing (QNLP) Toolkit
(By Daniel O’Shea) Cambridge Quantum is due to be combined with Honeywell’s quantum computing business in a new venture, and the value CQ is bringing to the table continues to increase.
The U.K. company announced Wednesday the release of what it claimed is the first toolkit and library for Quantum Natural Language Processing (QNLP), a technology discipline that CQ Chief Scientist Bob Coecke and fellow researchers have been pursuing for more than a decade.
Called lambeq, after the late mathematician and linguist Joachim Lambek, the new toolkit essentially can convert sentences into quantum circuits, introducing compositional meaning structure, or semantics, to QNLP. This is a major step beyond the traditional “bag-of-words” approach to working with NLP in which computers leverage word meaning without consideration of grammar, word order or sentence structure.
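To see what the "bag-of-words" limitation means in practice, here is a minimal sketch in plain Python. It does not use lambeq or any quantum library. The helper name `bag_of_words` and the two example sentences are illustrative assumptions, not anything from CQ's toolkit.

```python
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Count lowercase word occurrences, discarding word order and grammar.

    Hypothetical helper for illustration only; not part of lambeq.
    """
    return Counter(sentence.lower().split())


# Two sentences with opposite meanings but identical vocabulary.
a = "dog bites man"
b = "man bites dog"

# A bag-of-words model cannot tell them apart...
same_bag = bag_of_words(a) == bag_of_words(b)

# ...even though the word sequences clearly differ.
same_sequence = a.split() == b.split()

print(same_bag, same_sequence)
```

Both sentences collapse to the same word counts, so any model that sees only the bag loses who did what to whom. A compositional approach of the kind the article describes keeps that structure.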
With this advancement, CQ said it believes that lambeq can accelerate the development of practical, real-world QNLP applications, such as automated dialogue, text mining, language translation, text-to-speech, language generation and bioinformatics.
QNLP is also an area on which CQ itself has focused heavily over the last two years, sponsoring (along with IBM) the groundbreaking QNLP 2019 conference at the University of Oxford. Prior to this week’s announcement, a CQ team led by Coecke, senior scientist Dimitrios Kartsaklis and others last year demonstrated how natural language could be processed on noisy intermediate-scale quantum (NISQ) devices, a key precursor to establishing the new toolkit and to making QNLP a commercial reality.
This week’s announcement also follows CQ’s recent move to make its TKET quantum software development kit fully open source. The new toolkit will be able to work seamlessly with TKET, which already has hundreds of thousands of users worldwide, to provide QNLP developers with access to the broadest possible range of quantum computers, CQ says. As part of this week’s unveiling, CQ also released lambeq as fully open source.
Further developing QNLP and adapting it to real-world commercial applications should allow a range of companies to build on what is already rapidly expanding global demand for natural language processing. According to Fortune Business Insights, the global market for NLP is expected to grow from just under $21 billion this year to more than $127 billion by 2028, at a compound annual growth rate (CAGR) of more than 29% over that period.
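The growth figures can be sanity-checked with the standard compound annual growth rate formula. The sketch below is a rough check only: it assumes "this year" is 2021 (inferred from the October 2021 arXiv identifier cited in this article), giving a seven-year span, and it rounds the endpoints to $21 billion and $127 billion.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Assumed inputs: rounded endpoints in billions of dollars,
# and a 2021 -> 2028 span (the start year is an assumption).
start_billion = 21.0
end_billion = 127.0
span_years = 2028 - 2021

rate = cagr(start_billion, end_billion, span_years)
print(f"Implied CAGR: {rate:.1%}")
```

Under those assumptions the implied rate comes out at a little over 29%, which is consistent with the figure attributed to Fortune Business Insights.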
As with many quantum computing innovations, it could be some years before the commercial implications of QNLP are broadly felt as quantum computers continue to gain qubit processing power and evolve fault tolerance capabilities. But CQ has taken significant enabling steps toward that eventuality, steps that also are well-timed to the evolution of AI. The introduction of lambeq should help eliminate barriers to entry for practitioners and researchers who are focused on AI and human-machine interactions, potentially one of the most significant applications of quantum technologies, CQ noted. The company also said QNLP could be applicable to the analysis of symbol sequences that arise in genomics and proteomics.
“In various papers published over the course of the past year,” Coecke said in a CQ statement, “we have not only provided details on how quantum computers can enhance NLP but also demonstrated that QNLP is ‘quantum native,’ meaning the compositional structure governing language is mathematically the same as that governing quantum systems. This will ultimately move the world away from the current paradigm of AI that relies on brute force techniques that are opaque and approximate.”
Medicine in general represents a key area of application for QNLP. Merck Group, a launch partner and early adopter of lambeq, recently published a research paper on how its QNLP experiments with the Technical University of Munich proved that “binary classification tasks for sentences using QNLP techniques can achieve results comparable even at this stage to existing classical methods,” said Thomas Ehmer from Merck’s IT Healthcare Innovation Incubator and co-founder of the Quantum Computing Interest Group. “Clearly, the infrastructure around quantum computing will need to advance before these techniques can be employed commercially. Critically, we can see how the approach employed in QNLP opens the route towards explainable AI, and thus to more accurate intelligence that is also accountable – which is critical in medicine.”
CQ’s Kartsaklis added, “There is a lot of interesting theoretical work on QNLP, but theory usually stands at some distance from practice. With lambeq, we give researchers the opportunity to gain hands-on experience on experimental aspects of QNLP, which is currently completely unexplored ground. This is a crucial step towards reaching the point where practical, real-world NLP applications on quantum hardware become a reality.”
The lambeq toolkit has been released as a conventional Python repository on GitHub and is available here: https://github.com/CQCL/lambeq. The quantum circuits generated by lambeq have thus far been executed on IBM quantum computers and Honeywell Quantum Solutions’ H series devices, CQ said. The toolkit is introduced in a technical report uploaded to arXiv, available here: https://arxiv.org/abs/2110.04236.