Machine learning tames huge data sets

LOS ALAMOS, N.M., Sept. 11, 2023 — A machine-learning algorithm demonstrated the ability to process data that exceeds a computer's available memory by identifying a massive data set's key features and dividing them into manageable batches that don't choke computer hardware. Developed at Los Alamos National Laboratory, the algorithm set a world record for factorizing huge data sets during a test run on Oak Ridge National Laboratory's Summit, the world's fifth-fastest supercomputer.

Equally efficient on laptops and supercomputers, the highly scalable algorithm solves hardware bottlenecks that prevent processing information from data-rich applications in cancer research, satellite imagery, social media networks, national security science and earthquake research, to name just a few.

"We developed an 'out-of-memory' implementation of the non-negative matrix factorization method that allows you to factorize larger data sets than previously possible on a given hardware," said Ismael Boureima, a computational physicist at Los Alamos National Laboratory. Boureima is first author of the paper in The Journal of Supercomputing on the record-breaking algorithm. "Our implementation simply breaks down the big data into smaller units that can be processed with the available resources. Consequently, it's a useful tool for keeping up with exponentially growing data sets."

"Traditional data analysis demands that data fit within memory constraints. Our approach challenges this notion," said Manish Bhattarai, a machine-learning scientist at Los Alamos and co-author of the paper. "We have introduced an out-of-memory solution. When the data volume exceeds the available memory, our algorithm breaks it down into smaller segments. It processes these segments one at a time, cycling them in and out of the memory.
This approach equips us with the unique ability to manage and analyze extremely large data sets efficiently."

The distributed algorithm for modern and heterogeneous high-performance computer systems can be useful on hardware as small as a desktop computer, or as large and complex as Chicoma, Summit or the upcoming Venado supercomputers, Boureima said.

"The question is no longer whether it is possible to factorize a bigger matrix, rather how long is the factorization going to take," Boureima said.

The Los Alamos implementation takes advantage of hardware features such as GPUs to accelerate computation and fast interconnect to efficiently move data between computers. At the same time, the algorithm efficiently gets multiple tasks done simultaneously.

Non-negative matrix factorization is another installment of the high-performance algorithms developed under the SmartTensors project at Los Alamos.

In machine learning, non-negative matrix factorization can be used as a form of unsupervised learning to pull meaning from data, Boureima said. "That's very important for machine learning and data analytics because the algorithm can identify explainable latent features in the data that have a particular meaning to the user."

The record-breaking run

In the record-breaking run by the Los Alamos team, the algorithm processed a 340-terabyte dense matrix and an 11-exabyte sparse matrix, using 25,000 GPUs.
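The paper details a distributed CPU/GPU implementation; as a rough, illustrative sketch of the out-of-memory idea described above (not the authors' actual code, and with all names invented here), the snippet below factorizes a tall non-negative matrix X ≈ WH using standard Lee-Seung multiplicative updates while streaming row blocks of X through memory one at a time:

```python
import numpy as np

def out_of_core_nmf(chunks, n_cols, rank, n_iter=200, seed=0):
    """Illustrative sketch: factorize a tall matrix X ~= W @ H without ever
    holding all of X in memory. `chunks` is a callable returning a fresh
    iterator over row blocks of X (e.g., loaded from disk one at a time)."""
    rng = np.random.default_rng(seed)
    eps = 1e-9
    H = rng.random((rank, n_cols)) + 1e-3
    # One block of W per block of X, matching that block's rows.
    W_blocks = [rng.random((c.shape[0], rank)) + 1e-3 for c in chunks()]
    for _ in range(n_iter):
        num = np.zeros_like(H)
        den = np.zeros_like(H)
        for i, X in enumerate(chunks()):   # cycle blocks in and out of memory
            W = W_blocks[i]
            # Multiplicative update for this block of W, given the shared H.
            W *= (X @ H.T) / (W @ (H @ H.T) + eps)
            # Accumulate the H update across blocks:
            # W^T X and (W^T W) H both sum over row blocks.
            num += W.T @ X
            den += (W.T @ W) @ H
        H *= num / (den + eps)
    return W_blocks, H
```

With `chunks` backed by disk reads (or blocks shipped between GPUs, as in the distributed setting the article describes), only one block of X and the small rank-by-columns factor H need to reside in memory at any moment, which is what lets the factorization scale past a single machine's memory.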
"We're reaching exabyte factorization, which nobody else has done, to our knowledge," said Boian Alexandrov, a co-author of the new paper and a theoretical physicist at Los Alamos who led the team that developed the SmartTensors artificial intelligence platform.

Decomposing or factoring data is a specialized data-mining technique aimed at extracting pertinent information and simplifying the data into understandable formats.

Bhattarai further emphasized the scalability of their algorithm, remarking, "In contrast, conventional methods often grapple with bottlenecks, mainly due to the lag in data transfer between a computer's processors and its memory."

"We also showed you don't necessarily need big computers," Boureima said. "Scaling to 25,000 GPUs is great if you can afford it, but our algorithm will be useful on desktop computers for something you couldn't process before."

The paper: "Distributed Out-of-Memory NMF on CPU/GPU Architectures," The Journal of Supercomputing. DOI: 10.1007/s11227-023-05587-4

The funding: This research was funded by DNN R&D and by the Laboratory Directed Research and Development program at Los Alamos National Laboratory.

LA-UR-23-29923
