Recently, ARC-TS and the School of Information used the new Big Data platform Cavium ThunderX to reduce run times of the school’s analytics pipeline by a factor of 10, taking overall run time from almost a week to only a few hours.
Cavium ThunderX is a free-to-use Big Data platform for researchers at U-M. ThunderX currently supports MapReduce, Spark, Python, Hive, RMR, R, and other common Hadoop ecosystem tools. Cavium is made available through generous alumni support and the Data Science Initiative.
The current Cavium ThunderX configuration has 3 petabytes of storage, 4,800 ARM cores, and 25 terabytes of main memory on a 40/100 Gbps network. Training and consulting are available for those looking to take their first steps into Big Data.
To get started, contact ARC-TS at firstname.lastname@example.org.