
Deploying AI Data Infrastructure in the Datacenter with Ariel Pisetzky of Taboola | Utilizing Tech 07×08

As practical applications of AI are rolled out, they are increasingly being deployed on-premises at scale. We are wrapping up this season of Utilizing Tech with Solidigm, focused on AI Data Infrastructure, by discussing practical deployment considerations with Ariel Pisetzky, VP of Information Technology and Cyber at Taboola, in a discussion with Jeniece Wnorowski and Stephen Foskett. Companies like Taboola are built on data and have been deploying AI-driven applications for years. Generative AI brings new capabilities but is part of a spectrum of solutions that leverage data to produce results for customers. As applications mature, many companies are looking to bring them back on-premises, and this trend will likely accelerate given the cost of AI infrastructure as-a-service offerings. Owned infrastructure can also deliver beyond expected lifespans, representing a potential windfall for businesses that can continue to use depreciated hardware. This is especially true of large flash drives, which have proven much more reliable than initially predicted. Although it is tempting to buy the biggest, fastest infrastructure to extend the lifespan of equipment, Pisetzky recommends focusing on equipment that is flexible and can be re-purposed in other ways in the future. Server storage is unique in that it is easy to upgrade and replace in place, even by hot-swapping drives, and large drives have a very long lifespan.



AI Recommendation System: Getting behind the Scenes with Taboola

One of the major trends coming out of the AI industry is algorithmic curation. 80% of what viewers watch on OTT platforms, or what readers read on news apps, is found through the platforms’ recommendation systems.

A Personal Recommender

An AI recommendation system executes on the same idea that content-based recommendation is built on. It looks at users’ interests and activities over a period of time and produces rows of personalized, hyper-specific recommendations to help them find the next thing they may like.

Publishing houses, streaming platforms, retail e-stores and video sharing websites all leverage integrated recommendation engines on their platforms. These algorithms process billions of user profiles, identifying their tastes and making content recommendations every day.

This season of Utilizing Tech wraps up with a conversation about AI recommendation systems with Taboola and Solidigm. Taboola runs an advertising platform that makes content recommendations to billions of active users daily.

But Taboola is not a name the average Internet user is familiar with. “We’re not a consumer-facing product. So many people might use our products on a daily basis, but not be aware of it,” says Ariel Pisetzky, VP of Information Technology and Cyber.

“We are a content discovery platform. That means that we reside on many of the publishers that you read on a daily basis and love and receive content from.”

Taboola works in the lower layers of applications, helping publishers and advertisers match their content closely with what customers are looking for.

The platform serves 4 billion webpages, recommending upwards of 40 billion articles and stories to users on those platforms every day. “We provide content recommendations for the next thing that you can read, bringing that content to you without actually knowing who you are, without you logging into our service, and without you providing us any specifics about yourself.”

Taboola does this by matching the article text to readers’ interests and reading habits. The recommendations help readers discover content they may not have chosen initially.

The LLMs behind this use deep learning and natural language processing to sift through nuanced threads underlying the content, and fit them into taste groups.

“You need to understand as a service provider for publishers, how to recognize the article that we reside on, how to identify the users coming into that specific article, the relevant content, and where that user browsing arc is going to end,” he says.

Taboola, like the rest of the industry, leverages artificial intelligence for this. “We have been taking advantage of LLMs and different generative AI technologies to provide additional tools for editors and advertisers to curate article names, the article itself, and imagery that you might get on your beloved websites.”

On-Prem, the Smarter Option for AI

For Taboola, all of that behind-the-scenes data crunching happens at a private data center.

“Storing these vast amounts of data and processing them and creating value out of them is something that when you own the data, you have a lot of advantages, over putting it somewhere in the cloud,” comments Pisetzky.

At private data centers, there is tremendous opportunity to tune and optimize the infrastructure narrowly for the job at hand.

“Owning all of the compute, data storage and networking has proven to be extremely advantageous for training and inferencing,” he adds.

For instance, Taboola has been able to draw much more performance out of its CPUs than they offer out of the box, using simple optimization techniques such as updating code from the vendor libraries.

Even more can be done. “You have multiple layers of optimization – using CPUs in higher capacity, making sure that all of your GPUs and CPUs are fully utilized.”

These optimizations not only speed up computation but also significantly reduce the total cost of ownership (TCO).

For a slimmer footprint, Taboola uses NVMe drives. “Today the NVMe interface and SSDs provide so much performance, and when you understand the geometry of the drives and start to think about the specific types of space with different drive geometries that fit in different places, and optimize that for your read size, you suddenly get this boost of performance where you can do so much more with your on-prem hardware and investment in CapEx.”

Pisetzky highlights the value Solidigm SSDs bring to this infrastructure. Solidigm drives, besides being highly reliable, also promise great value for money.

“The drives do not fail beyond the expected MTBF. You’re getting a good bargain for drives that maintain their value beyond their three-year depreciation,” he says.

A big reason for that is the high capacity that some of Solidigm’s drives pack. The more terabytes a drive has, the more area it has to spread out the write operations, making it less prone to failure from constant activity.
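The capacity argument can be put into rough numbers. The sketch below is illustrative only: it assumes writes are spread evenly across the whole drive, that both drives share the same rated program/erase (P/E) cycle count, and that the workload is identical. The function name and every figure in it are hypothetical, not Solidigm specifications or Taboola measurements.

```python
# Rough endurance arithmetic. Every number below is an illustrative assumption,
# not a Solidigm specification or a Taboola measurement.

def years_of_write_life(capacity_tb, pe_cycles, daily_writes_tb, write_amplification=1.0):
    """Estimate years until the rated program/erase budget is used up."""
    # Total data the flash can absorb if writes are spread evenly over all cells.
    total_write_budget_tb = capacity_tb * pe_cycles
    # Writes that actually reach the flash after internal write amplification.
    nand_writes_per_day_tb = daily_writes_tb * write_amplification
    return total_write_budget_tb / (nand_writes_per_day_tb * 365)

# Same hypothetical workload (2 TB of host writes per day) on two drive sizes.
small = years_of_write_life(capacity_tb=4, pe_cycles=1000,
                            daily_writes_tb=2, write_amplification=2.0)
large = years_of_write_life(capacity_tb=32, pe_cycles=1000,
                            daily_writes_tb=2, write_amplification=2.0)

print(f"4 TB drive:  {small:.1f} years")
print(f"32 TB drive: {large:.1f} years")
```

Under these assumptions the estimated write life scales linearly with capacity, so the drive with eight times the capacity absorbs the same workload for eight times as long. Real drives differ in flash type, rated endurance and write amplification, so the vendor's published endurance rating is the figure to plan around.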

With AI workloads making CPUs work harder, a good storage solution is all the more important to get high bandwidth and more life out of the drives.

“The beautiful thing with Solidigm drives is the connection to the OS and the tooling that it provides. We can manage the drives remotely through the OS providing us with all the serial numbers and asset management information needed to do this in a responsible way.”

At the edge where infrastructure is limited, inferencing can be a difficult prospect if enterprises do not have servers at the ready in the front-end edge data centers. Having a stack handy allows them to ship out servers on demand to the back-end centers. This way, “if you have a data center automation stack, you can provide storage, where you need it, when you need it.”

Pisetzky advises organizations to focus on having a balanced infrastructure rather than rushing to acquire the most expensive equipment on the shelf. One of the advantages on-premises data centers bring to the table is the flexibility to right-size the infrastructure to a T.

“We really love to see how year-over-year, we optimize our use cases for storage, for CPUs and for network, and bring them together to a place where our developers now have so much raw power at their fingertips that they just do not want to go to the cloud for many of the day-to-day operations.”

The rising electricity demand in data centers is another reason to keep the footprint to a minimum and opt for components that are designed to be energy-efficient.

“When you’re in the cloud, you get the great carbon footprint of the clouds that are in renewables, you get wonderful e-waste management and so on and so forth. When you are on-prem, you need to control your own destiny,” Pisetzky notes.

While there is no way to cap the dizzying amount of power that accelerators burn, there is certainly an opportunity to bring overall consumption down with the energy-efficient storage that underpins the system.

Talking about the specific advantages of Solidigm high-density drives, he says, “When you look at the larger capacities that still provide amazing performance in terms of the level of IOPS, their thermal footprint doesn’t warm up your data center and their energy levels are in use only when you are at full write mode.”

Special thanks to Solidigm for sponsoring this season of Utilizing Tech, and to Taboola for joining the discussion.

Find more about Taboola at their engineering blog. To check out Solidigm’s SSD portfolio, head over to their website, or catch their presentations from the past AI Field Day event. Keep your eyes peeled for the next episode of Utilizing Tech, coming soon on your favorite podcasting app.


Podcast Information:

Stephen Foskett is the Organizer of the Tech Field Day Event Series and President of the Tech Field Day Business Unit, now part of The Futurum Group. Connect with Stephen on LinkedIn or on X/Twitter and read more on the Gestalt IT website.

Jeniece Wnorowski is the Datacenter Product Marketing Manager at Solidigm. You can connect with Jeniece on LinkedIn and learn more about Solidigm and their AI efforts on their dedicated AI landing page or watch their AI Field Day presentations from the recent event.

Ariel Pisetzky is the VP of Information Technology and Cyber at Taboola. You can connect with Ariel on LinkedIn. Learn more about Taboola by heading to their website.



Thank you for listening to Season 7 of Utilizing Tech, focusing on AI Data Infrastructure. If you enjoyed this discussion, please subscribe in your favorite podcast application and consider leaving us a rating and a nice review on Apple Podcasts or Spotify. This podcast was brought to you by Solidigm and by Tech Field Day, now part of The Futurum Group. For show notes and more episodes, head to our dedicated Utilizing Tech website or find us on X/Twitter and Mastodon at Utilizing Tech.

About the author

Sulagna Saha

Sulagna Saha is a writer at Gestalt IT where she covers all the latest in enterprise IT. She has written widely on miscellaneous topics. On gestaltit.com she writes about the hottest technologies in Cloud, AI, Security and sundry.

A writer by day and reader by night, Sulagna can be found busy with a book or browsing through a bookstore in her free time. She also likes cooking fancy things on leisurely weekends. Traveling and movies are other things high on her list of passions. Sulagna works out of the Gestalt IT office in Hudson, Ohio.
