Bridging Blockchain and Data Science: An Inside Look at an Educational Guide

By Alamira Jouman Hajjar - Sr. Research & Editorial Manager

Blockchain technology is becoming an increasingly important area for data scientists — but for those who are unfamiliar with the industry, learning the ropes can be daunting.

That’s why RootstockLabs data lead Gabriela Castillo Areco has written a brand-new book, published by Packt Publishing, that’s designed to help people in the field who may be unfamiliar with how distributed ledgers and cryptocurrencies work.

Get a copy of the book here.

Challenges of self-education in data science

Gabriela says her own journey helped her realize there isn’t a single text that educates data scientists from the very beginning.

She believes the most common challenges are:

  • A lack of structured learning, because no systematic educational resource exists
  • The scattered nature of the available resources
  • Tutorials limited by outdated information and missing foundational explanations
  • The need for a centralized, comprehensive source of learning

“There were many resources scattered along the internet but without a clear data science program attached to it,” she explains. “Valuable resources were spread in blogs, tweets, podcasts, YouTube videos… which makes all this very overwhelming.”

And while there are how-to tutorials out there, Gabriela argues that they often fail to cover the rationale behind each task. This means that, whenever an ecosystem gets upgraded, the resource quickly falls out of date and leaves the data scientist more confused than before.

How the book helps

The book begins by focusing on data structure, gathering, and analytics — before examining the machine learning applications that are common in the ecosystem.

The author goes on to explain: “The book takes the pedagogical approach that common data science programs follow and applies it to Web3 data. This way the student learns the basic concepts of data science applied to Web3 data. At the end of each chapter, I also complement with many traditional data science reference resources for the reader to continue their journey.”

Data Science for Web3 aims to offer a solid understanding of the essentials when it comes to blockchain technology — including how transactions work, how blocks are added to networks, and the digital assets at the beating heart of this infrastructure. Her book goes on to show how data scientists can identify and leverage reliable sources of on-chain and off-chain information, as well as how to build datasets specifically related to non-fungible tokens and decentralized finance protocols.

Who this book is for

Complete with step-by-step instructions, Gabriela’s work is designed to help readers navigate the complexities of this fast-moving landscape, with unparalleled insights into the tools and strategies needed to be a successful data scientist in a Web3 world. Exclusive interviews with industry leaders from Dune Analytics, Dragonfly, Flipside Analytics, and Sovryn shed light on the approaches and strategies they deploy when working with blockchain data — and there are also practical tips on what to expect when applying for jobs in the space.

The book, as the author explains, is divided into two parts: the first part is for data and business analysts to understand and extract insights from the data landscape, while the second part is for data scientists focusing on Web3 applications and solving typical use cases within the ecosystem. These use cases are explored in detail to provide a comprehensive evaluation of the results. 

Gabriela’s goal was to explain the basic concepts of data science, applied to Web3. “At the end of each chapter I also offer traditional data science reference resources for the reader to continue their journey,” she adds.

The future of data science in Web3 

The author believes that there are exciting challenges facing data scientists in Web3, and the blockchain economy remains ripe for expansion. “There is still a lot of value to be unlocked,” she says. “Data has been historically used for forensic purposes — and with the eruption of platforms like Dune Analytics, Footprint Analytics, Covalent and Flipside Analytics, it’s become really easy to get our hands on this data.” 

Looking ahead, Gabriela believes that seamless wallets will soon onboard many users — and this will lead to a wealth of new data to benefit analysts. She also predicts this could unlock powerful use cases when it comes to lending and borrowing. Her past blog posts have set out the argument for Web3 credit scoring that would be based on a multitude of factors — including the age of an address, the total assets held in a wallet, and whether a user has fallen victim to liquidation events in the past.

At the time, she wrote that democratizing access to credit data would boost transparency, increase financial inclusion by dismantling geographical boundaries, and enhance trust across the ecosystem.
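To make the credit-scoring idea concrete, here is a minimal Python sketch that combines the three factors mentioned above: the age of an address, the total assets held, and past liquidation events. The class name, function name, weights, and caps are all hypothetical and chosen only for illustration. They are not taken from Gabriela's blog posts or book, and a real scoring model would need to be fitted and validated on actual on-chain data.

```python
from dataclasses import dataclass


@dataclass
class WalletProfile:
    """Hypothetical summary of one address; field names are assumptions."""
    address_age_days: int
    total_assets_usd: float
    past_liquidations: int


def toy_credit_score(profile: WalletProfile) -> float:
    """Return an illustrative score between 0 and 100.

    All weights and caps below are arbitrary assumptions, not a real model.
    """
    # Older addresses earn up to 40 points, capped at two years of history.
    age_part = min(profile.address_age_days / 730, 1.0) * 40
    # Larger balances earn up to 40 points, capped at 10,000 USD.
    asset_part = min(profile.total_assets_usd / 10_000, 1.0) * 40
    # Each past liquidation costs 10 points, up to a 20-point penalty.
    penalty = min(profile.past_liquidations * 10, 20)
    # Every address starts from a base of 20 points.
    return max(0.0, 20 + age_part + asset_part - penalty)


if __name__ == "__main__":
    example = WalletProfile(address_age_days=365,
                            total_assets_usd=5_000,
                            past_liquidations=1)
    print(toy_credit_score(example))
```

A simple additive score like this is easy to inspect, which matters when the goal is transparency. In practice the factors would more likely feed a trained model than a hand-weighted formula.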

Data science would play a starring role in making this a reality — and help DeFi protocols make a concerted shift away from overcollateralized lending, where the security that a borrower provides exceeds the value of their loan, to the undercollateralized products more commonly seen in the traditional finance landscape.
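The difference between the two lending models comes down to the ratio of collateral value to loan value. The short Python sketch below illustrates that definition. The function names and the example amounts are hypothetical, and real DeFi protocols set their own minimum ratios and liquidation rules.

```python
def collateral_ratio(collateral_value: float, loan_value: float) -> float:
    """Return collateral value divided by loan value."""
    if loan_value <= 0:
        raise ValueError("loan_value must be positive")
    return collateral_value / loan_value


def is_overcollateralized(collateral_value: float, loan_value: float) -> bool:
    """A loan is overcollateralized when the collateral is worth more than the loan."""
    return collateral_ratio(collateral_value, loan_value) > 1.0


if __name__ == "__main__":
    # Hypothetical amounts: 150 of collateral against a loan of 100.
    print(collateral_ratio(150, 100))
    print(is_overcollateralized(150, 100))
    # Hypothetical undercollateralized case: 80 of collateral against 100.
    print(is_overcollateralized(80, 100))
```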

Gabriela says this book is geared toward data and business analysts who are unfamiliar with Web3 but looking to understand the landscape.

“I think specialists can expect to be applying their knowledge in new use cases as the data grows exponentially and that facilitates the usage of more complex models,” she adds.

Get Gabriela’s book, Data Science for Web3, here.

About the author

Gabriela Castillo Areco is data lead at RootstockLabs, a core contributor to Rootstock, the first Bitcoin sidechain. She holds an M.Sc. in big data science from the TECNUM School of Engineering, University of Navarra.

With extensive experience in both the business and data facets of blockchain technology, Gabriela has undertaken roles as a data scientist, machine learning analyst, and blockchain consultant in both large corporations and small ventures.

She served as a professor of new crypto businesses at Torcuato Di Tella University and collaborated with Crypto Resources academy in building their “Learn to Earn” segment.

Connect with Gabriela on X or LinkedIn.