In the
realm of data storage and processing, there are two major technologies that
we deal with every day. On one side, we have relational data stored
in SQL Server; on the other, non-relational or very large datasets that
do not fit the relational model and are stored on big data
clusters such as Hadoop or Spark. Combining datasets across these two
technologies is challenging. SQL Server was never built
to process huge datasets in a distributed fashion or to handle non-relational
data very well, meaning that in many cases you would have to resort to
bringing your relational data into Hadoop or Spark clusters. SQL Server 2019
has the answer with Big Data Clusters: it combines SQL Server with HDFS and
Spark! In this session we are going to explore the capabilities of this
exciting new feature: how does it work, and how can we work with
non-relational datasets? |