Downloading. Get Spark from the downloads page of the project website. This documentation is for Spark version 3.5.5. Spark uses Hadoop’s client libraries for HDFS and YARN.

Spark’s primary abstraction is a distributed collection of items called a Dataset. Datasets can be created from Hadoop InputFormats (such as HDFS files) or by transforming other Datasets.
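
The two ways of obtaining a Dataset described above can be sketched in PySpark, where a Dataset is exposed as a DataFrame of rows. This is a minimal illustration, not the documentation's own example: the helper names `load_lines`, `lines_containing`, and `demo`, the application name, and the local master setting are all assumptions made for the sketch, and it presumes PySpark is installed (for example via `pip install pyspark`).

```python
def load_lines(spark, path):
    """Create a Dataset from a file: each line of text becomes one row.

    `spark` is an existing SparkSession; `path` may be a local path or an
    HDFS URI. The resulting DataFrame has a single string column, "value".
    """
    return spark.read.text(path)


def lines_containing(lines, word):
    """Create a new Dataset by transforming an existing one.

    Keeps only the rows whose "value" column contains `word`.
    """
    return lines.filter(lines.value.contains(word))


def demo(path):
    """Hypothetical driver: build a local session, then count lines."""
    # Imported here so the helpers above can be defined without Spark present.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # assumption: run locally on one core
        .appName("DatasetSketch")    # assumption: arbitrary application name
        .getOrCreate()
    )
    try:
        lines = load_lines(spark, path)
        matching = lines_containing(lines, "Spark")
        return lines.count(), matching.count()
    finally:
        spark.stop()
```

Calling `demo("README.md")` with a path to any text file would return the total number of lines and the number of lines that mention "Spark". The filter is only a description of the new Dataset until `count()` is called, at which point Spark actually reads the file.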