Other Projects
One benefit of Spark's vibrant open-source community is continued innovation that extends Spark's capabilities; many of these extensions originated in UC Berkeley's AMPLab. Here is a sampling of ongoing community projects that are still in alpha:
BlinkDB
An approximate query engine for interactive SQL queries in Shark that lets users trade off query accuracy for response time. It enables interactive queries over massive data sets by running them on data samples and presenting results annotated with meaningful error bars.
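The accuracy-for-time trade-off described above can be illustrated with a small, self-contained sketch. This is not BlinkDB's API or its actual sampling strategy; it is plain Python, and the function name `approximate_mean`, its parameters, and the synthetic data are all assumptions made for illustration. It estimates an average from a uniform random sample and reports a normal-approximation 95% error bar.

```python
import math
import random


def approximate_mean(values, sample_size, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, margin). With z = 1.96, estimate +/- margin is an
    approximate 95% confidence interval (normal approximation, ignoring
    the finite-population correction).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance, then the standard error of the mean.
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    margin = z * math.sqrt(variance / n)
    return mean, margin


# Hypothetical data set: 100,000 values drawn uniformly from [0, 100).
population_rng = random.Random(42)
data = [population_rng.uniform(0, 100) for _ in range(100_000)]

# A small sample is cheap to scan but has a wide error bar;
# a larger sample costs more but narrows it.
small = approximate_mean(data, 100)
large = approximate_mean(data, 10_000)

print("n=100:    %.2f +/- %.2f" % small)
print("n=10000:  %.2f +/- %.2f" % large)
```

Because the error bar shrinks roughly with the square root of the sample size, a query that can tolerate a wider interval needs to read far less data, which is the kind of trade a user would be making.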
SparkR
A package for the R statistical language that provides a lightweight frontend for using Apache Spark from R. It exposes the Spark API through the RDD class and lets users interactively run jobs on a cluster from the R shell.