package app
Type Members
- case class AllApps(apps: Seq[String]) extends Product with Serializable
- trait BaseEnv extends Env
Environment which provides a base path into which the application can write its data. Unless overridden, paths will be of the form {uri}/data/{environment}/{project}/{branch}, where environment is the logical environment (e.g. dev, test), project is the name of the application and branch is the Git branch.
N.B. when environment is 'prod', the branch is omitted from the path as we assume it will always be master.
e.g. hdfs:///data/dev/my_project/feature_abc, hdfs:///data/prod/my_project
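The documented path convention can be sketched as a small helper. This is purely an illustration of the rules above (branch omitted when environment is 'prod'); basePath is a hypothetical name, not part of BaseEnv's actual API.

```scala
// Hypothetical helper illustrating the documented path convention;
// BaseEnv's real members may differ.
def basePath(uri: String, environment: String, project: String, branch: String): String =
  if (environment == "prod") s"$uri/data/$environment/$project"
  else s"$uri/data/$environment/$project/$branch"
```

With uri = "hdfs://", this reproduces the two example paths from the description above.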
- trait Env extends Logging
Environment defining a sandbox in which an application can write.
- case class EnvironmentAction(action: String, appClassName: String) extends Product with Serializable
- trait HiveEnv extends BaseEnv
Environment which provides databases. By default, there will be a single database of the form {environment}_{project}_{branch}, where environment is the logical environment (e.g. dev, test), project is the name of the application and branch is the Git branch.
N.B. when environment is 'prod', the branch is omitted from the database name as we assume it will always be master.
e.g. dev_my_project_feature_abc, prod_my_project
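The database naming convention mirrors the path convention on BaseEnv and can be sketched the same way; databaseName is a hypothetical helper, not part of HiveEnv's actual API.

```scala
// Hypothetical helper illustrating the documented database naming
// convention; HiveEnv's real members may differ.
def databaseName(environment: String, project: String, branch: String): String =
  if (environment == "prod") s"${environment}_$project"
  else s"${environment}_${project}_$branch"
```

This reproduces the two example database names from the description above.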
- case class SingleAppConfig(appClassName: String, dependencies: Seq[String] = Nil) extends Product with Serializable
- abstract class SparkApp[E <: Env] extends AnyRef
During the development lifecycle of Spark applications, it is useful to create sandbox environments comprising paths, Hive databases etc. which are tied to specific logical environments (e.g. dev, test, prod) and feature development (i.e. Git branches). For example, when working on a feature called new_feature for a project called my_project, the application should write its data to paths under /data/dev/my_project/new_feature/ and create tables in a database called dev_my_project_new_feature. The actual form of these environments can be defined by extending Env or one of its subclasses; the final implementation should be a case class whose values (i.e. env, branch etc.) define the environment.
This is a generic Spark application which uses an implementation of Env to generate application-specific configuration and subsequently parse this configuration into a case class to be used for the application logic.
- E
the type of the Env implementation (must be a case class)
- abstract class WaimakApp[E <: Env with WaimakEnv] extends SparkApp[E]
This is a SparkApp specifically for applications using Waimak.
- trait WaimakEnv extends AnyRef
Trait for defining Waimak-app-specific configuration.
Value Members
- object EnvironmentManager
Performs create and cleanup operations for the Env implementation used by a provided implementation of SparkApp. The following configuration values should be present in the SparkSession:
spark.waimak.environment.appClassName: the application class to use (must extend SparkApp)
spark.waimak.environment.action: the environment action to perform (create or cleanup)
The Env implementation expects configuration values prefixed with spark.waimak.environment.
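The two documented keys map naturally onto the EnvironmentAction case class from this package. The sketch below shows one way the keys could be read from a plain configuration map; readEnvironmentAction is a hypothetical helper (and the case class is redeclared locally for self-containment), not EnvironmentManager's actual API.

```scala
// Mirrors the EnvironmentAction case class defined in this package.
case class EnvironmentAction(action: String, appClassName: String)

// Hypothetical helper: reads the two documented keys from a config map.
def readEnvironmentAction(conf: Map[String, String]): EnvironmentAction =
  EnvironmentAction(
    action = conf("spark.waimak.environment.action"),            // "create" or "cleanup"
    appClassName = conf("spark.waimak.environment.appClassName") // must extend SparkApp
  )
```

In practice these values would come from the SparkSession's configuration rather than a literal Map.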
- object MultiAppRunner
Allows multiple Spark applications to be run in a single main method whilst obeying configured dependency constraints. The following configuration values should be present in the SparkSession:
spark.waimak.apprunner.apps: a comma-delimited list of the names (identifiers) of all of the applications being run (e.g. myapp1,myapp2)
spark.waimak.apprunner.{appname}.appClassName: for each application, the application class to use (must extend SparkApp) (e.g. spark.waimak.apprunner.myapp1.appClassName = com.example.MyWaimakApp)
spark.waimak.apprunner.{appname}.dependencies: for each application, an optional comma-delimited list of dependencies. If omitted, the application will have no dependencies and will not wait for other apps to finish before starting execution. Dependencies must match the names provided in spark.waimak.apprunner.apps (e.g. spark.waimak.apprunner.myapp1.dependencies = myapp2)
The Env implementation used by the provided SparkApp implementation expects configuration values prefixed with: spark.waimak.environment.{appname}.
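The apprunner keys above map onto the AllApps and SingleAppConfig case classes from this package. The sketch below shows one way the keys could be parsed from a plain configuration map; parseApps is a hypothetical helper (with the case classes redeclared locally for self-containment), not MultiAppRunner's actual API.

```scala
// Mirror the case classes defined in this package.
case class AllApps(apps: Seq[String])
case class SingleAppConfig(appClassName: String, dependencies: Seq[String] = Nil)

// Hypothetical helper: resolves per-app configuration from the documented keys.
def parseApps(conf: Map[String, String]): Seq[SingleAppConfig] = {
  val all = AllApps(conf("spark.waimak.apprunner.apps").split(",").toSeq)
  all.apps.map { name =>
    SingleAppConfig(
      appClassName = conf(s"spark.waimak.apprunner.$name.appClassName"),
      // dependencies are optional; if omitted, the app has no dependencies
      dependencies = conf.get(s"spark.waimak.apprunner.$name.dependencies")
        .map(_.split(",").toSeq).getOrElse(Nil)
    )
  }
}
```

In practice these values would come from the SparkSession's configuration rather than a literal Map.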