Job
Manages a job resource within a Dataproc cluster running on Google Compute Engine. For more information, see the official Dataproc documentation.
!> Note: This resource does not support update; changing any attribute will cause the resource to be recreated.
Create a Job Resource
TypeScript: new Job(name: string, args: JobArgs, opts?: CustomResourceOptions);

Python: def Job(resource_name, opts=None, force_delete=None, hadoop_config=None, hive_config=None, labels=None, pig_config=None, placement=None, project=None, pyspark_config=None, reference=None, region=None, scheduling=None, spark_config=None, sparksql_config=None, __props__=None)

C#: public Job(string name, JobArgs args, CustomResourceOptions? opts = null)

Constructor arguments (TypeScript and C#):
- name (string): The unique name of the resource.
- args (JobArgs): The arguments to resource properties.
- opts (CustomResourceOptions): Bag of options to control the resource's behavior.

Constructor arguments (Python):
- resource_name (str): The unique name of the resource.
- opts (ResourceOptions): A bag of options that control this resource's behavior.

Constructor arguments (Go):
- ctx (Context): Context object for the current deployment.
- name (string): The unique name of the resource.
- args (JobArgs): The arguments to resource properties.
- opts (ResourceOption): Bag of options to control the resource's behavior.
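As a minimal TypeScript sketch, a Spark job can be submitted to a Dataproc cluster at creation time. The cluster, region, jar path, and arguments below are placeholder values for illustration, not part of this reference:

```typescript
import * as gcp from "@pulumi/gcp";

// Illustrative cluster; in practice this may already exist elsewhere in the program.
const cluster = new gcp.dataproc.Cluster("example-cluster", {
    region: "us-central1",
});

// Submit a Spark job to that cluster. Because the Job resource does not
// support update, any change to these arguments recreates the job.
const sparkJob = new gcp.dataproc.Job("example-spark-job", {
    region: cluster.region,
    forceDelete: true, // cancel a still-running job before deleting it on destroy
    placement: {
        clusterName: cluster.name,
    },
    sparkConfig: {
        mainClass: "org.apache.spark.examples.SparkPi",
        jarFileUris: ["file:///usr/lib/spark/examples/jars/spark-examples.jar"],
        args: ["1000"],
        properties: {
            "spark.logConf": "true",
        },
    },
});
```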
Job Resource Properties
To learn more about resource properties and how to use them, see Inputs and Outputs in the Programming Model docs.
Inputs
The Job resource accepts the following input properties:
Property names follow each SDK's conventions: camelCase in TypeScript (forceDelete), PascalCase in C# and Go (ForceDelete), and snake_case in Python (force_delete). Object-typed inputs use the corresponding JobXxx input types (JobXxxArgs in C#, dicts in Python), and map-typed inputs are plain string-to-string maps (Dictionary<string, string>, map[string]string, {[key: string]: string}, or Dict[str, str]).

- placement (JobPlacement): The config of the job placement.
- forceDelete (bool): By default, you can only delete inactive jobs within Dataproc. Setting this to true, and calling destroy, will ensure that the job is first cancelled before issuing the delete.
- hadoopConfig (JobHadoopConfig): The config of the Hadoop job.
- hiveConfig (JobHiveConfig): The config of the Hive job.
- labels (map of string to string): The list of labels (key/value pairs) to add to the job.
- pigConfig (JobPigConfig): The config of the Pig job.
- project (string): The project in which the cluster can be found and jobs subsequently run against. If it is not provided, the provider project is used.
- pysparkConfig (JobPysparkConfig): The config of the PySpark job.
- reference (JobReference): The reference of the job.
- region (string): The Cloud Dataproc region. This essentially determines which clusters are available for this job to be submitted to. If not specified, defaults to global.
- scheduling (JobScheduling): Optional. Job scheduling configuration.
- sparkConfig (JobSparkConfig): The config of the Spark job.
- sparksqlConfig (JobSparksqlConfig): The config of the SparkSql job.
Outputs
All input properties are implicitly available as output properties. Additionally, the Job resource produces the following output properties:
- driverControlsFilesUri (string): If present, the location of miscellaneous control files which may be used as part of job setup and handling. If not present, control files may be placed in the same location as driver_output_uri.
- driverOutputResourceUri (string): A URI pointing to the location of the stdout of the job's driver program.
- id (string): The provider-assigned unique ID for this managed resource.
- status (JobStatus): The status of the job.
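These outputs can be exported from the program or passed to other resources. A minimal TypeScript sketch, assuming the hypothetical sparkJob resource from the earlier example:

```typescript
// Locations of the driver's stdout and any control files; both values are
// assigned by the provider once the job has been submitted.
export const driverOutputUri = sparkJob.driverOutputResourceUri;
export const driverControlFilesUri = sparkJob.driverControlsFilesUri;

// The job's status, as reported by Dataproc.
export const jobStatus = sparkJob.status;
```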
Look up an Existing Job Resource
Get an existing Job resource’s state with the given name, ID, and optional extra properties used to qualify the lookup.
TypeScript: public static get(name: string, id: Input<ID>, state?: JobState, opts?: CustomResourceOptions): Job

Python: static get(resource_name, id, opts=None, driver_controls_files_uri=None, driver_output_resource_uri=None, force_delete=None, hadoop_config=None, hive_config=None, labels=None, pig_config=None, placement=None, project=None, pyspark_config=None, reference=None, region=None, scheduling=None, spark_config=None, sparksql_config=None, status=None, __props__=None)

C#: public static Job Get(string name, Input<string> id, JobState? state, CustomResourceOptions? opts = null)

Lookup arguments (Python uses resource_name in place of name; the Go and C# overloads take the same name, id, state, and opts arguments):
- name: The unique name of the resulting resource.
- id: The unique provider ID of the resource to look up.
- state: Any extra arguments used during the lookup.
- opts: A bag of options that control this resource's behavior.
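In TypeScript, for example, an existing job can be read into the program by its provider-assigned ID without creating anything new. The ID below is a placeholder and its exact format depends on the project, region, and job:

```typescript
import * as gcp from "@pulumi/gcp";

// Adopt an existing Dataproc job by ID (placeholder value). State arguments
// are not required for a plain lookup; they only pre-populate known properties.
const existingJob = gcp.dataproc.Job.get(
    "imported-job",
    "projects/my-project/regions/us-central1/jobs/existing-job-id",
);

export const existingJobDriverOutput = existingJob.driverOutputResourceUri;
```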
The following state arguments are supported:
As with the inputs above, property names and types follow each SDK's conventions.

- driverControlsFilesUri (string): If present, the location of miscellaneous control files which may be used as part of job setup and handling. If not present, control files may be placed in the same location as driver_output_uri.
- driverOutputResourceUri (string): A URI pointing to the location of the stdout of the job's driver program.
- forceDelete (bool): By default, you can only delete inactive jobs within Dataproc. Setting this to true, and calling destroy, will ensure that the job is first cancelled before issuing the delete.
- hadoopConfig (JobHadoopConfig): The config of the Hadoop job.
- hiveConfig (JobHiveConfig): The config of the Hive job.
- labels (map of string to string): The list of labels (key/value pairs) to add to the job.
- pigConfig (JobPigConfig): The config of the Pig job.
- placement (JobPlacement): The config of the job placement.
- project (string): The project in which the cluster can be found and jobs subsequently run against. If it is not provided, the provider project is used.
- pysparkConfig (JobPysparkConfig): The config of the PySpark job.
- reference (JobReference): The reference of the job.
- region (string): The Cloud Dataproc region. This essentially determines which clusters are available for this job to be submitted to. If not specified, defaults to global.
- scheduling (JobScheduling): Optional. Job scheduling configuration.
- sparkConfig (JobSparkConfig): The config of the Spark job.
- sparksqlConfig (JobSparksqlConfig): The config of the SparkSql job.
- status (JobStatus): The status of the job.
Supporting Types
JobHadoopConfig
As above, property names and collection types follow each SDK's conventions (for example jarFileUris / jar_file_uris, and List<string>, []string, string[], or List[str] for lists of strings).

- archiveUris (list of string): HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
- args (list of string): The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
- fileUris (list of string): HCFS URIs of files to be copied to the working directory of Hadoop drivers and distributed tasks. Useful for naively parallel tasks.
- jarFileUris (list of string): HCFS URIs of jar files to be added to the Spark CLASSPATH.
- loggingConfig (JobHadoopConfigLoggingConfig)
- mainClass (string): The name of the driver's main class. The jar file containing the class must be in the default CLASSPATH or specified in jar_file_uris. Conflicts with main_jar_file_uri.
- mainJarFileUri (string): The HCFS URI of the jar file containing the main class. Examples: 'gs://foo-bucket/analytics-binaries/extract-useful-metrics-mr.jar', 'hdfs:/tmp/test-samples/custom-wordcount.jar', 'file:///home/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar'. Conflicts with main_class.
- properties (map of string to string): A mapping of property names to values, used to configure the Hadoop job. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.
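As an illustrative TypeScript sketch (the cluster name, output bucket, and file paths are placeholders), a Hadoop job running the stock word-count example might be configured like this:

```typescript
import * as gcp from "@pulumi/gcp";

const hadoopJob = new gcp.dataproc.Job("wordcount", {
    region: "us-central1",                         // placeholder region
    placement: { clusterName: "example-cluster" }, // placeholder cluster name
    hadoopConfig: {
        // mainJarFileUri conflicts with mainClass; set exactly one of them.
        mainJarFileUri: "file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar",
        args: [
            "wordcount",
            "file:///usr/lib/spark/NOTICE",
            "gs://example-bucket/hadoopjob_output", // placeholder output bucket
        ],
        loggingConfig: {
            driverLogLevels: { root: "INFO" },
        },
    },
});
```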
JobHadoopConfigLoggingConfig
- driverLogLevels (map of string to string)
JobHiveConfig
- continueOnFailure (bool): Whether to continue executing queries if a query fails. Setting this to true can be useful when executing independent parallel queries. Defaults to false.
- jarFileUris (list of string): HCFS URIs of jar files to be added to the Spark CLASSPATH.
- properties (map of string to string): A mapping of property names to values, used to configure the Hive job. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.
- queryFileUri (string): The HCFS URI of the script that contains SQL queries. Conflicts with query_list.
- queryLists (list of string): The list of SQL queries or statements to execute as part of the job. Conflicts with query_file_uri.
- scriptVariables (map of string to string): Mapping of query variable names to values (equivalent to the Hive command: SET name="value";).
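A hedged TypeScript sketch of a Hive job using inline queries; the cluster name and bucket are placeholders, and the queries run in order, stopping on the first failure because continueOnFailure is false:

```typescript
import * as gcp from "@pulumi/gcp";

const hiveJob = new gcp.dataproc.Job("hive-example", {
    region: "us-central1",
    placement: { clusterName: "example-cluster" }, // placeholder cluster
    hiveConfig: {
        continueOnFailure: false,
        queryLists: [
            "DROP TABLE IF EXISTS example_table",
            "CREATE EXTERNAL TABLE example_table(bar INT) LOCATION 'gs://example-bucket/hive/'",
            "SELECT * FROM example_table WHERE bar > 2",
        ],
    },
});
```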
JobPigConfig
- continueOnFailure (bool): Whether to continue executing queries if a query fails. Setting this to true can be useful when executing independent parallel queries. Defaults to false.
- jarFileUris (list of string): HCFS URIs of jar files to be added to the Spark CLASSPATH.
- loggingConfig (JobPigConfigLoggingConfig)
- properties (map of string to string): A mapping of property names to values, used to configure the Pig job. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.
- queryFileUri (string): The HCFS URI of the script that contains SQL queries. Conflicts with query_list.
- queryLists (list of string): The list of queries or statements to execute as part of the job. Conflicts with query_file_uri.
- scriptVariables (map of string to string): Mapping of query variable names to values (equivalent to the Pig command: name=[value]).
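An illustrative TypeScript sketch of a Pig job; the cluster name is a placeholder, and the Pig Latin statements are the stock word-count walkthrough:

```typescript
import * as gcp from "@pulumi/gcp";

const pigJob = new gcp.dataproc.Job("pig-example", {
    region: "us-central1",
    placement: { clusterName: "example-cluster" }, // placeholder cluster
    pigConfig: {
        continueOnFailure: false,
        queryLists: [
            "LNS = LOAD 'file:///usr/lib/pig/LICENSE.txt' AS (line)",
            "WORDS = FOREACH LNS GENERATE FLATTEN(TOKENIZE(line)) AS word",
            "GROUPS = GROUP WORDS BY word",
            "WORD_COUNTS = FOREACH GROUPS GENERATE group, COUNT(WORDS)",
            "DUMP WORD_COUNTS",
        ],
    },
});
```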
JobPigConfigLoggingConfig
- driverLogLevels (map of string to string)
JobPlacement
JobPysparkConfig
- mainPythonFileUri (string): The HCFS URI of the main Python file to use as the driver. Must be a .py file.
- archiveUris (list of string): HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
- args (list of string): The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
- fileUris (list of string): HCFS URIs of files to be copied to the working directory of Hadoop drivers and distributed tasks. Useful for naively parallel tasks.
- jarFileUris (list of string): HCFS URIs of jar files to be added to the Spark CLASSPATH.
- loggingConfig (JobPysparkConfigLoggingConfig)
- properties (map of string to string): A mapping of property names to values, used to configure the PySpark job. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.
- pythonFileUris (list of string): HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.
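A minimal TypeScript sketch of a PySpark job; the cluster name and script path are placeholders, and any extra modules could be shipped via pythonFileUris:

```typescript
import * as gcp from "@pulumi/gcp";

const pysparkJob = new gcp.dataproc.Job("pyspark-example", {
    region: "us-central1",
    placement: { clusterName: "example-cluster" }, // placeholder cluster
    pysparkConfig: {
        // The driver must be a .py file.
        mainPythonFileUri: "gs://example-bucket/pyspark/hello-world.py",
        properties: {
            "spark.logConf": "true",
        },
    },
});
```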
JobPysparkConfigLoggingConfig
- driverLogLevels (map of string to string)
JobReference
JobScheduling
JobSparkConfig
- archiveUris (list of string): HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.
- args (list of string): The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.
- fileUris (list of string): HCFS URIs of files to be copied to the working directory of Hadoop drivers and distributed tasks. Useful for naively parallel tasks.
- jarFileUris (list of string): HCFS URIs of jar files to be added to the Spark CLASSPATH.
- loggingConfig (JobSparkConfigLoggingConfig)
- mainClass (string): The name of the driver's main class. The jar file containing the class must be in the default CLASSPATH or specified in jar_file_uris. Conflicts with main_jar_file_uri.
- mainJarFileUri (string): The HCFS URI of the jar file containing the main class. Examples: 'gs://foo-bucket/analytics-binaries/extract-useful-metrics-mr.jar', 'hdfs:/tmp/test-samples/custom-wordcount.jar', 'file:///home/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar'. Conflicts with main_class.
- properties (map of string to string): A mapping of property names to values, used to configure the Spark job. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.
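The earlier creation example used mainClass; the sketch below uses mainJarFileUri instead, since the two conflict. It is illustrative only: the cluster name and jar path are placeholders, and the jar's manifest is assumed to name the entry-point class:

```typescript
import * as gcp from "@pulumi/gcp";

const sparkJarJob = new gcp.dataproc.Job("spark-jar-example", {
    region: "us-central1",
    placement: { clusterName: "example-cluster" }, // placeholder cluster
    sparkConfig: {
        // mainJarFileUri conflicts with mainClass; set exactly one of them.
        mainJarFileUri: "file:///usr/lib/spark/examples/jars/spark-examples.jar",
        args: ["100"],
        loggingConfig: {
            driverLogLevels: { root: "INFO" },
        },
    },
});
```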
JobSparkConfigLoggingConfig
- driverLogLevels (map of string to string)
JobSparksqlConfig
- jarFileUris (list of string): HCFS URIs of jar files to be added to the Spark CLASSPATH.
- loggingConfig (JobSparksqlConfigLoggingConfig)
- properties (map of string to string): A mapping of property names to values, used to configure Spark SQL's SparkConf. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.
- queryFileUri (string): The HCFS URI of the script that contains SQL queries. Conflicts with query_list.
- queryLists (list of string): The list of SQL queries or statements to execute as part of the job. Conflicts with query_file_uri.
- scriptVariables (map of string to string): Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";).
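A hedged TypeScript sketch of a Spark SQL job with inline statements; the cluster name is a placeholder, and scriptVariables is equivalent to running SET foo="bar"; before the queries:

```typescript
import * as gcp from "@pulumi/gcp";

const sparkSqlJob = new gcp.dataproc.Job("sparksql-example", {
    region: "us-central1",
    placement: { clusterName: "example-cluster" }, // placeholder cluster
    sparksqlConfig: {
        queryLists: [
            "DROP TABLE IF EXISTS example_table",
            "CREATE TABLE example_table(bar INT)",
            "SELECT * FROM example_table WHERE bar > 2",
        ],
        scriptVariables: { foo: "bar" },
    },
});
```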
JobSparksqlConfigLoggingConfig
- driverLogLevels (map of string to string)
JobStatus
See the output API doc for this type.
Package Details
- Repository: https://github.com/pulumi/pulumi-gcp
- License: Apache-2.0
- Notes: This Pulumi package is based on the google-beta Terraform Provider.