java.lang.Object
  org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.avro.Mutation>>
    org.apache.cassandra.hadoop.ColumnFamilyOutputFormat
public class ColumnFamilyOutputFormat
The ColumnFamilyOutputFormat acts as a Hadoop-specific
OutputFormat that allows reduce tasks to store keys (and corresponding
values) as Cassandra rows (and respective columns) in a given
ColumnFamily.
As is the case with the ColumnFamilyInputFormat, you need to set the
Keyspace and ColumnFamily in your
Hadoop job Configuration. The ConfigHelper class, through its
ConfigHelper.setOutputColumnFamily(org.apache.hadoop.conf.Configuration, java.lang.String, java.lang.String) method, is provided to make this
simple.
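A minimal sketch of the job setup described above. It assumes the Hadoop and Cassandra Hadoop jars for this release are on the classpath. The class name `OutputJobSetupSketch`, the job name, and the keyspace and column family names are hypothetical placeholders, and a real job may need further connection settings that this page does not list.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.List;

import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical helper class, for illustration only.
public class OutputJobSetupSketch {
    public static Job configure(Configuration conf) throws IOException {
        Job job = new Job(conf, "cassandra-output-sketch");

        // Reduce output is handed to Cassandra instead of being written to files.
        job.setOutputFormatClass(ColumnFamilyOutputFormat.class);

        // Key and value types match this class's OutputFormat type parameters:
        // ByteBuffer row keys and List<Mutation> values.
        job.setOutputKeyClass(ByteBuffer.class);
        job.setOutputValueClass(List.class);

        // Job copies the Configuration it is given, so set the target on the
        // job's own copy. Keyspace and column family names are placeholders.
        ConfigHelper.setOutputColumnFamily(
                job.getConfiguration(), "ExampleKeyspace", "ExampleColumnFamily");
        return job;
    }
}
```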
For the sake of performance, this class employs a lazy write-back caching mechanism: its record writer collects the mutations derived from the reduce task's inputs in a task-specific map and periodically commits them by sending a batch mutate request to Cassandra.
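The write-back batching idea can be illustrated with plain Java. This is a sketch under stated assumptions, not the ColumnFamilyRecordWriter implementation: the class `BatchingWriterSketch`, its count-based threshold, and the `Consumer` that stands in for the batch mutate request are all hypothetical. The threshold is only loosely modelled on the BATCH_THRESHOLD field name, whose exact semantics this page does not document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration of write-back batching; not Cassandra code.
public class BatchingWriterSketch<K, M> {
    private final int batchThreshold;              // assumed: flush once this many mutations are pending
    private final Consumer<Map<K, List<M>>> sink;  // stands in for a batch mutate request
    private Map<K, List<M>> pending = new LinkedHashMap<>();
    private int pendingCount = 0;

    public BatchingWriterSketch(int batchThreshold, Consumer<Map<K, List<M>>> sink) {
        if (batchThreshold < 1) {
            throw new IllegalArgumentException("batchThreshold must be at least 1");
        }
        this.batchThreshold = batchThreshold;
        this.sink = sink;
    }

    // Buffer the mutations for a key; send a batch once the threshold is reached.
    public void write(K key, List<M> mutations) {
        if (mutations.isEmpty()) {
            return;
        }
        pending.computeIfAbsent(key, k -> new ArrayList<>()).addAll(mutations);
        pendingCount += mutations.size();
        if (pendingCount >= batchThreshold) {
            flush();
        }
    }

    // Hand everything buffered so far to the sink as one batch.
    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        Map<K, List<M>> batch = pending;
        pending = new LinkedHashMap<>();
        pendingCount = 0;
        sink.accept(batch);
    }

    // Closing the writer flushes whatever is still buffered.
    public void close() {
        flush();
    }

    public static void main(String[] args) {
        List<Map<String, List<String>>> sent = new ArrayList<>();
        BatchingWriterSketch<String, String> writer = new BatchingWriterSketch<>(3, sent::add);
        writer.write("row1", Arrays.asList("a", "b")); // 2 pending, below the threshold
        writer.write("row2", Arrays.asList("c"));      // 3 pending, first batch is sent
        writer.write("row1", Arrays.asList("d"));      // 1 pending
        writer.close();                                 // remainder is sent as a second batch
        System.out.println(sent.size() + " batches sent"); // prints "2 batches sent"
    }
}
```

Buffering by key lets several writes to the same row travel in one request, at the cost of holding unsent mutations in memory until the next flush.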
| Nested Class Summary | |
|---|---|
| static class | ColumnFamilyOutputFormat.NullOutputCommitter: An OutputCommitter that does nothing. |
| Field Summary | |
|---|---|
| static java.lang.String | BATCH_THRESHOLD |
| static java.lang.String | QUEUE_SIZE |
| Constructor Summary | |
|---|---|
| ColumnFamilyOutputFormat() | |
| Method Summary | |
|---|---|
| void | checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job): Deprecated. |
| void | checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context): Check for validity of the output-specification for the job. |
| static Cassandra.Client | createAuthenticatedClient(org.apache.thrift.transport.TSocket socket, org.apache.hadoop.conf.Configuration conf): Return a client based on the given socket that points to the configured keyspace, and is logged in with the configured credentials. |
| org.apache.hadoop.mapreduce.OutputCommitter | getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context): The OutputCommitter for this format does not write any data to the DFS. |
| org.apache.cassandra.hadoop.ColumnFamilyRecordWriter | getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job, java.lang.String name, org.apache.hadoop.util.Progressable progress): Deprecated. |
| org.apache.cassandra.hadoop.ColumnFamilyRecordWriter | getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context): Get the RecordWriter for the given task. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String BATCH_THRESHOLD
public static final java.lang.String QUEUE_SIZE
| Constructor Detail |
|---|
public ColumnFamilyOutputFormat()
| Method Detail |
|---|
public void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context)
Specified by:
checkOutputSpecs in class org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.avro.Mutation>>
Parameters:
context - information about the job
Throws:
java.io.IOException - when output should not be attempted
public org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
throws java.io.IOException,
java.lang.InterruptedException
Specified by:
getOutputCommitter in class org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.avro.Mutation>>
Parameters:
context - the task context
Throws:
java.io.IOException
java.lang.InterruptedException
@Deprecated
public void checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem,
org.apache.hadoop.mapred.JobConf job)
throws java.io.IOException
Specified by:
checkOutputSpecs in interface org.apache.hadoop.mapred.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.avro.Mutation>>
Parameters:
job - job configuration.
Throws:
java.io.IOException - when output should not be attempted
@Deprecated
public org.apache.cassandra.hadoop.ColumnFamilyRecordWriter getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem,
org.apache.hadoop.mapred.JobConf job,
java.lang.String name,
org.apache.hadoop.util.Progressable progress)
throws java.io.IOException
Specified by:
getRecordWriter in interface org.apache.hadoop.mapred.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.avro.Mutation>>
Parameters:
job - configuration for the job whose output is being written.
name - the unique name for this part of the output.
progress - mechanism for reporting progress while writing to file.
Returns:
a RecordWriter to write the output for the job.
Throws:
java.io.IOException
public org.apache.cassandra.hadoop.ColumnFamilyRecordWriter getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
throws java.io.IOException,
java.lang.InterruptedException
Get the RecordWriter for the given task.
Specified by:
getRecordWriter in class org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.avro.Mutation>>
Parameters:
context - the information about the current task.
Returns:
a RecordWriter to write the output for the job.
Throws:
java.io.IOException
java.lang.InterruptedException
public static Cassandra.Client createAuthenticatedClient(org.apache.thrift.transport.TSocket socket,
org.apache.hadoop.conf.Configuration conf)
throws InvalidRequestException,
org.apache.thrift.TException,
AuthenticationException,
AuthorizationException
Parameters:
socket - a socket pointing to a particular node, seed or otherwise
conf - a job configuration
Throws:
InvalidRequestException
org.apache.thrift.TException
AuthenticationException
AuthorizationException