java.lang.Object
  org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.hadoop.avro.Mutation>>
    org.apache.cassandra.hadoop.ColumnFamilyOutputFormat
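public class ColumnFamilyOutputFormat
extends org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.hadoop.avro.Mutation>>
implements org.apache.hadoop.mapred.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.hadoop.avro.Mutation>>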
The ColumnFamilyOutputFormat acts as a Hadoop-specific
 OutputFormat that allows reduce tasks to store keys (and corresponding
 values) as Cassandra rows (and respective columns) in a given
 ColumnFamily.
 
 
 As is the case with the ColumnFamilyInputFormat, you need to set the
 Keyspace and ColumnFamily in your
 Hadoop job Configuration. The ConfigHelper class, through its
 ConfigHelper.setOutputColumnFamily(org.apache.hadoop.conf.Configuration, java.lang.String, java.lang.String) method, is provided to make this
 simple.
 
For performance, this class uses a lazy write-back cache. Its record writer collects the mutations generated from the reduce inputs in a task-specific map and periodically commits them by sending a batch mutate request to Cassandra.
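
The batching behaviour described above can be pictured with a small, self-contained sketch. This is not the ColumnFamilyRecordWriter implementation: the class name BatchingWriter, the batchThreshold parameter, and the sink callback (which stands in for a batch mutate request to Cassandra) are all hypothetical, and the code uses only the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/**
 * Hypothetical illustration of lazy write-back batching.
 * Mutations are buffered per row key and handed to a sink in one batch
 * once enough of them have accumulated.
 */
public class BatchingWriter<K, M> {
    private final int batchThreshold;               // assumed: flush after this many buffered mutations
    private final Consumer<Map<K, List<M>>> sink;   // assumed: stands in for a batch mutate request
    private Map<K, List<M>> pending = new LinkedHashMap<>();
    private int pendingCount = 0;

    public BatchingWriter(int batchThreshold, Consumer<Map<K, List<M>>> sink) {
        if (batchThreshold < 1) {
            throw new IllegalArgumentException("batchThreshold must be at least 1");
        }
        this.batchThreshold = batchThreshold;
        this.sink = sink;
    }

    /** Buffers the mutations for a key; flushes once the threshold is reached. */
    public void write(K key, List<M> mutations) {
        if (mutations.isEmpty()) {
            return; // nothing to buffer
        }
        pending.computeIfAbsent(key, k -> new ArrayList<>()).addAll(mutations);
        pendingCount += mutations.size();
        if (pendingCount >= batchThreshold) {
            flush();
        }
    }

    /** Sends everything buffered so far as a single batch and resets the buffer. */
    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        Map<K, List<M>> batch = pending;
        pending = new LinkedHashMap<>();
        pendingCount = 0;
        sink.accept(batch);
    }

    /** Flushes any remaining mutations, as a writer would when its task ends. */
    public void close() {
        flush();
    }
}
```

In this sketch a write returns immediately unless it pushes the buffer over the threshold, which is the trade-off a write-back cache makes: fewer round trips in exchange for changes that are not visible until the next flush or close.
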
| Nested Class Summary | |
|---|---|
| static class | ColumnFamilyOutputFormat.NullOutputCommitter: An OutputCommitter that does nothing. |
| Field Summary | |
|---|---|
| static java.lang.String | BATCH_THRESHOLD | 
| static java.lang.String | QUEUE_SIZE | 
| Constructor Summary | |
|---|---|
| ColumnFamilyOutputFormat() | |
| Method Summary | |
|---|---|
| void | checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job): Deprecated. |
| void | checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context): Check for validity of the output-specification for the job. |
| static Cassandra.Client | createAuthenticatedClient(org.apache.thrift.transport.TSocket socket, org.apache.hadoop.conf.Configuration conf): Return a client based on the given socket that points to the configured keyspace, and is logged in with the configured credentials. |
| org.apache.hadoop.mapreduce.OutputCommitter | getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context): The OutputCommitter for this format does not write any data to the DFS. |
| org.apache.cassandra.hadoop.ColumnFamilyRecordWriter | getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job, java.lang.String name, org.apache.hadoop.util.Progressable progress): Deprecated. |
| org.apache.cassandra.hadoop.ColumnFamilyRecordWriter | getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context): Get the RecordWriter for the given task. |
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Field Detail | 
|---|
public static final java.lang.String BATCH_THRESHOLD
public static final java.lang.String QUEUE_SIZE
| Constructor Detail | 
|---|
public ColumnFamilyOutputFormat()
| Method Detail | 
|---|
public void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context)
Specified by:
checkOutputSpecs in class org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.hadoop.avro.Mutation>>
Parameters:
context - information about the job
Throws:
java.io.IOException - when output should not be attempted
public org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                               throws java.io.IOException,
                                                                      java.lang.InterruptedException
Specified by:
getOutputCommitter in class org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.hadoop.avro.Mutation>>
Parameters:
context - the task context
Throws:
java.io.IOException
java.lang.InterruptedException
@Deprecated
public void checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem,
                                        org.apache.hadoop.mapred.JobConf job)
                      throws java.io.IOException
Specified by:
checkOutputSpecs in interface org.apache.hadoop.mapred.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.hadoop.avro.Mutation>>
Parameters:
job - job configuration.
Throws:
java.io.IOException - when output should not be attempted
@Deprecated
public org.apache.cassandra.hadoop.ColumnFamilyRecordWriter getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem,
                                                                                       org.apache.hadoop.mapred.JobConf job,
                                                                                       java.lang.String name,
                                                                                       org.apache.hadoop.util.Progressable progress)
                                                                     throws java.io.IOException
Specified by:
getRecordWriter in interface org.apache.hadoop.mapred.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.hadoop.avro.Mutation>>
Parameters:
job - configuration for the job whose output is being written.
name - the unique name for this part of the output.
progress - mechanism for reporting progress while writing to file.
Returns:
a RecordWriter to write the output for the job.
Throws:
java.io.IOException
public org.apache.cassandra.hadoop.ColumnFamilyRecordWriter getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                     throws java.io.IOException,
                                                                            java.lang.InterruptedException
Get the RecordWriter for the given task.
Specified by:
getRecordWriter in class org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.hadoop.avro.Mutation>>
Parameters:
context - the information about the current task.
Returns:
a RecordWriter to write the output for the job.
Throws:
java.io.IOException
java.lang.InterruptedException
public static Cassandra.Client createAuthenticatedClient(org.apache.thrift.transport.TSocket socket,
                                                         org.apache.hadoop.conf.Configuration conf)
                                                  throws InvalidRequestException,
                                                         org.apache.thrift.TException,
                                                         AuthenticationException,
                                                         AuthorizationException
Parameters:
socket - a socket pointing to a particular node, seed or otherwise
conf - a job configuration
Throws:
InvalidRequestException
org.apache.thrift.TException
AuthenticationException
AuthorizationException