Metadata¶

The driver maintains global information about the Cassandra cluster it is connected to. It is available via Cluster#getMetadata().
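
A quick sketch (assuming an already-built Cluster instance named cluster):

Metadata metadata = cluster.getMetadata();
System.out.printf("Connected to cluster: %s%n", metadata.getClusterName());
for (Host host : metadata.getAllHosts()) {
    System.out.printf("Host %s is in datacenter %s%n",
        host.getAddress(), host.getDatacenter());
}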

Schema metadata¶

Use getKeyspace(String) or getKeyspaces() to get keyspace-level metadata. From there you can access the keyspace’s objects (tables, and, where relevant, UDTs and UDFs).
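
For example (my_keyspace and my_table are hypothetical names):

KeyspaceMetadata ks = cluster.getMetadata().getKeyspace("my_keyspace");
TableMetadata table = ks.getTable("my_table");
// exportAsString() returns the CQL statements that recreate the table
System.out.println(table.exportAsString());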

Refreshes¶

Schema metadata gets refreshed in the following circumstances:

  • schema changes via the driver: after successfully executing a schema-altering query (e.g. CREATE TABLE), the driver waits for schema agreement (see below), then refreshes the schema.

  • third-party schema changes: if another client (cqlsh, another driver instance…) changes the schema, Cassandra notifies the driver via a push notification, and the driver refreshes the schema directly (there is no need to wait for schema agreement, since Cassandra has already done so).

Subscribing to schema changes¶

Users interested in being notified of schema changes can implement the SchemaChangeListener interface.

Every listener must be registered against a Cluster instance:

Cluster cluster = ...
SchemaChangeListener myListener = ...
cluster.register(myListener);

Once registered, the listener will be notified of all schema changes detected by the driver, regardless of where they originate from.

Note that it is preferable to register a listener only after the cluster is fully initialized; otherwise the listener could be flooded with “Added” events as the driver builds the schema metadata from scratch for the first time.
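
For example, a minimal listener sketch, assuming the driver’s SchemaChangeListenerBase convenience class (which provides empty implementations of all callbacks, so only the events of interest need overriding):

cluster.register(new SchemaChangeListenerBase() {
    @Override
    public void onTableAdded(TableMetadata table) {
        System.out.printf("Table added: %s.%s%n",
            table.getKeyspace().getName(), table.getName());
    }
});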

Schema agreement¶

Schema changes need to be propagated to all nodes in the cluster. Once all nodes have settled on a common schema version, the schema is said to be in agreement.

As explained above, the driver waits for schema agreement after executing a schema-altering query. This is to ensure that subsequent requests (which might get routed to different nodes) see an up-to-date version of the schema.

 Application             Driver           Cassandra
------+--------------------+------------------+-----
      |                    |                  |
      |  CREATE TABLE...   |                  |
      |------------------->|                  |
      |                    |   send request   |
      |                    |----------------->|
      |                    |                  |
      |                    |     success      |
      |                    |<-----------------|
      |                    |                  |
      |          /--------------------\       |
      |          :Wait until all nodes+------>|
      |          :agree (or timeout)  :       |
      |          \--------------------/       |
      |                    |        ^         |
      |                    |        |         |
      |                    |        +---------|
      |                    |                  |
      |                    |  refresh schema  |
      |                    |----------------->|
      |                    |<-----------------|
      |   complete query   |                  |
      |<-------------------|                  |
      |                    |                  |

The schema agreement wait is performed synchronously: the execute call will only return (or, if you use the async API, the ResultSetFuture will only complete) after the wait has finished.
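
For example, a sketch with the async API:

ResultSetFuture future = session.executeAsync("CREATE TABLE...");
// the future only completes after the agreement wait, so this returns
// with an up-to-date view of the schema
ResultSet rs = future.getUninterruptibly();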

The check is implemented by repeatedly querying the system tables for the schema version reported by each node, until they all converge to the same value. If that doesn’t happen within a given timeout, the driver gives up waiting. The default timeout is 10 seconds; it can be customized when building your cluster:

Cluster cluster = Cluster.builder()
    .addContactPoint("127.0.0.1")
    .withMaxSchemaAgreementWaitSeconds(20)
    .build();

After executing a statement, you can check whether schema agreement was successful or timed out:

ResultSet rs = session.execute("CREATE TABLE...");
if (rs.getExecutionInfo().isSchemaInAgreement()) {
    // schema is in agreement
} else {
    // schema agreement timed out
}

You can also perform an on-demand check at any time:

if (cluster.getMetadata().checkSchemaAgreement()) {
    // schema is in agreement
} else {
    // schema is not in agreement
}

The on-demand check does not retry: it queries the system tables only once (so maxSchemaAgreementWaitSeconds does not apply). If you need retries, you’ll have to schedule them yourself, for example with a custom executor.
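
A minimal polling sketch (the 10-second deadline and 200 ms interval are arbitrary values, not driver defaults):

long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
boolean agreed = cluster.getMetadata().checkSchemaAgreement();
while (!agreed && System.nanoTime() < deadline) {
    TimeUnit.MILLISECONDS.sleep(200); // throws InterruptedException
    agreed = cluster.getMetadata().checkSchemaAgreement();
}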

Check out the API docs for the features in this section:

  • withMaxSchemaAgreementWaitSeconds(int)

  • isSchemaInAgreement()

  • checkSchemaAgreement()

Token metadata¶

This feature is probably of less interest to regular driver users, but it will be useful if you’re writing an analytics client on top of the driver.

Metadata exposes a number of methods to manipulate tokens and ranges: getTokenRanges(), getTokenRanges(String keyspace, Host host), getReplicas(String keyspace, TokenRange range), newToken(String) and newTokenRange(Token start, Token end).

TokenRange provides various operations on ranges (splitting, merging, etc.).

Each host exposes its primary tokens via getTokens().

Finally, you can inject tokens in CQL queries with BoundStatement#setToken, and retrieve them from results with Row#getToken and Row#getPartitionKeyToken.
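
For example (a sketch; the users table and its user_id partition key are hypothetical):

// read the token of a row's partition key...
Row row = session.execute("SELECT user_id FROM users LIMIT 1").one();
Token token = row.getPartitionKeyToken();

// ...and reinject it in another query
PreparedStatement ps = session.prepare(
    "SELECT * FROM users WHERE token(user_id) > ?");
session.execute(ps.bind().setToken(0, token));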

As an example, here is how you could compute the splits to partition a job (a sketch: keyspace and estimateNumberOfSplits() are placeholders):

Metadata metadata = cluster.getMetadata();
for (TokenRange range : metadata.getTokenRanges()) {
    Set<Host> hosts = metadata.getReplicas(keyspace, range);
    int n = estimateNumberOfSplits(); // more on that below
    for (TokenRange split : range.splitEvenly(n)) {
        // pick a host to process the split
    }
}

For estimateNumberOfSplits, you need a way to estimate the total number of partition keys (this is what analytics clients traditionally did with the Thrift operation describe_splits_ex). Starting with Cassandra 2.1.5, this information is available in the system.size_estimates table (see CASSANDRA-7688).
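
A rough sketch against that table (note that system.size_estimates only holds estimates for the token ranges local to the node that answers the query, so this is an approximation):

long partitions = 0;
for (Row row : session.execute(
        "SELECT partitions_count FROM system.size_estimates"
        + " WHERE keyspace_name = ? AND table_name = ?",
        "my_keyspace", "my_table")) { // hypothetical names
    partitions += row.getLong("partitions_count");
}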
