The Index tab in the System -> Maintenance tool allows you to manage indexes for single or clustered dotCMS instances.
Index Details
The detail area displays the following information for each index:
Field | Description |
---|---|
Status | Status of the index (which working/live indexes are active). |
Alias Name | Identifier of the index, and whether the index is live or working. |
Created | Creation timestamp of the index. |
Count | Number of objects in the index. |
Shards | Number of Elasticsearch shards (underlying sub-indexes) in the index. |
Replicas* | Number of copies of a given index (*only available on clustered instances). |
Size | Size of the index (in Megabytes). |
Health | Colored icon indicating whether the index or index “replica” is being used by a dotCMS instance. |
Elasticsearch Shards
When you create a new index, you may specify the number of shards in the index. Elasticsearch abstracts the index so that several “shards” (sub-indexes) can aggregate results when a query is performed. This makes multiple separate shards behave like a single index, while improving performance and scalability, because Elasticsearch only updates the shard(s) it needs instead of updating the whole index every time.
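The idea can be illustrated with a toy model. The sketch below is not Elasticsearch's implementation: the class name `ToyShardedIndex`, the CRC32-based routing, and the substring "search" are all assumptions chosen for illustration. It only shows that a write lands on one shard while a query visits every shard and merges the results.

```python
import zlib


class ToyShardedIndex:
    """Toy model of a sharded index (illustrative only, not Elasticsearch code)."""

    def __init__(self, num_shards):
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        # Each shard is modeled as its own small dictionary of documents.
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, doc_id):
        # A stable hash of the document id selects exactly one shard.
        return zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)

    def index(self, doc_id, text):
        # A write touches a single shard, not the whole index.
        shard_no = self._shard_for(doc_id)
        self.shards[shard_no][doc_id] = text
        return shard_no

    def search(self, term):
        # A query fans out to every shard and merges the partial results.
        hits = []
        for shard in self.shards:
            hits.extend(doc_id for doc_id, text in shard.items() if term in text)
        return sorted(hits)


index = ToyShardedIndex(num_shards=3)
index.index("page-1", "welcome to dotCMS")
index.index("page-2", "index maintenance")
index.index("page-3", "dotCMS index tab")
print(index.search("index"))
```

From the caller's point of view the three shards behave like one index: `search` returns a single merged list regardless of where each document was stored.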
Shard Performance
Shards involve a trade-off in performance. As the number of shards increases, updating the index gets faster, but queries against the index may become slightly slower, because multiple shards may need to be accessed to answer a query (especially a complex one).
General Considerations:
- If you have a site with a large database and/or frequent content updates, you may want to consider increasing the number of shards to reduce the time it takes to re-index content.
- If you have a site with a great deal of front-end traffic, you may want to minimize the number of shards to maximize query performance.
- Achieving the right balance requires a needs assessment and follow up testing.
Actions Available on Indexes
The following right-click options are available on any index:
Action | Index Status | Description |
---|---|---|
Clear Index | Active / Inactive | Clears the index (to prepare for a rebuild). |
Deactivate Index | Active | Deactivates an index. Index retains system resources and remains available for actions such as clearing. |
Activate Index | Inactive | Reactivates a deactivated index. |
Close-Index | Inactive | Closes an index, blocking read/write operations and caching and freeing up system resources, so it has nearly no overhead on the server. |
Open-Index | Closed | Re-opens a closed index, but leaves it in a deactivated state. |
Delete Index | Inactive / Closed | Removes an index from the server; this cannot be reversed. |
Note: Clearing or deactivating a live index displays a popup warning that site visibility may be affected.
- However, when you are troubleshooting potential index issues, these actions allow you to “clean” an index before restoring a copy of that index; this is a much faster and lighter way to test or resolve an issue than a complete site re-index.
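The table of actions can be read as a small state machine. The following sketch transcribes it into Python; it is only a reading aid, not dotCMS code. The lowercase state names, the `deleted` end state, and the assumption that Clear Index leaves the index state unchanged are illustrative choices that the table does not spell out.

```python
# States in which each right-click action is offered, per the table above.
ALLOWED_STATES = {
    "Clear Index": {"active", "inactive"},
    "Deactivate Index": {"active"},
    "Activate Index": {"inactive"},
    "Close-Index": {"inactive"},
    "Open-Index": {"closed"},
    "Delete Index": {"inactive", "closed"},
}

# State after the action completes. "Clear Index" is absent on purpose:
# this sketch assumes clearing does not change the index state.
RESULTING_STATE = {
    "Deactivate Index": "inactive",
    "Activate Index": "active",
    "Close-Index": "closed",
    "Open-Index": "inactive",  # re-opened indexes stay deactivated
    "Delete Index": "deleted",  # hypothetical end state; cannot be reversed
}


def apply_action(state, action):
    """Return the new state, or raise ValueError if the action is not offered."""
    if action not in ALLOWED_STATES:
        raise ValueError(f"unknown action: {action}")
    if state not in ALLOWED_STATES[action]:
        raise ValueError(f"{action!r} is not available on an index that is {state}")
    return RESULTING_STATE.get(action, state)


# Retiring an active index takes two steps: deactivate, then delete.
state = apply_action("active", "Deactivate Index")
state = apply_action(state, "Delete Index")
print(state)
```

Trying `apply_action("active", "Delete Index")` raises `ValueError`, which mirrors the table: an active index must be deactivated before it can be closed or deleted.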
Adding an Index
To build an index, select a content option from the dropdown at the top of the Content Index Tasks subsection and click the Reindex button. The dropdown defaults to Rebuild Whole Index, which includes all content and guarantees the creation of new indexes. Any other choice instead reindexes within the currently active index; this runs in the background, and the index's count and size are updated once it completes.
You will be prompted to indicate the desired number of shards, and then the process will begin.
Important Note: Indexes do take up memory space. Unused/old indexes should be removed using the “Delete Index” right-click option.
Index Location
Unless configured otherwise, index shards are stored in the /dotsecure/esdata/ directory from the root of dotCMS (e.g. /dotserver/tomcat-X.x.xx/webapps/ROOT/dotsecure/esdata in the default dotCMS distribution).
However, Elasticsearch allows you to store each shard in a different location, enabling you to distribute shards in separate folders or on separate disks if desired.
To configure a different location for your Elasticsearch indexes, you must modify the DYNAMIC_CONTENT_PATH configuration property.
Note: It is strongly recommended that all changes to configuration properties be made through environment variables.
Index Replicas (For Cluster Implementations Only)
Special Note:
By default, the Auto-clustering feature handles the replicas for Elasticsearch. The default configuration setting is ES_INDEX_REPLICAS=autowire, and AUTOWIRE_CLUSTER_ES is set to true as well. The ES_INDEX_REPLICAS property may also be set to 0-all (or any other boundary, such as 0-8). With a lower-to-upper integer boundary, the number of index replicas automatically expands and contracts between the lower and upper bounds based on the Elasticsearch cluster topology. This enables automated management of Elasticsearch replicas, as long as a dotCMS Enterprise license is also present on the server.
However, setting the ES_INDEX_REPLICAS property to a specific integer enables manual management of clustered Elasticsearch replicas from the backend UI, as long as AUTOWIRE_CLUSTER_ES is set to false. This fixes a static number of Elasticsearch replicas.
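To summarize how the values described above are interpreted, here is a minimal Python sketch. It is not the dotCMS parser: the function name `describe_replica_setting`, the returned dictionary, and the mode labels are hypothetical, and the combination of a fixed integer with AUTOWIRE_CLUSTER_ES=true is reported as "unspecified" because the text above does not describe it.

```python
def describe_replica_setting(es_index_replicas, autowire_cluster_es=True):
    """Classify an ES_INDEX_REPLICAS value (illustrative, not dotCMS code)."""
    value = es_index_replicas.strip().lower()

    if value == "autowire":
        # Default: the Auto-clustering feature manages replicas.
        return {"mode": "auto", "min": None, "max": None}

    if "-" in value:
        # Boundary form such as "0-all" or "0-8": replicas auto-expand and
        # auto-contract between the bounds. "all" means no fixed upper bound.
        low, high = value.split("-", 1)
        upper = None if high == "all" else int(high)
        return {"mode": "auto-range", "min": int(low), "max": upper}

    # A specific integer fixes a static number of replicas, managed manually
    # from the backend UI, provided AUTOWIRE_CLUSTER_ES is false.
    count = int(value)
    mode = "unspecified" if autowire_cluster_es else "manual"
    return {"mode": mode, "min": count, "max": count}


print(describe_replica_setting("autowire"))
print(describe_replica_setting("0-all"))
print(describe_replica_setting("0-8"))
print(describe_replica_setting("2", autowire_cluster_es=False))
```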
For more information on how to configure the management of replicas, please see the Auto Clustering Configuration documentation.
When using a cluster, you can create replicas of an index to distribute and mirror the index across multiple servers in the cluster. To change the number of replicas of an index, right-click on the index and select Update Number of Replicas.
Managing Indexes via the API
Index management actions that can be performed from the dotCMS backend can also be performed using the REST API (including via curl commands). For more information, see the RESTful API to Manage Indexes documentation.
Note on Number of Replicas
For more information on how to configure the management of replicas, please see the Auto Clustering Configuration documentation.
Setting the proper number of replicas for the dotCMS Elasticsearch index can be confusing. It is important to understand that the number of replicas does not equal the number of servers in your cluster.
The number of replicas is how many times you want your index to be copied. For example, if you are running a two-node cluster, you should set your Elasticsearch replicas to “1”. This means that there is the original index entry on one server and 1 replica, or copy, on the other server, so both servers have a copy of all index entries.
Therefore, a guideline for the proper number of replicas is:
- For clusters with fewer than 5 servers: Set replicas to one less than the number of servers in the cluster. Examples:
  - 2 servers: 1 replica
  - 4 servers: 3 replicas
- For clusters with 5 or more servers: Set replicas to 1/2 the number of servers in the cluster (rounded up). Examples:
  - 5 servers: 3 replicas
  - 8 servers: 4 replicas
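The guideline above can be written as a short function. This is just the arithmetic from the list expressed in Python; the function name `recommended_replicas` is made up for this example, and the result is a starting point rather than a required setting.

```python
import math


def recommended_replicas(servers):
    """Replica count suggested by the guideline above for a cluster size."""
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    if servers < 5:
        # Small clusters: every server ends up holding a full copy.
        return servers - 1
    # Larger clusters: half the number of servers, rounded up.
    return math.ceil(servers / 2)


for n in (2, 4, 5, 8):
    print(n, "servers:", recommended_replicas(n), "replica(s)")
```

The loop prints the same four examples listed in the guideline (1, 3, 3, and 4 replicas).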
When a new server is joined to the cluster (see cluster doc), Elasticsearch automatically recognizes the server and begins replicating to it. When a server is removed from the cluster or goes offline, the replicated index may display a “yellow” or inactive status if the number of nodes configured in the cluster does not at least match the number of replicated indexes.
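The “yellow” status follows from how Elasticsearch places copies: a replica is never allocated on the same node as its primary, so an index with N replicas needs at least N + 1 nodes before every copy can be assigned. The sketch below captures only that rule; the function name `expected_health` is hypothetical, and real cluster health also depends on factors such as disk space and allocation settings that are not modeled here.

```python
def expected_health(nodes, replicas):
    """Rough health estimate from node and replica counts (simplified model)."""
    if nodes < 1:
        raise ValueError("at least one node must be online")
    if replicas < 0:
        raise ValueError("replicas cannot be negative")
    # One primary plus each replica must live on a different node.
    copies = replicas + 1
    return "green" if nodes >= copies else "yellow"


print(expected_health(nodes=2, replicas=1))  # both copies can be placed
print(expected_health(nodes=1, replicas=1))  # the replica has nowhere to go
```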
The Elasticsearch index may also be configured to run as a “stand-alone” Elasticsearch server and connect to each node in a dotCMS cluster.