We have covered installing and managing Endeca Server, and now we will wrap up this blog with an overview of the internals of Endeca Server and how Endeca Server fits into enterprise architecture with other enterprise products.
The EndecaServer.properties file contains settings for Endeca Server and can be thought of as analogous to init or .ini files. Endeca Server administrators use the EndecaServer.properties file to change directory locations for the following:
- Index files for data domains. This location is common to all data domains on a deployment of Endeca Server.
- Log file directory, where the Endeca Server logs are written.
- The “offline directory,” where files from endeca-cmd export-dd are stored.
- Files associated with the Cluster Coordinator, for clustered operations.
- Files associated with the data enrichment plug-in.
Endeca Server administrators will often change these directories to ensure that operating system files and the Endeca Server index files are not on the same logical volume. This ensures that I/O associated with operating system activities does not contend with the data domain index file I/O operations.
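To check that the configured directories really do sit on separate volumes, a small script can help. The following is a minimal sketch, assuming the file uses plain key=value lines; the function names are our own, and the key names in the usage note are placeholders rather than actual Endeca Server property names.

```python
import os


def parse_properties(text):
    """Parse simple key=value properties text into a dict.

    Handles only plain 'key=value' lines; line continuations and
    ':' separators allowed by Java properties files are not supported.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (starting with # or !).
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def shares_volume(path_a, path_b):
    """Return True if two existing paths are on the same device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

For example, after reading the file with `parse_properties(open(path).read())`, calling `shares_volume(props["example-data-dir"], "/")` would report whether the index directory shares a device with the root file system. Here `example-data-dir` is a placeholder key; substitute the key that appears in your own file.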
The EndecaServer.properties file also contains tuning parameters that can be used to modify the performance characteristics of Endeca Server. These settings are as follows:
As a general rule, you should modify these settings only after considerable empirical testing in a test environment and only after working closely with Oracle Endeca Support. Oracle Endeca Support can be engaged by submitting service requests on the Oracle Support Portal.
Settings for control groups are also managed in the EndecaServer.properties file. Recall that control groups are available only if Linux version 6 is used for Endeca Server. The shell script provided with Endeca Server, setup_cgroups.sh, must be run to enable control groups.
As we discussed in this blog, EQL is the query language for Endeca. It has some similarities to SQL, as well as a number of differences. Many Oracle users are accustomed to SQL*Plus, a command-line tool that can execute queries, manage objects in the database, and manage and control the database itself. Endeca Server does not provide a command-line interface for running queries; all queries must be processed via the Conversation web service. It is possible to send a query to the Conversation web service from a web browser, and the Oracle Endeca documentation provides an example of this. For EQL query development, you should use Endeca Integrator ETL or Endeca Studio. EQL is strictly for query execution: it is not used to create or manage objects and does not support schema declaration. The management of Endeca Server is accomplished by using endeca-cmd and by modifying settings in the EndecaServer.properties file.
We now turn to Oracle Endeca Server resource management, as well as some of the vehicles available for it. Endeca Server allows the creation of a data domain on a running system only if there are sufficient memory resources for the data domain. If there are insufficient memory resources, then data domain creation is denied. In the prior blog “Data Domain Management”, we discussed these data domain resource allocation parameters:
Endeca Server uses these settings to determine whether it has sufficient capacity to host a data domain. In addition to these settings, Endeca Server maintains and uses historical memory allocation data to determine whether a data domain can be created. Endeca Server administrators should monitor data domain usage and manage data domain operations with the data domain resource allocation parameters. This also involves understanding how a data domain will be used. For example, if the data sets in a data domain will remain static, then it should be set to read-only to increase performance and conserve system resources. It makes sense to use oversubscribe when only a subset of the data domains on an Endeca Server deployment will be in use at any given time. The auto-idle setting is analogous to the energy-saving features found on most personal computers: when a resource is not needed, it is turned off to conserve power. When a data domain is not being used, auto-idle terminates its Dgraph processes, making the resources they would be consuming available to other data domains.
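The admission decision described above can be pictured with a toy model. This is only an illustrative sketch and not Endeca Server's actual algorithm: the class, the function, the memory figures, and the rule that oversubscription always admits a domain are simplifying assumptions of ours.

```python
from dataclasses import dataclass


@dataclass
class DomainRequest:
    name: str
    memory_mb: int               # hypothetical memory the domain needs
    oversubscribe: bool = False  # whether the domain may exceed capacity


def can_create(request, total_mb, committed_mb):
    """Toy admission check for a new data domain.

    Admit the request if it fits in the memory not yet committed to
    other domains, or if the request allows oversubscription.
    """
    if request.oversubscribe:
        return True
    return committed_mb + request.memory_mb <= total_mb
```

With 8192 MB in total and 4096 MB already committed, a 4096 MB request fits exactly and is admitted, while a larger request is denied unless it allows oversubscription.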
Control groups, available only on Oracle Enterprise Linux 6 with the Unbreakable Enterprise Kernel, introduce Linux kernel–level memory management. A cgroup is an OS-level container for memory, and all memory for Dgraph processes is allocated out of the cgroup. This prevents the Dgraph processes from using all server memory resources for data domains and reserves memory for normal server operations, even when memory utilization for data domains is at capacity. As a result, Endeca Server administrators can still log in to the server and perform management tasks: reviewing log files, using OS-level monitoring features, and shutting down or restarting stalled processes.
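To see how close the Dgraph processes are to their cgroup limit, an administrator can read the memory controller's accounting files. The sketch below assumes the cgroup v1 memory controller, which exposes memory.limit_in_bytes and memory.usage_in_bytes in each group's directory. The directory for the Endeca cgroup depends on how the cgroup was set up on your system, so it is passed in as a parameter and not assumed here.

```python
from pathlib import Path


def cgroup_memory_headroom(cgroup_dir):
    """Return (limit, usage, headroom) in bytes for a cgroup v1 memory group."""
    base = Path(cgroup_dir)
    # Each file holds a single integer byte count followed by a newline.
    limit = int((base / "memory.limit_in_bytes").read_text().strip())
    usage = int((base / "memory.usage_in_bytes").read_text().strip())
    return limit, usage, limit - usage
```

A headroom value near zero suggests the data domains are at capacity, even though the rest of the server still has memory available for administrative work.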
Planning activities for Endeca Server should also be part of managing resources. During normal business hours, when many users are using Endeca Studio, data set ingests with Integrator ETL Studio should be avoided or coordinated. Backups using the export feature can occur asynchronously with other data domain usage. However, it is usually best to perform these at times when the system is more lightly loaded, such as just prior to the start of business, at lunch, or just after the close of business. For installations where graphs will be frequently executed, Endeca ETL Server can be used to schedule resource-intensive ingests to occur after hours when there are no users on the system.
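A backup or ingest wrapper script can guard itself with a simple time-window check before doing any heavy work. This is a minimal sketch; the window boundaries below are hypothetical and should be adjusted to your site's actual usage pattern.

```python
from datetime import time

# Hypothetical quiet windows, as (start, end) pairs with the end excluded.
QUIET_WINDOWS = [
    (time(7, 0), time(8, 30)),    # just before the start of business
    (time(12, 0), time(13, 0)),   # lunch
    (time(17, 30), time(19, 0)),  # just after the close of business
]


def in_quiet_window(now, windows=QUIET_WINDOWS):
    """Return True if the time of day falls inside any quiet window."""
    return any(start <= now < end for start, end in windows)
```

A wrapper could call `in_quiet_window(datetime.now().time())` and exit early when it returns False, leaving the heavy work to the next scheduled run.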