This commit updates how template delta disk creation is evaluated. Without this change template deltas are created for every template on each application startup. This change updates this behavior to instead run template delta disk creation only once per template configured for a pool. Without this change it is possible to get a template to a state where the XML depth is too great to be read with default settings and the template requires a new clone to resolve.
This commit removes an additional authenticate method that is defined in the token_spec tests. Instead, authenticate is used from api/helpers. To support this change the config provided is updated to specify a dummy provider. Without this change authenticate cannot be tested along with token_spec because token_spec redefines authenticate.
This commit updates vmpooler to support setting an array of search bases
in addition to a single base provided as a string. Without this change
it is not possible to specify multiple search bases to use with the LDAP
authentication provider. Additionally, test coverage is added to
the authentication helper method.
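
A minimal sketch of how a single base and an array of bases could be handled uniformly. The helper names and the 'base' key are illustrative assumptions, not vmpooler's actual code, and the block stands in for the real LDAP bind.

```ruby
# Hypothetical helper: normalize the configured base into an array so a
# String and an Array of Strings can be treated the same way.
def search_bases(ldap_config)
  Array(ldap_config['base'])
end

# Hypothetical helper: try each base in order and stop at the first one
# for which the block (a stand-in for the LDAP bind) returns true.
def authenticate_against(bases)
  bases.any? { |base| yield(base) }
end

single = search_bases('base' => 'ou=users,dc=example,dc=com')
multiple = search_bases(
  'base' => ['ou=users,dc=example,dc=com', 'ou=service,dc=example,dc=com']
)
```

Wrapping the value with Array() keeps existing single-string configurations working without a separate code path.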
This commit implements a vm_mutex hash to allow synchronizing VM operations that should only happen once across threads. Without this change pool_manager will try to evaluate or destroy a VM multiple times. When destroying, only one attempt can succeed, so the other attempts throw an error. When there are no errors, the repeated work still consumes resources unnecessarily.
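
The idea can be sketched as one Mutex per VM name, taken with try_lock so a second thread skips the work instead of repeating it. The VmLocks class and its method names are hypothetical and only illustrate the pattern.

```ruby
# Hypothetical sketch: one Mutex per VM name, created on demand.
class VmLocks
  def initialize
    @guard = Mutex.new # protects the hash itself
    @vm_mutex = {}
  end

  def mutex_for(vm_name)
    @guard.synchronize { @vm_mutex[vm_name] ||= Mutex.new }
  end

  # Runs the block only if no other thread is already working on this VM.
  # Returns true if the block ran, false if the VM was busy.
  def once(vm_name)
    mutex = mutex_for(vm_name)
    return false unless mutex.try_lock

    begin
      yield
      true
    ensure
      mutex.unlock
    end
  end
end

locks = VmLocks.new
destroyed = []
ran = locks.once('vm-abc123') { destroyed << 'vm-abc123' }
```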
This commit adds a configuration endpoint to the vmpooler API. Pool
size and pool template can be adjusted for pools that are configured
at vmpooler application start time. Pool template changes trigger a pool
refresh, and the new template has delta disks created automatically by
vmpooler.
Additionally, the capability to create template delta disks is added to
the vsphere provider, and this is implemented to ensure that templates
have delta disks created at application start time.
The mechanism used to find template VM objects is simplified to make the flow of logic easier to understand. As an additional benefit, performance of this lookup is improved by using FindByInventoryPath.
A table of contents is added to API.md to ease navigation. Without this change API.md has no table of contents and is difficult to navigate.
Add mutex object for managing pool configuration updates
This commit adds a mutex object for ensuring that pool configuration changes are synchronized across multiple running threads, removing the possibility of two threads attempting to update something at once, without relying on redis data. Without this change this is managed crudely by specifying in redis that a configuration update is taking place. This redis data is left so the REPOPULATE section of _check_pool can still identify when a configuration change is in progress, and prevent a pool from repopulating at that time.
Add wake up event for pool template changes
This commit adds a wake up event to detect pool template changes.
Additionally, GET /config has a template_ready section added to the
output for each pool, which makes clear when a pool is ready to populate
itself.
This commit updates add_disk to remove propertyCollector, which was used
to back the find_vmdks method to locate the disk file on datastore and
then use its length to name the new disk. Instead, the number of disks
on the VM is used to ensure a unique disk resource title. Without this
change add_disk can take 10-50x longer due to the propertyCollector
method. Additionally, without this change propertyCollector is used in a
non threadsafe manner, which may cause stability issues for vsphere
provider backends.
This commit replaces find_vm and find_vm_heavy with a more performant and reliable mechanism of identifying VM objects. Specifically, FindByInventoryPath is able to leverage known data about a VM, its folder path and datacenter, and use that to identify whether that VM exists by its location. Without this change find_vm_heavy is called each time a VM cannot be found, which is frequent, and in doing so uses PropertyCollector in a manner that is not thread-safe. Additionally, this PropertyCollector usage does not clean up its traces, which can cause vCenter appliance instability issues on VCSA 6.x.
This commit adds a redis hash where there is one key per pool, and the
stored value is the last time a VM was booted, i.e. the last time
a VM went from 'pending' to 'ready'. This is also displayed in the
API as lastBoot:'2018-03-23 17:43:39 +0000'. The data can then be
used by any external system, in this case our alarming system.
variable
Previously, there were two ways to configure Vmpooler, either by
changing the contents of vmpooler.yaml or by assigning the raw YAML
to the VMPOOLER_CONFIG environment variable. This commit adds a new
environment variable called VMPOOLER_CONFIG_FILE that can be assigned
the name of a config file to use. Also fixes #240 by whitelisting the
Symbol class when calling YAML.safe_load in Vmpooler.config.
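
A rough sketch of loading configuration from these sources. The load_config helper and the precedence between the two environment variables are assumptions, and the permitted_classes keyword assumes a Psych version that supports it (older versions took the list positionally).

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical loader: raw YAML in VMPOOLER_CONFIG is used if present,
# otherwise the file named by VMPOOLER_CONFIG_FILE, otherwise the default.
# This precedence order is an assumption made for the example.
def load_config(env, default_path = 'vmpooler.yaml')
  raw =
    if env['VMPOOLER_CONFIG']
      env['VMPOOLER_CONFIG']
    else
      File.read(env['VMPOOLER_CONFIG_FILE'] || default_path)
    end
  # Symbol must be permitted explicitly, or safe_load rejects keys
  # such as :config: that are parsed as Ruby symbols.
  YAML.safe_load(raw, permitted_classes: [Symbol])
end

file = Tempfile.new(['vmpooler', '.yaml'])
file.write(":config:\n  task_limit: 10\n")
file.close
config = load_config('VMPOOLER_CONFIG_FILE' => file.path)
```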
This commit updates cpu_utilization_for and memory_utilization_for to detect when quickstats are not present. Without this change a nil result is transformed to 0, which is perceived as a host that has no utilization.
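
The difference between a missing value and a genuine zero can be sketched as follows. The signature here is simplified and hypothetical (the real methods take a host object); it only illustrates returning nil instead of coercing nil to 0.

```ruby
# Hypothetical, simplified signature: usage and capacity are passed in
# directly rather than read from a vSphere host object.
def cpu_utilization_for(overall_cpu_usage_mhz, total_mhz)
  # A nil usage means quickstats were not reported, so the utilization is
  # unknown. Returning nil lets the caller skip the host.
  return nil if overall_cpu_usage_mhz.nil? || total_mhz.to_f.zero?

  (overall_cpu_usage_mhz.to_f / total_mhz) * 100
end

unknown = cpu_utilization_for(nil, 1000)
half = cpu_utilization_for(500, 1000)
```

Coercing with nil.to_i would yield 0 here, which is exactly the value that makes an unreporting host look idle.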
This commit moves the migrate_vm logic to the vsphere provider. Without
this change the pool_manager migrate_vm method contains a lot of vsphere
specific logic.
The status endpoint provides a lot of statistics. This commit extends it
by supporting a query parameter called 'view' which may contain one or
multiple comma separated names for the top-level statistics returned
in the JSON response. status is always returned.
Optional elements are capacity,queue,clone,boot,pools
Everything is returned when 'view' is not specified, which is
backwards compatible with the current behavior.
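
A small sketch of the filtering described above. The filter_status helper and the shape of the result hash are assumptions for illustration; only the section names come from the commit message.

```ruby
# Optional top-level sections named in the commit message.
OPTIONAL_SECTIONS = %w[capacity queue clone boot pools].freeze

# Hypothetical filter: always keep 'status', plus any requested sections.
def filter_status(result, view)
  return result if view.nil? || view.empty?

  wanted = view.split(',').map(&:strip) & OPTIONAL_SECTIONS
  result.select { |key, _| key == 'status' || wanted.include?(key) }
end

full = {
  'status' => { 'ok' => true },
  'capacity' => {}, 'queue' => {}, 'clone' => {}, 'boot' => {}, 'pools' => {}
}
partial = filter_status(full, 'capacity,boot')
```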
Before this change, if a pool had an alias configured, the information would not be
made public in the API. This commit adds the alias key in the pool object for each
pool if configured. The alias key can be absent, a string, or an array of one or
more strings. The value of the alias is copied from the configuration and can
represent another name for the pool, or another configured pool.
* Fix no implicit conversion to rational from nil
Before this change if the boottime was nil, the check_ready
loop would exit on Time.now - host['boottime'] with a TypeError
in jruby. The boottime is nil when the power is Off so moving that check
earlier should catch that bug.
* set test data properly
Previously in commit 9b0e55f959 the looping period was changed from a static
number to a dynamic one depending on load, however this meant that the operation
to refill a pool was slowed down somewhat. While not a problem under normal
loads, when a pool was quickly consumed, the pool manager may not respond
quickly enough to refill the pool. This commit:
- Changes the sleep method, to use a helper sleep method that will wakeup
periodically and evaluate other wakeup events. This could be used later to
exit sleep loops when pooler is shutting down to stop blocking threads
- By default the wakeup_period is set to the minimum pool check loop time, thus
emulating the behaviour prior to commit 9b0e55f959
- Adds tests for the behaviour
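
The helper sleep can be sketched roughly as below. The method name, its return values, and the use of a block to report wakeup events are assumptions made for this example.

```ruby
# Hypothetical sketch: sleep for up to loop_delay seconds, waking every
# wakeup_period seconds to ask the block whether a wakeup event occurred.
# A block is required.
def sleep_with_wakeup_events(loop_delay, wakeup_period)
  deadline = Time.now + loop_delay
  while Time.now < deadline
    return :woken if yield # the block reports a wakeup event

    remaining = deadline - Time.now
    sleep([wakeup_period, remaining].min) if remaining > 0
  end
  :timeout
end

checks = 0
outcome = sleep_with_wakeup_events(5, 0.01) { (checks += 1) >= 3 }
```

Setting wakeup_period equal to the minimum loop delay means the event check happens about as often as the old fixed loop did.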
This commit adds vmpooler inspection of configuration issues to host
selection. Specifically, configIssue is checked, which should allow an
issue like quickstats not being reported to be identified even when the
alarm will not trigger. Without this change a host will continue to be
used if quickstats are not reported when alarms are not triggered
because of this condition, which results in a single host being the
target for all deploys and migrations, overloading the host and causing
its VMs to have degraded performance.
Previously, if inventory failed for some reason, it would return an incomplete
set of VMs which could then cause the pool to perform odd behaviours such as
filling the pool higher than it should, or removing VMs which exist. Also, if the
redis cache of VMs in a pool had a VM but it did not actually exist in the
inventory it would never be removed.
This commit:
- Immediately exits the check_pool if an error occurs during inventory
collection
- Will mark a VM as completed if it exists in Redis, but does not exist in
inventory
- Adds tests for these behaviours
Sometimes this test would fail if the computer running the tests was under a
bit of load. This commit changes the expected output to be up to 1.99 seconds
instead of the previous 0.99 seconds.
Previously the check_pool would always check the pool every 5 seconds, however
with a large number of pools, this can cause resource issues inside the
providers. This commit:
- Introduces a dynamic check_pool period which increases during stability and
decreases when the pool is being changed in an important way
- Surfaces the settings as global config defaults but can also be set on a per
pool basis
- Adds defaults to emulate the current behaviour
- Unit tests for the new behaviour
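
One way to picture the dynamic period is a delay that grows while the pool is stable and resets when it changes. The function and its min, max and decay parameters are hypothetical names; the defaults are chosen so that the delay stays at a fixed 5 seconds.

```ruby
# Hypothetical sketch: grow the delay by a decay factor while the pool is
# stable, cap it at max, and reset to min when the pool changes.
# With the defaults (min == max == 5) the delay stays fixed at 5 seconds.
def next_loop_delay(current, pool_changed, min: 5, max: 5, decay: 2.0)
  return min if pool_changed

  [(current * decay).round, max].min
end

delay = 5
delays = 4.times.map { delay = next_loop_delay(delay, false, max: 60) }
after_change = next_loop_delay(delay, true, max: 60)
```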
This commit updates find_least_used_compatible_host method to specify
the pool name when evaluating a VM for migration. Without this change VM
migration fails with a wrong number of arguments error. Pool_manager
test references are updated to reflect the change.
Previously the vsphere provider assumed that there was one and only one
datacenter (DC) in the vsphere instance. However this is simply not true for
many vSphere installations. This commit:
- Adds the ability to define a vSphere datacenter at the Pool or Provider level
whereby the Pool setting takes precedence
- If no datacenter is specified the default behaviour of picking the first DC
in the vSphere instance is used
- Updated all tests for the new setting
- Update the vmpooler configuration file example with relevant setting name
and expected behaviour
- Fixed a bug in the rvmomi_helper whereby if no DC was found it would return
all DCs. This is opposite behaviour of the real RBVMOMI library as it returns
nil
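
The precedence rule can be sketched as a simple lookup. The helper name and the 'datacenter' key are assumptions for illustration.

```ruby
# Hypothetical lookup: the pool setting wins, then the provider setting.
# A nil result means "fall back to the first datacenter in the instance".
def datacenter_for(pool_config, provider_config)
  pool_config['datacenter'] || provider_config['datacenter']
end

from_pool = datacenter_for({ 'datacenter' => 'dc-pool' }, { 'datacenter' => 'dc-provider' })
from_provider = datacenter_for({}, { 'datacenter' => 'dc-provider' })
unspecified = datacenter_for({}, {})
```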
Previously, find_vmdks in vsphere_helper used the call
vmdk_datastore._connection to get the underlying connection, however this is
already available as a function parameter. This commit removes this bad code and
the associated test fixtures.
Refactoring the vmpooler.yaml format to support multiple providers.
The second level key under :providers: is a unique key name that
represents a provider that can be referred to in the pool's parameter
called provider. The code is still backward compatible to support
the :vsphere: and :dummy: keys but in reality if you have more than
one vsphere configuration you would give them a different name. For
example :vsphere-pdx: and :vsphere-bfs: and the actual provider
class would be specified as a parameter called 'provider_class'.
See tests and examples for more information.
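
A sketch of what such a configuration and lookup might look like. The pool name, the exact nesting of provider_class, and the provider_class_for helper are assumptions for illustration; the tests and examples in the repository are authoritative.

```ruby
require 'yaml'

# Illustrative configuration only: two named vsphere providers and one pool.
config = YAML.safe_load(<<~CONFIG, permitted_classes: [Symbol])
  :providers:
    :vsphere-pdx:
      provider_class: vsphere
    :vsphere-bfs:
      provider_class: vsphere
  :pools:
    - name: pool-a
      provider: vsphere-pdx
CONFIG

# Hypothetical helper: resolve the provider class for a pool.
def provider_class_for(config, pool)
  name = pool['provider']
  settings = config[:providers][name.to_sym] || {}
  # Backward compatible: with no provider_class, keys such as :vsphere:
  # and :dummy: are taken to name the class themselves.
  settings['provider_class'] || name
end

resolved = provider_class_for(config, config[:pools].first)
```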
The generic connection pooler is only responsible for managing the connection
objects, however the providers themselves are responsible for ensuring that the
connection is alive/healthy etc. Previously, the older vSphere helper would
reconnect however this was lost when the connection pooler was introduced. This
commit adds a method that checks the connection before use, and then reconnects
if the connection is in a bad state.
Previously the vSphere Provider would share a single vSphere connection for all
pools under management. This would cause issues in large environments as this
would cause errors to be thrown or operations to slow down. This commit
modifies the vSphere Provider to use a connection pool when communicating with
the vSphere API
- Uses the GenericConnectionPool object to manage the connection pool
- Uses a default connection pool size of:
Whatever is biggest from:
- How many pools this provider services
- Maximum number of cloning tasks allowed
- Need at least 2 connections so that a pool can have inventory functions
performed while cloning etc.
- A large connection_pool_timeout is used as a connection object is consumed
during a VM clone, which can take up to 2 minutes
- Removes the `get_connection` method as that is now obsolete due to the
connection pool
- Removes the `close` method as it is now obsolete
- Modified the spec tests slightly, to stop mocking get_connection as it no
longer exists, and set a super low pool timeout so that if a test fails, it
will fail quickly instead of taking the default time of 60+ seconds
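
The default size rule listed above amounts to taking a maximum. The helper name and its arguments are hypothetical.

```ruby
# Hypothetical calculation: the largest of the number of pools serviced,
# the maximum number of clone tasks, and a floor of 2 connections.
def default_connection_pool_size(pools_serviced, max_clone_tasks)
  [pools_serviced, max_clone_tasks, 2].max
end

small = default_connection_pool_size(1, 1)
large = default_connection_pool_size(12, 10)
```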
Previously a connection pooler class was added. This commit modifies the Dummy
VM Provider to use a connection pooler. While the Dummy provider strictly
speaking does not use connections, this allows testing to see what happens when
connection pools are stressed or exhausted. This commit:
- Modifies functions to use a connection pool object for the public API
functions
- Modifies the VMPooler YAML with new settings for connection pool size and
timeout
Previously VMPooler had no concept of a connection pooler. While there is an
up to date connection pooler Gem (connection_pool), that supports MRI and jRuby,
it lacked metrics which are useful to diagnose errors and judge pool size.
This commit:
- Brings in the connection_pool gem
- Creates a new class called generic_connection_pool which inherits from the
ConnectionPool class in the connection_pool gem.
- Extends the connection pool object with a new function called `with_metrics`
This copies the code from the original `with` method but emits metrics for
how long it took to get an object from the pool, and then how many objects
are left in the pool. This is sent using VMPooler's metrics object.
Extending the object was used instead of overriding as it was not possible to
inject into the existing function and monkey patching did not seem the correct
way.
In order to use the metrics, the GenericConnectionPool object modifies the
initialize method to use :metrics and :metrics_prefix options
- Also added tests for the GenericConnectionPool class to ensure the new
functions are tested. Note that the functionality that was not extended is
not tested in VMPooler.
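
The shape of a with_metrics style wrapper can be sketched with a small stand-in pool. TinyPool is NOT the connection_pool gem or GenericConnectionPool, the metric names are made up, and the metrics object is replaced by a lambda for the example.

```ruby
# Stand-in pool used only to illustrate the idea; not the real class.
class TinyPool
  def initialize(size:, metrics:, metrics_prefix:)
    @queue = Queue.new
    size.times { @queue << yield } # the block builds each pooled object
    @metrics = metrics
    @metrics_prefix = metrics_prefix
  end

  # Check out an object, report how long the checkout took and how many
  # objects remain, then yield the object and always return it to the pool.
  def with_metrics
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    conn = @queue.pop
    waited_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000).round
    @metrics.call("#{@metrics_prefix}.waited", waited_ms)
    @metrics.call("#{@metrics_prefix}.available", @queue.size)
    yield conn
  ensure
    @queue << conn if conn
  end
end

seen = []
pool = TinyPool.new(size: 2, metrics: ->(key, value) { seen << [key, value] },
                    metrics_prefix: 'connectionpool') { Object.new }
result = pool.with_metrics { |_conn| :used }
```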
In previous commits the code from vsphere_helper was moved to the vSphere
Provider. This commit removes the vsphere_helper.rb file and its spec tests, and
stops vmpooler from loading it.
Previously the vSphere based configuration was in the root of the configuration
YAML. As there is deprecation support to move the old configuration to the new
location, the vSphere provider can be updated. This commit updates the vSphere
Provider and tests to use the new configuration location under:
:providers:
:vsphere:
This commit modifies execute! to create the VM Providers on VMPooler startup
instead of check_pool creating a provider per pool. This commit also adds
legacy support for old configuration files:
- Setting the default provider for pools to be vsphere
- Copying VSphere connection settings in the configuration file from the legacy
location in the root, to under :providers/:vsphere which is the new location for
all provider configuration
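
The legacy handling can be sketched as a small transformation of the loaded configuration. The helper name is hypothetical, and leaving an existing :providers/:vsphere entry untouched is an assumption of this example.

```ruby
# Hypothetical sketch of the legacy support described above.
def apply_legacy_config(config)
  # Copy root-level :vsphere settings to :providers/:vsphere.
  # Assumption: settings already at the new location are not overwritten.
  if config[:vsphere]
    config[:providers] ||= {}
    config[:providers][:vsphere] ||= config[:vsphere]
  end
  # Pools without an explicit provider default to vsphere.
  (config[:pools] || []).each { |pool| pool['provider'] ||= 'vsphere' }
  config
end

legacy = {
  vsphere: { 'server' => 'vcenter.example.com' },
  pools: [{ 'name' => 'pool-a' }, { 'name' => 'pool-b', 'provider' => 'dummy' }]
}
migrated = apply_legacy_config(legacy)
```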
This commit adds a public function to access the internal variable holding the
VMPooler configuration. This is required for later commits for the execute!
function testing.
Previously the Pool Manager would use vSphere objects directly. This commit
- Modifies the pool_manager to use the VM provider methods instead
- Removes the MockFindFolder class as it is no longer required
- Minor update for rubocop violations
Previously the Pool Manager would use a single VM provider per Pool. This
commit changes Pool Manager to use a single provider that services multiple
pools.
Previously the Pool Manager would use vSphere objects directly. This commit
- Modifies the migrate_vm_and_record_timing method to use VM and Pool names
instead of VM and Pool objects.
Previously the Pool Manager would use vSphere objects directly. This commit
removes get_vm_host_info as this functionality is now in the vSphere VM
Provider.