Commit graph

132 commits

Author SHA1 Message Date
Nick Lewis
c4f3a49782 Generate a wider set of legal names
Previously, we restricted the adjective and noun portion of the name
each to 7 characters to ensure that the final name would not be more
than 15 after adding a hyphen. Given that the _total_ length is what
matters, we can generate a noun up to 11 characters (to ensure we leave
room for a hyphen and a 3 letter adjective) and adjust our acceptable
adjective size accordingly. This lets many more names be generated than
would otherwise, while still respecting the 15 character limit.

Due to the limited set of 11 letter nouns and corresponding 3 letter
adjectives, as well as some complex combinatorics, setting the noun
length to 11 causes a net increase in conflicts. We therefore actually
set it to 10, which causes a net decrease in conflicts.

We favor generating longer nouns rather than longer adjectives (by
selecting the noun first) because longer adjectives tend to be more
unwieldy words, and thus more awkward to say and generally less fun.
2020-01-27 16:11:13 -08:00
John O'Connor
f581d065ae (QENG-7531) Add Marked as Failed Stat
This is a useful measure for monitoring the health of pools that we
don't capture yet.
2019-12-11 15:24:57 +00:00
Brandon High
019ed021b0
(QENG-7530) Add check for unique hostnames
Prior to this commit the pooler had no awareness of the complete set of
hostnames that are currently in use. This meant that it was possible to
allocate the same hostname twice, which would result in the original
host with that hostname becoming unreachable.

This commit adds a check for the existence of the
`vmpooler__vm__<hostname>` key before attempting to  clone the vm.
This should prevent duplicate hostnames.

If the hostname is already taken, `_clone_vm` will retry with a new
random hostname multiple times before raising an exception.
2019-11-04 15:53:05 -08:00
Brandon High
2ca6d49aeb
(QENG-7530) Make VM names more human readable
Prior to this commit hostnames for VMs provisioned by vmpooler were 15
random characters. This is difficult for humans to tell apart.

This commit updates the naming to use the `spicy-proton` gem to generate
adjective noun pair names for the VMs, which I think would be easier for
humans to tell apart, as well as fun and memorable to say.

The random name should not exceed 15 characters in order to prevent
issues with NETBIOS, etc as discussed in the attached ticket.
2019-10-24 09:12:18 -07:00
Andrew Makousky
a2548d11e8
(POOLER-148) Improve code comment.
Co-Authored-By: Samuel <samuel@puppet.com>
2019-09-04 01:32:20 +00:00
Andrew Makousky
79ac9ad37e (POOLER-148) Fix undefined variable bug in _check_ready_vm.
The host['boottime'] variable in the function _check_ready_vm no longer
has its parent object in reference due to the refactoring in pull
request #269.  So in order to get the same information without the
performance impact from duplicate object lookups, we get similar
information from the time that the VM is ready.
2019-08-30 18:32:14 -05:00
Spencer McElmurry
98a547b807 (POOLER-143) Add clone_target config change to API
This allows the user to change the cluster in which the targeted pool
will clone to. Upon configuration change, the thread will wake up and
execute the change within 1 second.
2019-07-29 08:53:27 -07:00
kirby@puppetlabs.com
2d94cb4266 (POOLER-141) Fix order of processing migrating and pending queues
This commit updates how migrating and pending queues are processed. Sets to be processed are created with sadd in redis, and iterated over as a list in ruby. The latest member is added to the beginning of this set in redis, and becomes the first member of the set in ruby. To ensure that items are processed in the order they are added it is necessary to reverse the list before iterating through its members. Without this change the newest members of the set are processed first, which creates inconsistent times to evaluation.
2019-04-11 12:29:05 -07:00
kirby@puppetlabs.com
2de4a6db0b Ensure nodes are consistent for usage stats
This commit updates vm usage stats collection to replace all instances of '.' characters within node strings. Without this change the node string containing a '.' character causes the metric to be interpreted as containing another node.
2018-12-19 12:45:50 -08:00
kirby@puppetlabs.com
9a57c6d1b5 (POOLER-134) Ship VM usage stats
This commit updates vmpooler to ship VM usage stats when a VM is destroyed. The stats are gathered for jobs based on user and pool name. If a jenkins build URL is present then this is broken down by user, instance, value stream, branch and project. Additionally, if present then the RMM_COMPONENT_TO_TEST_NAME will be listed after project. Without this change we do not collect stats on per VM usage and its correlation to users and pools.
2018-12-07 11:59:27 -08:00
kirby@puppetlabs.com
3c856d7ae9 (POOLER-133) Identify when a ready VM has failed
This commit fixes checking of a VM that has already been identified as ready. Without this change a ready VM that has failed will be identified as having failed, but will not successfully be removed from the ready queue. Additionally, the default vm_checktime value has been reduced from 15 to 1 to ensure that ready VMs are checked within one minute of the time they have reached the ready state by default.

Lastly, the docker-compose files are updated to specify that the redis
instance used is a local redis instance.
2018-12-03 12:21:08 -08:00
kirby@puppetlabs.com
830acd705d Remove debug statements. Return when get_vm returns nil 2018-09-12 12:28:12 -07:00
kirby@puppetlabs.com
cd73f53561 (POOLER-129) Allow setting weights for backends
This commit updates get_vm in the vmpooler API to allow for setting weights for backends. Additionally, when an alias for a pool exists, and the backend configured is not weighted, then the selection of the pool based on alias will be randomly sampled. Without this change any pool with the title of the alias is exhausted before an alternate pool with the configured alias is used, which results in an uneven distribution of VMs. When all backends involved are configured with weighted values the VM selection will be based on probability using those weights.

A bug is fixed when setting the default ttl for check_ready_vm.

Pickup is added to handle weighted VM selection.

A dockerfile is added that allows for building and installing vmpooler
from the current HEAD in docker to make for easy testing.
2018-09-11 11:15:02 -07:00
mchllweeks
0e86937245
Merge pull request #297 from mattkirby/pooler_130
(POOLER-130) Improve delta disk creation handling
2018-08-30 17:22:35 -07:00
kirby@puppetlabs.com
3fd0c6f475 (POOLER-130) Improve delta disk creation handling
This commit updates delta disk creation to reduce the likelihood of this being run more than once for any given template. Without this change an error can be generated with vsphere 6.5 or later when a template is updated, and then the update is reverted. The error prevents the image from being used because the template is never marked as prepared. To address this any failure is now logged, and the template is marked as prepared regardless of whether this was successful, or not, which allows the image to be used despite the error. This failure mode is more graceful and allows the pool to continue to function.
2018-08-30 15:50:07 -07:00
Samuel Beaulieu
3aac95d5b9 (POOLER-114) Refactor check_pool in pool_manager
This commit refactorss the check_pool method in pool_manager.
Specifically, each commented section describing a stage of check_pool is
broken out into a separate method and check_pool is simplified by
calling these methods. Without this change it is difficult to follow the
intent for or make changes to check_pool.

Additionally, a docker-compose file is added to make it simple to launch
an all-in-one vmpooler instance along with a separate redis server with
docker.
2018-08-23 14:04:52 -07:00
Corey Osman
2daa5244b8 Adds a new mechanism to load providers from any gem or file path. (#263)
* Adds ability to load only providers used in config file
2018-07-24 16:35:18 -07:00
kirby@puppetlabs.com
90edc6f979 Remove VM from completed only after destroy
This commit updates destroy_vm to remove the redis member from the completed queue only after a destroy has been completed. Without this change a VM that is being destroyed will be logged as discovered when inventory is checked since it has already been removed from the completed queue.
2018-07-24 16:21:59 -07:00
kirby@puppetlabs.com
490e5c1938 (POOLER-128) Remove references to VM mutex when destroying
This commit updates destroy_vm to remove references to its mutex tracking object when destroyed. Without this change a VM that is destroyed will leave its mutex tracking object forever causing the pool manager memory footprint to increase.
2018-07-23 16:53:53 -07:00
Kevin Imber
099b53f348
Merge pull request #274 from mattkirby/cleanup_unused_folders_and_vms
(POOLER-66) Purge vms and folders no longer configured
2018-07-19 13:19:13 -07:00
mattkirby
95d9c83479
Merge pull request #281 from mattkirby/pooler_109
(POOLER-109) Allow API to run independently
2018-07-16 16:41:37 -07:00
kirby@puppetlabs.com
c64a8afc3f Get whitelist from provider_config 2018-07-13 12:09:28 -07:00
kirby@puppetlabs.com
a34c8dd11b (POOLER-66) Purge vms and folders no longer configured
This commit enables optional purging for vms and folders when they are
not configured in the vmpooler provided configuration. Base folders are
determined from folders specified in the pool configuration. Then,
anything not configured in that folder for that provider and is not a
whitelisted folder title will be destroyed. Without this change vmpooler
will leave unconfigured vms and folders behind and any vms will be left
running forever without manual intervention. Additionally, any
associated redis data will never be expired.
2018-07-13 12:09:28 -07:00
mattkirby
070199e877 Eliminate duplicate VM object lookups where possible (#269)
* Minimize duplicate checking of VMs

This commit updates check_pool pending, running and ready methods to greatly reduce instances in which the VM object is retrieved. Without this change get_vm is run for each of these check_pool steps even though the VM is already validated as being in inventory being running the check. This is left for checking running VMs when the VM is no longer ready. Without this change there is an unnecessarily large volume of VM object checks.

* Make hostname mismatch checking optional

This commit makes hostname mismatch checking optional on a pool and global config level. The default behavior of checking this is preserved. When disabled _check_ready_vm will not run get_vm, which allows for ready VMs to be checked without retrieving an object from vsphere and waiting for a free provider connection. Without this change it is not possible to disable VM object retrieval in _check_vm_ready.

* Check if a hostname return is empty string

This commit checks whether a hostname returned is an empty string.
Without this change a VM that returns a hostname with a empty string
will report as having a hostname mismatch, which may happen before all
VM data is updated.

* Only check hostname for mismatch after time since ready has past

Configure hostname checking so it is not checked until a VM has been
ready for greater than one minute. Without this change hostname checking
will often fail because the configured value has not propogated to the
VM on some platforms before this check is run.
2018-07-13 12:06:44 -05:00
kirby@puppetlabs.com
528020fec7 (POOLER-109) Allow API to run independently
This commit updates vmpooler to allow the API component and dashboard to
run separately from pool_manager. Without this change vmpooler does not
offer a mechanism to run only the API, or pool_manager components.

Two instances of hardcoded puma environment settings are removed. This
is still set in the init script explicitly as well as via an environment
variable in the dockerfile.

To extend the mechanism of running the API or pool_manager components to
instances running in docker an entrypoint is added in the dockerfile.
The entrypoint allows a user to specify whether to run the API or
pool_manager components when running the application. The default
behavior is preserved where both components are run.

To support these changes vmpooler.rb is updated to allow more of the
configuration to be specified via individual environment variables. It
was already possible to specify the entire config block as an
environment variable, but this is more difficult to manage and less of a
standard implementation than specifying individual parameters. Where
specified environment variable options will override a value configured
via the configuration file or environment.

The running pool configuration when starting pool_manager is loaded to
redis at pool_manager start time. This allows the API to load the
running pool configuration from redis and be able to run without
requiring the pool configuration.

Lastly, the dockerfile leveraging entrypoint will no longer start
vmpooler with the init script or write logs to a file. Instead, LOGFILE
is set to /dev/stdout and the vmpooler application is started directly.
This behavior is preferred because the log file writes to disk are an
unnecessary overhead. Without this change the docker installation will
attempt to daemonize the vmpooler application and always requires puma.
2018-07-13 09:35:18 -07:00
Spencer McElmurry
a865e6bd2f (POOLER-34) Ship clone request to ready time to metrics (#277)
* (POOLER-34) Ship clone request to ready time to metrics

Before, we were already capturing this metric but we failed to ship
it anywhere. This ships the appropriate metric as `time_to_ready_state`
Dashboards can be found in grafana.

* Add spec test to ensure metric is being shipped properly on
move_pending_vm_to_ready call.
2018-07-09 16:22:49 -07:00
mchllweeks
2f98b1bd7a
Merge pull request #271 from mattkirby/less_delta_disks
Ensure template deltas are created once
2018-07-09 12:49:08 -07:00
mchllweeks
69f8b21ca8
Merge pull request #270 from mattkirby/no_duplicate_vms_in_pool
Do not run duplicate instances of inventory check for a pool
2018-07-09 12:48:41 -07:00
kirby@puppetlabs.com
70156ba7f7 Do not prepare template when config_template is set 2018-07-02 14:12:17 -07:00
kirby@puppetlabs.com
1b17cceb01 Ensure template deltas are created once
This commit updates how template delta disk creation is evaluated. Without this change template deltas are created for every template on each applicatoin startup. This change updates this behavior to instead run template delta disk creation only once per template configured for a pool. Without this change it is possible to get a template to a state where the XML depth is too great to be read with default settings and the template requires a new clone to resolve.
2018-07-02 09:53:41 -07:00
kirby@puppetlabs.com
df89617fdc Do not run duplicate instances of inventory check for a pool
This commit updates check_pool inventory check to prevent multiple instances from running at once. Without this change the inventory check may run in multiple threads simultaneously.
2018-06-28 20:32:44 -07:00
kirby@puppetlabs.com
3a6e2a5cac (POOLER-31) Expire redis vm key when clone fails
This commit updates pool_manager to expire a redis VM key when a clone fails. Without this change VMs that fail to clone have their metadata left forever.
2018-06-20 17:27:31 -07:00
kirby@puppetlabs.com
3a0f0880e7 (POOLER-112) Ensure a VM is only destroyed once
This commit implements a vm_mutex hash to allow synchronizing VM operations that should only happen once across threads. Without this change pool_manager will try to evaluate or destroy a VM multiple times, which results in an error being thrown by one of the destroy attempts as only one can succeed and a duplication of resources unnecessarily when there are no errors.
2018-06-20 13:40:49 -07:00
kirby@puppetlabs.com
9bb4df7d8e (POOLER-107) Add configuration API endpoint
This commit adds a configuration endpoint to the vmpooler API. Pool
size, and pool template, can be adjusted for pools that are configured
at vmpooler application start time. Pool template changes trigger a pool
refresh, and the new template has delta disks created automatically by
vmpooler.

Additionally, the capability to create template delta disks is added to
the vsphere provider, and this is implemented to ensure that templates
have delta disks created at application start time.

The mechanism used to find template VM objects is simplified to make the flow of logic easier to understand. As an additional benefit, performance of this lookup is improved by using FindByInventoryPath.

A table of contents is added to API.md to ease navigation. Without this change API.md has no table of contents and is difficult to navigate.

Add mutex object for managing pool configuration updates

This commit adds a mutex object for ensuring that pool configuration changes are synchronized across multiple running threads, removing the possibility of two threads attempting to update something at once, without relying on redis data. Without this change this is managed crudely by specifying in redis that a configuration update is taking place. This redis data is left so the REPOPULATE section of _check_pool can still identify when a configuration change is in progress, and prevent a pool from repopulating at that time.

Add wake up event for pool template changes

This commit adds a wake up event to detect pool template changes.
Additionally, GET /config has a template_ready section added to the
output for each pool, which makes clear when a pool is ready to populate
itself.
2018-06-15 10:15:47 -07:00
Samuel Beaulieu
10245321bf (maint) Add the last boot time for each pool
This commit add a redis hash where there is one key per pool, and the
stored value is the last time a VM was booted e.g. the last time
a VM went from 'pending' to 'ready'. This is also displayed in the
API as lastBoot:'2018-03-23 17:43:39 +0000'. The data can then be
used by any external system, in this case our alarming system.
2018-03-28 11:11:49 -07:00
kirby@puppetlabs.com
048ab4433a Remove unnecessary rescue
This commit removes an unnecessary rescue that results in duplicate clone error messages. Without this change clone failures due to unavailable host resources are logged twice. A log message is added to specify the host the VM is running on when migration_limit is not set and migration is disabled. Lastly, when a migration fails it reports the host the VM is running on in addition to the reason for the failed migration.
2018-01-10 12:02:58 -08:00
kirby@puppetlabs.com
0efb79a133 Move provider_hosts to vsphere provider
This commit updates the providers to move provider_hosts under the vsphere provider, which is the only place it's applicable. Methods where redis is passed through are updated to remove this pass through and use the globally available redis object, where applicable. Remove_vmpooler_migration_vm method is not needed and is removed.
2018-01-10 12:02:58 -08:00
kirby@puppetlabs.com
cd979fc24d Move migrate_vm logic to vsphere provider
This commit moves the migrate_vm logic to the vsphere provider. Without
this change migrate_vm has lots of vsphere specific logic in
pool_manager migrate_vm method.
2018-01-10 12:02:58 -08:00
kirby@puppetlabs.com
23242a7b1c Update pool_manager and vsphere tests to support changes in host selection 2018-01-10 12:02:58 -08:00
kirby@puppetlabs.com
ada79e81f4 (QENG-5305) Check cluster utilization once at a time
This commit adds a global provider_hosts concept in order to allow checking cluster utilization once per interval for a given cluster and retain the results, reusing them for an interval, and tracking the least used set of hosts. Without this change each migration and clone operation inspect host utilization and state for each host in the cluster, which is computationally expensive for vsphere.
2018-01-10 12:02:58 -08:00
Samuel
0b5abd9bd3 Fix no implicit conversion to rational from nil (#239)
* Fix no implicit conversion to rational from nil
Before this change if the boottime was nil, the check_ready
loop would exit on Time.now - host['boottime'] with a TypeError
in jruby. The boottime is nil when the power is Off so moving that check
earlier should catch that bug.

* set test data properly
2017-10-17 17:51:02 -05:00
Glenn Sarti
f209c2b830 (GH-226) Respond quickly to VMs being consumed
Previously in commit 9b0e55f959 the looping period was changed from a static
number to a dynamic one depending on load, however this meant that the operation
to refill a pool was slowed down somewhat.  While not a problem under normal
loads, when a pool was quickly consumed, the pool manager may not respond
quickly enough to refill the pool.  This commit:

- Changes the sleep method, to us a helper sleep method that will wakeup
  periodically and evaluate other wakeup events.  This could be used later to
  exist sleep loops when pooler is shutting down to stop blocking threads
- By default the wakeup_period is set to the minimum pool check loop time, thus
  emulating the behaviour prior to commit 9b0e55f959
- Adds tests for the behaviour
2017-09-05 21:41:32 -07:00
mattkirby
d789dfdfc8 Merge pull request #229 from glennsarti/fix-phantom-vms
(maint) Remove phantom VMs that are in Redis but don't exist in provider
2017-08-01 10:38:25 -07:00
Rob Braden
0e05163825 Merge pull request #231 from glennsarti/dynamically-load-providers
(GH-230) Dynamically load VM Providers
2017-07-25 15:08:55 -07:00
Glenn Sarti
b16a2e6e96 (GH-230) Dynamically load VM Providers
Previously, a static list was used to instantiate VM Pooler Provider objects.
This commit changes the loader to instead interrogate the available clases in
the Vmpooler::PoolManager::Provider namespace and then instantiate from there.

This means class names are not case sensitive and that VM Providers can now be
dynamically loaded from other sources such as gems in the LOADPATH.  No tests
were added as this behaviour is exercised in the execute! tests already.
2017-07-19 12:52:32 -07:00
Glenn Sarti
e55a8825af (maint) Remove phatom VMs and ensure inventory is successful
Previously, if inventory failed for some reason, it would return an incomplete
set of VMs which could then cause the pool to perform off behaviours such as
fill the pool high than it should, or remove VMs which exist.  Also, if the
redis cache of VMs in a pool had a VM but it did not actually exist in the
inventory it would never be removed.

This commit:
- Immediately exits the check_pool if an error occurs during inventory
  collection
- Will mark a VM as completed if it exists in Redis, but does not exist in
  inventory
- Adds tests for these behaviours
2017-07-18 16:53:16 -07:00
Glenn Sarti
9c93d2534f (maint) Fix rubocop offenses
This commit fixes the many rubocop offenses.  Also modifies the rubocop
settings:

- Set max method params higher than the default of 5
- Ignore Style/GuardClause. In some cases it's eaiser to read without the guard
- Renamed a cop
2017-07-18 15:26:27 -07:00
Rob Braden
9b0e55f959 Merge pull request #227 from glennsarti/add-check-skew
(GH-226) Use a dynamic pool_check loop period
2017-07-12 23:19:51 -07:00
Glenn Sarti
30946fab8e (GH-226) Use a dynamic pool_check loop period
Previously the check_pool would always check the pool every 5 seconds, however
with a large number of pools, this can cause resource issues inside the
providers.  This commit:
- Introduces a dynamic check_pool period which increases during stability and
  decreases when the pool is being change in an important way
- Surfaces the settings as global config defaults but can also be set on a per
  pool basis
- Adds defaults to emulate the current behaviour
- Unit tests for the new behaviour
2017-07-12 17:13:21 -07:00
kirby@puppetlabs.com
c750657c6f Update find_least_used_compatible_host to specify pool
This commit updates find_least_used_compatible_host method to specify
the pool name when evaluating a VM for migration. Without this change VM
migration fails with a wrong number of arguments error. Pool_manager
test references are updated to reflect the change.
2017-06-28 08:01:47 -07:00