Commit graph

249 commits

Author SHA1 Message Date
Jake Spain
c88f5d8b5a
Fix missing function when moving metrics from pool_manager to api/v1
Fix dependency function component_to_test for jenkins metrics that was not copied from pool_manager to api/v1 in https://github.com/puppetlabs/vmpooler/pull/455
2021-08-25 09:27:45 -04:00
Jake Spain
c06cfc28f7
Add operation label to user metric and move from manager to api
This adds an "operation" label to the user metrics and moves incrementing from the manager to api, so that the user metrics show when resources are allocated, as well as destroyed. Previously, user metrics were only updated upon destroying a resource.

I think its better suited to increment the metric as part of the api instead of the pool_manger, because it's expected to do so when a user successfully checks out or deletes a VM, but can be problematic when doing so in the provider since it can clone VMs before actually being checked out by a user.
2021-08-13 11:23:10 -04:00
7a1fc24685
Fix regex for ondemand instances
It appears we renamed `/ondemand/` to `/ondemandvm/` at some point and,
as a result, have not been stripping hostnames from that endpoint's
metrics. This has caused issues with metrics collection due a very high
cardinality.
2021-02-02 11:11:17 -05:00
Belen Bustamante
2a6d610b7a Rubocop fix 2020-10-23 17:41:02 -07:00
suckatrash
b2ac53fa76
(DIO-1059) Optionally add snapshot tuning params at clone time 2020-10-16 16:29:29 -07:00
kirby@puppetlabs.com
0ad069b958 (POOLER-191) Add checking for running instances that are not in active
This change adds detection of running instances that are in a running
queue, but have no data in a active queue for the same pool. When this
happens a machine will live forever, impacting the running count, and
preventing the machine from being killed. Without this change running
instances that are not marked as active will live forever.
2020-10-13 14:17:51 -07:00
Belen Bustamante
08132f75bd
Add healthcheck endpoint, spec testing 2020-09-22 12:25:26 -07:00
Samuel Beaulieu
a35d66c606 (POOLER-184) Pool manager retry and exit on failure
Adding a reconnect retry for redis, which by default would retry 10 times,
for a total wait time of ~80 seconds
2020-09-02 11:38:40 -05:00
Belen Bustamante
07d1ca2b2c Add promstats component check 2020-08-29 09:40:29 -07:00
John O'Connor
18988ddc3c (MAINT) Fix Staledns error counter
This was logging the hostname instead of the poolname.
2020-08-21 19:03:19 +01:00
John O'Connor
3050e99fd6 (MAINT) Normalise all tokens for stats
/api/v1/token, /token, /img and /lib endpoints need to be normalised the
same way that /vm and /ondemandvm endpoints are handled.
2020-08-21 17:24:56 +01:00
kirby@puppetlabs.com
c6fdc8d6fd (POOLER-186) Fix template alias evaluation with backend weight of 0
This change fixes template alias evaluation to ensure that the correct
data is set when generating on demand requests for pools that have a
backend weight configured for a value of 0. Without this change vmpooler
will return an empty selection in api for template alias evaluation.

To support this change tests are added that first reproduced the
failure, and then verified that it is resolved with the addition of the
patch. Additionally, test coverage is added to ensure that code paths
that include pickup gem usage are covered.
2020-08-05 16:33:10 -07:00
John O'Connor
0a6ad896f5 (MAINT) Clarity refactor of Prom Stats code
Introducing the Prometheus Stats code into ABS showed that the Clarity
could be improved a bit with better variable naming, some refactoring
to reduce repitition and documenting the Metrics table itself.

Filtering these changes back to the vmpooler code base.
2020-08-05 14:57:48 +01:00
kirby@puppetlabs.com
ef4ca261d0 Fix vmpooler folder purging
This commit updates folder purging references to ensure that provider
name references are referring to the named provider, rather than the
provider type. Without this change folder purging fails because it
cannot identify target folders.
2020-08-03 15:08:03 -07:00
John O'Connor
85ff3f7022 (MAINT) Add optional API Request Logging
This was partially an exercise to use middleware, but also to enable
basic logging for the API by logging all the API calls to the logger.
2020-06-29 19:56:29 +01:00
John O'Connor
a21d8c5642 (POOLER-178) Target Stats for api & manager
Ensure that the correct stats are registered for the Manager and the api
respectively. E.g. all checkout counters are for the api only, whereas
clone times belong to the manager.

Also new ondemand functionality stats weren't registered, so add these
along with missing delete stats.
2020-06-29 19:54:28 +01:00
John O'Connor
8ed8c43970 (POOLER-160) Revise Metrics Classwork
Review changes suggested to revise the Metrics related files into a more
logical class structure.

Also fixup grammar typos in docs strings and any trailing metrics that
have been recently added to vmpooler.
2020-06-29 19:54:28 +01:00
John O'Connor
cb955a1bed (POOLER-177) Filter hostname from API Paths
Use the example provided in the Ruby Client to provide a customised
collector appropriate to log all calls to the API. The customised
filtering is used to replace individual node names and templates
for the /vm and request ID's for the /ondemand endpoints.

This module was failing our rubocop checks so have updated it since
it now forms part of vmpooler.

Separate trapping for litmus jobs is also included so that they don't
interfere with stats from the jenkins pipelines.
2020-06-29 19:53:59 +01:00
John O'Connor
b6dcd77228 (POOLER-170) Revise vmpooler usage stats
Break down the usage stats into smaller groups so as to manage the
number of stat lines collected for Prometheus.

This may need some further revision to filter out Litmus stats, or
otherwise collect litmus usage information.
2020-06-29 19:52:15 +01:00
John O'Connor
72564de4b4 (POOLER-160) Revise connection metrics
The redis pooler connection metric used "metric_prefix" which is
misleading, so split this into connpool_type and connpool_provider.
Also remove some earlier jruby compatibility code to reduce
rebase conflicts when this is rebased on top of Matt's changes.
2020-06-29 19:52:14 +01:00
John O'Connor
bbd76bde4c (POOLER-160) Add Prometheus to pooler startup
This is a re-architect of the vmpooler initialisation code to:
1. Allow an API service for both manager and the api
2. Add the Prometheus endpoints to the web service.
   Needed to change the way the Rack Service is started as instantiating
   using ".New" leads to a failure to initialise the http Stats
   collection.
3. Selectively load the pooler api and/or Prometheus endpoints.
4. Rework API Spec tests for revised API loading. Needed to tidy up the
   initialisation and perform a reset! after each test to avoid "leaks"
   and dependencies between the tests.
2020-06-29 19:52:14 +01:00
John O'Connor
ffab7def9e (POOLER-160) Add Prometheus Stats Feeds
Add a new Prometheus class as an additional stats feed along with the
existing feeds.

Move the metrics initialisation code into its own class and sub-class
the individual metrics implementations under this.
2020-06-26 21:37:22 +01:00
mattkirby
a2a3bb7dfd
Merge pull request #382 from puppetlabs/pooler-167
(POOLER-167) Allow for network configuration at vm clone time
2020-06-23 14:30:25 -07:00
Belen Bustamante
3dfd70fa0e Allow for network configuration at vm clone time 2020-06-12 11:56:41 -07:00
Samuel
4ecd5dea51
(POOLER-174) Reduce duplicate of on demand code introduced in POOLER-158 (#383)
* (POOLER-174) Reduce duplicate of on demand code introduced in POOLER-158
refactored every parsing of request of type 'pool_alias:pool:count' into a
utility class, that is used by pool_manager and the api v1 class
* add some metrics to the od request generation
* fix rubocop offenses, we are now friends
2020-06-11 12:39:34 -05:00
kirby@puppetlabs.com
811fd8b60f (POOLER-158) Add capability to provision VMs on demand
This change adds a capability to vmpooler to provision instances on
demand. Without this change vmpooler only supports retrieving machines
from pre-provisioned pools.

Additionally, this change refactors redis interactions to reduce round
trips to redis. Specifically, multi and pipelined redis commands are
added where possible to reduce the number of times we are calling redis.

To support the redis refactor the redis interaction has changed to
leveraging a connection pool. In addition to offering multiple
connections for pool manager to use, the redis interactions in pool
manager are now thread safe.

Ready TTL is now a global parameter that can be set as a default for all
pools. A default of 0 has been removed, because this is an unreasonable
default behavior, which would leave a provisioned instance in the pool
indefinitely.

Pool empty messages have been removed when the pool size is set to 0.
Without this change, when a pool was set to a size of 0 the API and pool
manager would both show that a pool is empty.
2020-06-03 14:00:04 -07:00
Samuel Beaulieu
6304743240 (POOLER-166) Check for stale dns records 2020-05-29 12:05:54 -05:00
Belen Bustamante
477f270b52 Enable support for multiple user objects 2020-05-28 10:42:47 -07:00
Brandon High
5fac5684a9
Merge pull request #371 from puppetlabs/POOLER-161
(POOLER-161) Fix extending vm lifetime when max lifetime is set
2020-04-15 13:27:27 -07:00
Samuel Beaulieu
953aa68907 (POOLER-161) Fix extending vm lifetime when max lifetime is set
Before this PR, the current running time was being inspected to decide if the
vm lifetime could be extended. But since vm lifetime is absolute and not relative
this check is now removed.
2020-04-15 13:17:14 -05:00
kirby@puppetlabs.com
68ecb7a3a4 (POOLER-165) Fix purge_unconfigured_folders
This commit fixes the purge_unconfigured_folders feature to ensure that it can successfully identify folders and instances that are no longer used. Without this change the feature does not work as advertised.
2020-04-07 09:39:59 -07:00
kirby@puppetlabs.com
283dea62a7 (POOLER-156) Detect redis connection failures
This commit adds detection for redis connection failures to pool_manager. When a connection fails the error will be raised to executeforcing the connection to be re-established. Without this change, when a redis connection fails, it generates a redis connection error, which is swallowed by a rescue for StandardError, preventing the manager application component from recovering in the case of a redis connection failure.
2020-03-17 11:17:52 -07:00
Brandon High
daa1a99073
Correct keyword arguments to traverse in create_folder
This commit fixes this call to `dc.vmFolder.traverse` to simply pass in
the arguements instead of assigning to variables that are discarded.
2020-03-05 17:10:22 -08:00
kirby@puppetlabs.com
8bb89b604d (POOLER-157) Add extra_config option to vmpooler
This commit adds the extra_config option to vmpooler to allow specifying additional configuration files to load from. Without this change vmpooler does not offer a mechanism to provide additional configuration files for the application.
2020-03-04 17:19:24 -08:00
Brandon High
82dae7d04c
Merge pull request #353 from mattkirby/vmpooler_flush
(POOLER-153) Add endpoint for resetting a pool
2020-02-14 09:53:47 -08:00
kirby@puppetlabs.com
52b60b074c (POOLER-153) Add endpoint for resetting a pool
This commit adds a capability to vmpooler to reset a pool, deleting its ready and pending instances and replacing them with fresh ones. Without this change vmpooler does not offer a mechanism to reset a pool without also changing its template.
2020-02-13 11:59:44 -08:00
mattkirby
d0257e39f7
Merge pull request #348 from Secure-24/issue_205
Support nested host folders in find_cluster()
2019-12-12 11:15:26 -08:00
Samuel
ae10bd4e22 (POOLER-123) Implement a max TTL (#349)
* (POOLER-123) Implement a max TTL

Before this change, we could checkout a vm and set the lifetime to a
very high number which would esssentially keep the vm running forever.
Now implementing a config setting max_lifetime_upper_limit which enforces
a maximum lifetime in hours both for initial checkout and extending a
running vm

* (POOLER-123) Improve PUT vm endpoint error messaging

Prior to this commit the PUT vm endpoint didn't give any useful
information about why a user's request failed.

This commit updates PUT to output a more helpful set of error messages
in the `failure` key that gets returned in the JSON response.

* (POOLER-123) Update max_lifetime_upper_limit key

This commit switches the max_lifetime_upper_limit key from being a
symbol to being a string, which is what the config hash seems to contain.

* (maint) Add option to disable Redis persistence in docker-compose

This commit is just a handy little command override to the redis
container to prevent persistence.
2019-12-05 09:35:30 -07:00
Sean Millichamp
f6fdfe42d7 Support nested host folders in find_cluster()
Search the root and any subfolders for cluster or host resources.
2019-11-26 13:48:53 -05:00
Brandon High
019ed021b0
(QENG-7530) Add check for unique hostnames
Prior to this commit the pooler had no awareness of the complete set of
hostnames that are currently in use. This meant that it was possible to
allocate the same hostname twice, which would result in the original
host with that hostname becoming unreachable.

This commit adds a check for the existence of the
`vmpooler__vm__<hostname>` key before attempting to  clone the vm.
This should prevent duplicate hostnames.

If the hostname is already taken, `_clone_vm` will retry with a new
random hostname multiple times before raising an exception.
2019-11-04 15:53:05 -08:00
Brandon High
625472df35
(QENG-7530) Fix hostname_shorten regex
Prior to this commit the hostname_shorten regex wouldn't match the
updated human readable hostnames because they contain dashes.
This commit updates the regex to capture dashes in the hostname, and
adds a few specs to verify that behavior.
2019-11-01 10:30:56 -07:00
kirby@puppetlabs.com
30bf2074fe (POOLER-150) Synchronize checkout operations for API
This commit adds a shared mutex to vmpooler API so that checkout requests can be synchronized across threads. Without this change it is possible in some scenarios for vmpooler to allocate the same SUT to different API requests for a VM.
2019-10-21 14:24:34 -07:00
kirby@puppetlabs.com
267772d8eb (POOLER-147) Fix create_linked_clone pool option
This commit updates the create_linked_clone pool option to correctly detect when linked clones have been set at a pool level. Without this change a pool setting create_linked_clone to false is not interpreted correctly, and a linked clone is created if possible.
2019-08-22 14:32:06 -07:00
kirby@puppetlabs.com
9c5a0d114b (POOLER-142) Add running host to vm API data
This change adds the running host for a VM to the API data available via /vm/hostname. Without this change the running host would be logged to vmpooler log, but not available any other way. Additionally, the data will specify if a machine has been migrated. Without this change parent host data for a vmpooler machine is not available via the vmpooler API.
2019-08-21 12:43:06 -07:00
kirby@puppetlabs.com
09a382a10f Make it possible to disable linked clones
This commit adds a new configuration parameter to allow setting whether to create linked clones on a global, or per pool basis. Without this change vmpooler would always attempt to create linked clones. The default behavior of creating linked clones is preserved.
2019-08-06 14:29:31 -07:00
Spencer McElmurry
98a547b807 (POOLER-143) Add clone_target config change to API
This allows the user to change the cluster in which the targeted pool
will clone to. Upon configuration change, the thread will wake up and
execute the change within 1 second.
2019-07-29 08:53:27 -07:00
kirby@puppetlabs.com
d6e948d34d (POOLER-140) Ensure a VM is alive at checkout
This commit duplicates the vm_ready? check to the API layer to allow for API to validate that a VM is alive at checkout. Without this change API relies upon the checks in pool_manager validating pools. This change should allow for additional insight into whether a machine is in a ready state and resopnding at checkout time.
2019-07-17 12:03:55 -07:00
Samuel Beaulieu
8eb15f8d10 (maint) Optimize the status api using redis pipeline
Before this change looping over many pools would query the redis backend
for each pool, leading in slow response from the backend for configurations
with many pools (60+)
Changed the requests to use redis pipelines https://redis.io/topics/pipelining
This is supported since the beginning, so will not force any redis update for
users. The pipeline method runs the queries in batches and we need to loop
over the result and reduces the number of requests to redis by N=number of
pools in the configuration.
2019-04-18 13:24:40 -05:00
Spencer McElmurry
ce6ea1662d
Merge pull request #324 from puppetlabs/QENG-7201
(QENG-7201) Vmpooler pool statistic endpoint optimization
2019-04-17 11:56:48 -05:00
Samuel Beaulieu
714a9edf5e (QENG-7201) Adding docs and tests 2019-04-15 13:09:53 -05:00