It appears we renamed `/ondemand/` to `/ondemandvm/` at some point and,
as a result, have not been stripping hostnames from that endpoint's
metrics. This has caused issues with metrics collection due to very high
cardinality.
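
A minimal sketch of the kind of path normalization involved, assuming metric labels are derived from the request path. The `metric_label_for` helper and the prefix list are hypothetical and are not taken from the vmpooler source.

```ruby
# Hypothetical list of route prefixes whose trailing segments are per-VM
# values (such as hostnames) and must not end up in a metric label.
STRIP_AFTER_PREFIXES = %w[/vm/ /ondemandvm/].freeze

# Returns the path truncated at the first matching prefix, without the
# trailing slash, or the path unchanged when no prefix matches.
def metric_label_for(path)
  STRIP_AFTER_PREFIXES.each do |prefix|
    idx = path.index(prefix)
    return path[0, idx + prefix.length - 1] if idx
  end
  path
end
```

With this in place, every request under `/ondemandvm/` maps to the same label, which keeps label cardinality bounded by the number of routes rather than the number of hostnames.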
This change removes usage of redis multi from the API. Without this change, redis usage can fail because no connection pool is used, so another worker may try to use the same backend object.
Checkout metric counters were recorded against the template name and not
the actual pool used, which prevents us from counting the checkouts in
the pixa4 pools, for example.
Note - this is a re-run - the last commit on this file overwrote
the change.
This change fixes an error in validate_token where hset does not have the correct number of parameters. Without this change validate_token causes an error.
This change sets a VM as running in redis as soon as it is checked out. Without this change, when allocating several instances, a machine that has been allocated for a checkout but not yet marked as active can be flagged as running when it should not be by the check added in POOLER-191. As a result, the machine may be destroyed during checkout by pool_manager if several instances are being allocated.
Additionally, redis multi is added for vm checkout operations to minimize the round trips to redis during a checkout operation. Without this addition each VM checkout causes several redis interactions.
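
A rough sketch of a checkout grouped into a single MULTI block, assuming `redis` is a redis-rb style client whose `multi` yields a transaction object. The `checkout_vm` helper and the key names are illustrative assumptions, not the actual vmpooler implementation.

```ruby
# Sketch only: marks a VM as running and active in one transaction so that
# no other process can observe it as running but not yet active.
def checkout_vm(redis, pool, vm, token_user: nil, now: Time.now)
  redis.multi do |tx|
    # Move the VM from the ready set to the running set
    tx.smove("vmpooler__ready__#{pool}", "vmpooler__running__#{pool}", vm)
    # Record the checkout time in the pool's active hash and on the VM itself
    tx.hset("vmpooler__active__#{pool}", vm, now.to_s)
    tx.hset("vmpooler__vm__#{vm}", 'checkout', now.to_s)
    # Optionally record who checked the VM out
    tx.hset("vmpooler__vm__#{vm}", 'token:user', token_user) if token_user
  end
end
```

Because the commands are queued and sent together, the checkout costs one round trip instead of one per command.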
Checkout metric counters were recorded against the template name and not
the actual pool used, which prevents us from counting the checkouts in
the pixa4 pools, for example.
* (maint) Speed up the tagging method
While looking at the instrumentation data for the ABS queue processor,
I noticed a lot of time spent in the HTTP PUT method, which in the code
was easy to isolate, as it is only used via the vmpooler tagging functions
i.e. the API /vm/foobar/ with 'tag' key-value pairs.
While I'm not sure the original hset() calls make sense to me, there was an easy
way to speed them up by using pipelined. I would expect a very good speed
increase with this turned on.
* tag rubocop to <1.0 because the 1.0 version returns 130 new offenses
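
The tagging speedup described above can be sketched as follows, assuming a redis-rb style client whose `pipelined` yields a pipeline object. The `apply_tags` helper and the key layout are assumptions for illustration only.

```ruby
# Sketch only: writes all tags for a VM in one pipelined batch rather than
# issuing one HSET round trip per tag.
def apply_tags(redis, hostname, tags)
  redis.pipelined do |pipeline|
    tags.each do |name, value|
      pipeline.hset("vmpooler__vm__#{hostname}", "tag:#{name}", value)
    end
  end
end
```

A call such as `apply_tags(redis, 'foobar', 'project' => 'abs', 'user' => 'jdoe')` would queue two HSET commands and send them together.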
This change adds detection of running instances that are in a running
queue, but have no data in an active queue for the same pool. When this
happens a machine will live forever, impacting the running count, and
preventing the machine from being killed. Without this change running
instances that are not marked as active will live forever.
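
A minimal sketch of such a check, assuming the running queue is a redis set and the active data is a redis hash keyed by VM name. The `running_without_active` helper and the key names are hypothetical.

```ruby
# Sketch only: returns the VMs that are in the pool's running set but have
# no entry in the pool's active hash.
def running_without_active(redis, pool)
  redis.smembers("vmpooler__running__#{pool}").reject do |vm|
    redis.hexists("vmpooler__active__#{pool}", vm)
  end
end
```

The VMs returned are the ones that would otherwise never be cleaned up, so a caller could move them to a queue for destruction.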
This update includes two key benefits:
1. Spans will be named based on their route instead of the full path
info thanks to https://github.com/open-telemetry/opentelemetry-ruby/pull/415
2. Helper methods were added to the configurator to simplify setting
service.name and service.version
This change utilizes OpenTelemetry's automatic instrumentation to add
distributed tracing capabilities to VMPooler. This is a non-breaking
change as traces are processed in noop mode by default.
This change fixes template alias evaluation to ensure that the correct
data is set when generating on demand requests for pools that have a
backend weight configured with a value of 0. Without this change vmpooler
will return an empty selection in the API for template alias evaluation.
To support this change tests are added that first reproduced the
failure, and then verified that it is resolved with the addition of the
patch. Additionally, test coverage is added to ensure that code paths
that include pickup gem usage are covered.
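
To illustrate the zero-weight case in template alias evaluation, here is a sketch of a weighted selection that falls back to a uniform choice when no pool has a positive weight. It uses plain Ruby rather than the pickup gem, and `select_backend` is a hypothetical helper, not the patched vmpooler code.

```ruby
# Sketch only: picks one pool name from a { name => weight } hash.
def select_backend(weights, rng: Random.new)
  weights = weights.transform_values(&:to_i)
  weighted = weights.select { |_pool, weight| weight.positive? }
  # No positive weights: choose uniformly instead of returning nothing
  return weights.keys.sample(random: rng) if weighted.empty?

  pick = rng.rand(weighted.values.sum) # integer in 0...total weight
  weighted.each do |pool, weight|
    return pool if pick < weight

    pick -= weight
  end
end
```

With `{ 'pool-a' => 0, 'pool-b' => 0 }` the sketch still returns one of the two pools, whereas a selection restricted to positive weights would have nothing to choose from.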
Introducing the Prometheus Stats code into ABS showed that the clarity
could be improved a bit with better variable naming, some refactoring
to reduce repetition and documenting the Metrics table itself.
Filtering these changes back to the vmpooler code base.
This commit updates folder purging references to ensure that provider
name references are referring to the named provider, rather than the
provider type. Without this change folder purging fails because it
cannot identify target folders.
This commit updates the method used for checking the status of an
ondemand request to ensure that if multiple aliases are used to fulfill
a request that they are correctly presented as a single pool again when
everything is ready. Without this change it is possible for only one
group of an aliased pool to show up in pending or completed requests.
Ensure that the correct stats are registered for the Manager and the api
respectively. E.g. all checkout counters are for the api only, whereas
clone times belong to the manager.
Also new ondemand functionality stats weren't registered, so add these
along with missing delete stats.
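
One way to picture per-component registration is a metrics table in which each entry names the components it applies to. The table below is a hypothetical sketch: the metric names, the `torun` key, and `metrics_for` are assumptions for illustration.

```ruby
# Hypothetical metrics table: each entry lists the components that should
# register it.
METRICS_TABLE = {
  checkout: { torun: %i[api], docstring: 'Pool checkout counts' },
  delete: { torun: %i[api], docstring: 'VM delete counts' },
  ondemand_request: { torun: %i[api], docstring: 'On demand request counts' },
  clone: { torun: %i[manager], docstring: 'VM clone times' }
}.freeze

# Returns the names of the metrics that the given component should register.
def metrics_for(component)
  METRICS_TABLE.select { |_name, spec| spec[:torun].include?(component) }.keys
end
```

Here `metrics_for(:api)` yields the checkout, delete and on demand entries, while `metrics_for(:manager)` yields only the clone entry.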