admin: fix manual job run to use scheduler dispatch with capacity management and retry (#8720)

RunPluginJobTypeAPI previously executed proposals with a naive sequential loop that called ExecutePluginJob once per proposal. This had two bugs:

1. Double-lock: RunPluginJobTypeAPI held pluginLock while calling ExecutePluginJob, which tried to re-acquire the same lock for every job in the loop.
2. No capacity management: proposals were fired directly at workers without reserveScheduledExecutor, so every job beyond the worker concurrency limit received an immediate at_capacity error with no retry or backoff.

Fix: add Plugin.DispatchProposals, which reuses dispatchScheduledProposals (the same code path the scheduler loop uses) with executor reservation, configurable concurrency, and per-job retry with backoff. RunPluginJobTypeAPI now acquires pluginLock once and then calls DispatchPluginProposals, a thin AdminServer wrapper.

Co-authored-by: Anton Ustyugov <anton@devops>
@@ -1284,6 +1284,23 @@ func (s *AdminServer) RunPluginDetectionWithReport(
 	return s.plugin.RunDetectionWithReport(ctx, jobType, clusterContext, maxResults)
 }
 
+// DispatchPluginProposals dispatches a batch of proposals using the same
+// capacity-aware dispatch logic as the scheduler loop (executor reservation with
+// backoff, per-job retry on transient errors). The plugin lock must already be
+// held by the caller.
+func (s *AdminServer) DispatchPluginProposals(
+	ctx context.Context,
+	jobType string,
+	proposals []*plugin_pb.JobProposal,
+	clusterContext *plugin_pb.ClusterContext,
+) (successCount, errorCount, canceledCount int, err error) {
+	if s.plugin == nil {
+		return 0, 0, 0, fmt.Errorf("plugin is not enabled")
+	}
+	sc, ec, cc := s.plugin.DispatchProposals(ctx, jobType, proposals, clusterContext)
+	return sc, ec, cc, nil
+}
+
 // ExecutePluginJob dispatches one job to a capable worker and waits for completion.
 func (s *AdminServer) ExecutePluginJob(
 	ctx context.Context,