From unknown Fri Jun 20 05:30:24 2025
From: bug#43773 <43773@debbugs.gnu.org>
To: bug#43773 <43773@debbugs.gnu.org>
Subject: Status: guix offload scheduler/load balancer throttles itself
Reply-To: bug#43773 <43773@debbugs.gnu.org>
Date: Fri, 20 Jun 2025 12:30:24 +0000

retitle 43773 guix offload scheduler/load balancer throttles itself
reassign 43773 guix
submitter 43773 Maxim Cournoyer
severity 43773 normal
tag 43773 patch
thanks

From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 02 23:03:49 2020
From: Maxim Cournoyer
To: bug-guix
Subject: guix offload scheduler/load balancer throttles itself
Date: Fri, 02 Oct 2020 23:05:20 -0400
Message-ID: <875z7sm2kv.fsf@gmail.com>

Hello,

Guix offload monitors the load of the offload machine, and waits
patiently until the load comes back down below a pre-determined level
(normalized 2.0 level, it seems) before starting the next build, even
when there is a single offload build machine involved. This is
inefficient and causes it to throttle itself.

Idea of an improvement: it should choose the offload machine with the
less load (already the case, I believe), and not block waiting for the
load to go down before starting a build.

Maxim

From debbugs-submit-bounces@debbugs.gnu.org Sat Oct 03 23:21:59 2020
From: Maxim Cournoyer
To: 43773@debbugs.gnu.org
Cc: Maxim Cournoyer
Subject: [PATCH] offload: Improve load normalization and configurability.
Date: Sat, 3 Oct 2020 23:21:12 -0400
Message-Id: <20201004032112.5916-1-maxim.cournoyer@gmail.com>

Fixes .

The computed normalized load was previously obtained by dividing the load
average as found in /proc/loadavg by the number of parallel builds defined for
a build machine.

This normalized didn't allow to compare machines with different number of
cores, as the load average reported by can be as high as the number of cores;
thus comparing that value to a fixed threshold of 2.0 would mean machines with
multiple cores were more likely to be flagged as overloaded compared to single
core machines.

This can be fixed by normalizing using the available number of cores instead
of the number of parallel jobs.

* guix/scripts/offload.scm ()[overload-threshold]: New field.
(node-load): Modify to return a normalized load value between 0 and 1, taking
into account the number of cores available.
(normalized-load): Remove procedure.
(report-load): New procedure.
(choose-build-machine): Adjust to use the modified 'node-load' and the new
'report-load' and 'build-machine-overload-threshold' procedures.
(check-machine-status): Adjust.
* doc/guix.texi (Daemon Offload Setup): Document the offload scheduler and the
new 'overload-threshold' field.
---
 doc/guix.texi            | 30 +++++++++++++++++++++-
 guix/scripts/offload.scm | 54 ++++++++++++++++++++++++----------------
 2 files changed, 62 insertions(+), 22 deletions(-)

diff --git a/doc/guix.texi b/doc/guix.texi
index a6260a12aa..1d5adbeb63 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -1081,7 +1081,28 @@ architecture natively supports it, via emulation (@pxref{Transparent
 Emulation with QEMU}), or both. Missing prerequisites for the build are
 copied over SSH to the target machine, which then proceeds with the
 build; upon success the output(s) of the build are copied back to the
-initial machine.
+initial machine. The offload facility comes with a basic scheduler that
+attempts to select the best machine. The best machine is chosen among
+the available machines based on criteria such as:
+
+@enumerate
+@item
+The availability of a build slot. A build machine can have as many
+build slots (connections) as the value of the @code{parallel-builds}
+field of its @code{build-machine} object.
+
+@item
+Its relative speed, as defined via the @code{speed} field of its
+@code{build-machine} object.
+
+@item
+Its load. The normalized machine load must be lower than a threshold
+value, configurable via the @code{overload-threshold} field of its
+@code{build-machine} object.
+
+@item
+Disk space availability. More than a 100 MiB must be available.
+@end enumerate
 
 The @file{/etc/guix/machines.scm} file typically looks like this:
 
@@ -1185,6 +1206,13 @@ when transferring files to and from build machines.
 File name of the Unix-domain socket @command{guix-daemon} is listening
 to on that machine.
 
+@item @code{overload-threshold} (default: @code{0.6})
+The load threshold above which a potential offload machine is
+disregarded by the offload scheduler. The value roughly translates to
+the total processor usage of the build machine, ranging from 0.0 (0%) to
+1.0 (100%). It can also be disabled by setting
+@code{overload-threshold} to @code{#f}.
+
 @item @code{parallel-builds} (default: @code{1})
 The number of builds that may run in parallel on the machine.
 
diff --git a/guix/scripts/offload.scm b/guix/scripts/offload.scm
index 3dc8ccefcb..a5fe98b675 100644
--- a/guix/scripts/offload.scm
+++ b/guix/scripts/offload.scm
@@ -88,6 +88,10 @@
                    (default 3))
   (daemon-socket   build-machine-daemon-socket    ; string
                    (default "/var/guix/daemon-socket/socket"))
+  ;; A #f value tells the offload scheduler to disregard the load of the build
+  ;; machine when selecting the best offload machine.
+  (overload-threshold build-machine-overload-threshold ; inexact real between
+                      (default 0.6))                   ; 0.0 and 1.0 | #f
   (parallel-builds build-machine-parallel-builds  ; number
                    (default 1))
   (speed           build-machine-speed            ; inexact real
@@ -391,30 +395,34 @@ of free disk space on '~a'~%")
   (* 100 (expt 2 20)))                            ;100 MiB
 
 (define (node-load node)
-  "Return the load on NODE. Return +∞ if NODE is misbehaving."
+  "Return the load on NODE, a normalized value between 0.0 and 1.0. The value
+is derived from /proc/loadavg and normalized according to the number of
+logical cores available, to give a rough estimation of CPU usage. Return
+1.0 (fully loaded) if NODE is misbehaving."
   (let ((line (inferior-eval '(begin
                                 (use-modules (ice-9 rdelim))
                                 (call-with-input-file "/proc/loadavg"
                                   read-string))
-                             node)))
-    (if (eof-object? line)
-        +inf.0 ;MACHINE does not respond, so assume it is infinitely loaded
+                             node))
+        (ncores (inferior-eval '(begin
+                                  (use-modules (ice-9 threads))
+                                  (current-processor-count))
+                               node)))
+    (if (or (eof-object? line) (eof-object? ncores))
+        1.0 ;MACHINE does not respond, so assume it is fully loaded
         (match (string-tokenize line)
           ((one five fifteen . x)
-           (string->number one))
+           (let ((load (/ (string->number one) ncores)))
+             (if (> load 1.0)
+                 1.0
+                 load)))
           (x
-           +inf.0)))))
-
-(define (normalized-load machine load)
-  "Divide LOAD by the number of parallel builds of MACHINE."
-  (if (rational? load)
-      (let* ((jobs (build-machine-parallel-builds machine))
-             (normalized (/ load jobs)))
-        (format (current-error-port) "load on machine '~a' is ~s\
- (normalized: ~s)~%"
-                (build-machine-name machine) load normalized)
-        normalized)
-      load))
+           1.0)))))
+
+(define (report-load machine load)
+  (format (current-error-port)
+          "normalized load on machine '~a' is ~,2f~%"
+          (build-machine-name machine) load))
 
 (define (random-seed)
   (logxor (getpid) (car (gettimeofday))))
@@ -472,11 +480,15 @@ slot (which must later be released with 'release-build-slot'), or #f and #f."
           (let* ((session (false-if-exception (open-ssh-session best
                                                                 %short-timeout)))
                  (node (and session (remote-inferior session)))
-                 (load (and node (normalized-load best (node-load node))))
+                 (load (and node (node-load node)))
+                 (threshold (build-machine-overload-threshold best))
                  (space (and node (node-free-disk-space node))))
+            (when load (report-load best load))
             (when node (close-inferior node))
             (when session (disconnect! session))
-            (if (and node (< load 2.) (>= space %minimum-disk-space))
+            (if (and node
+                     (or (not threshold) (< load threshold))
+                     (>= space %minimum-disk-space))
                 (match others
                   (((machines slots) ...)
                    ;; Release slots from the uninteresting machines.
@@ -708,13 +720,13 @@ machine."
                       (free (node-free-disk-space inferior)))
                  (close-inferior inferior)
                  (format #t "~a~% kernel: ~a ~a~% architecture: ~a~%\
- host name: ~a~% normalized load: ~a~% free disk space: ~,2f MiB~%\
+ host name: ~a~% normalized load: ~,2f~% free disk space: ~,2f MiB~%\
  time difference: ~a s~%"
                         (build-machine-name machine)
                         (utsname:sysname uts) (utsname:release uts)
                         (utsname:machine uts)
                         (utsname:nodename uts)
-                        (normalized-load machine load)
+                        load
                         (/ free (expt 2 20) 1.)
                         (- time now))))))))

-- 
2.28.0

From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 04 03:59:31 2020
Date: Sun, 4 Oct 2020 09:59:25 +0200
From: Andreas Enge
To: Maxim Cournoyer
Subject: Re: bug#43773: [PATCH] offload: Improve load normalization and configurability.
Message-ID: <20201004075925.GA1448@jurong>
References: <875z7sm2kv.fsf@gmail.com> <20201004032112.5916-1-maxim.cournoyer@gmail.com>
In-Reply-To: <20201004032112.5916-1-maxim.cournoyer@gmail.com>
Cc: 43773@debbugs.gnu.org

Hello Maxim,

On Sat, Oct 03, 2020 at 11:21:12PM -0400, Maxim Cournoyer wrote:
> Fixes .
>
> The computed normalized load was previously obtained by dividing the load
> average as found in /proc/loadavg by the number of parallel builds defined for
> a build machine.
>
> This can be fixed by normalizing using the available number of cores instead
> of the number of parallel jobs.

this looks like a good change to me; actually I ended up encoding the number
of cores in the "speed" field instead, which is a dirty hack around the
core problem.

Andreas

From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 05 01:38:33 2020
Date: Mon, 05 Oct 2020 01:38:25 -0400
Message-Id: <87r1qdjkq6.fsf@gmail.com>
To: control@debbugs.gnu.org
From: Maxim Cournoyer
Subject: control message for bug #43773

tags 43773 + patch
quit

From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 05 10:06:40 2020
From: Ludovic Courtès
To: Maxim Cournoyer
Subject: Re: bug#43773: [PATCH] offload: Improve load normalization and configurability.
References: <875z7sm2kv.fsf@gmail.com> <20201004032112.5916-1-maxim.cournoyer@gmail.com>
Date: Mon, 05 Oct 2020 16:06:09 +0200
In-Reply-To: <20201004032112.5916-1-maxim.cournoyer@gmail.com> (Maxim Cournoyer's message of "Sat, 3 Oct 2020 23:21:12 -0400")
Message-ID: <87tuv8wywe.fsf@gnu.org>
Cc: 43773@debbugs.gnu.org

Hi,

Maxim Cournoyer skribis:

> Fixes .
>
> The computed normalized load was previously obtained by dividing the load
> average as found in /proc/loadavg by the number of parallel builds defined for
> a build machine.
>
> This normalized didn't allow to compare machines with different number of
                 ^
> cores, as the load average reported by can be as high as the number of cores;
                                        ^
Missing words.

> thus comparing that value to a fixed threshold of 2.0 would mean machines with
> multiple cores were more likely to be flagged as overloaded compared to single
> core machines.
>
> This can be fixed by normalizing using the available number of cores instead
> of the number of parallel jobs.

Indeed, good catch!

> * guix/scripts/offload.scm ()[overload-threshold]: New field.
> (node-load): Modify to return a normalized load value between 0 and 1, taking
> into account the number of cores available.
> (normalized-load): Remove procedure.
> (report-load): New procedure.
> (choose-build-machine): Adjust to use the modified 'node-load' and the new
> 'report-load' and 'build-machine-overload-threshold' procedures.
> (check-machine-status): Adjust.
> * doc/guix.texi (Daemon Offload Setup): Document the offload scheduler and the
> new 'overload-threshold' field.
>
>  doc/guix.texi            | 30 +++++++++++++++++++++-
>  guix/scripts/offload.scm | 54 ++++++++++++++++++++++++----------------
>  2 files changed, 62 insertions(+), 22 deletions(-)

Nice.

[...]

>  (define (node-load node)
> -  "Return the load on NODE. Return +∞ if NODE is misbehaving."
> +  "Return the load on NODE, a normalized value between 0.0 and 1.0. The value
> +is derived from /proc/loadavg and normalized according to the number of
> +logical cores available, to give a rough estimation of CPU usage. Return
> +1.0 (fully loaded) if NODE is misbehaving."
>    (let ((line (inferior-eval '(begin
>                                  (use-modules (ice-9 rdelim))
>                                  (call-with-input-file "/proc/loadavg"
>                                    read-string))
> -                             node)))
> -    (if (eof-object? line)
> -        +inf.0 ;MACHINE does not respond, so assume it is infinitely loaded
> +                             node))
> +        (ncores (inferior-eval '(begin
> +                                  (use-modules (ice-9 threads))
> +                                  (current-processor-count))
> +                               node)))
> +    (if (or (eof-object? line) (eof-object? ncores))
> +        1.0 ;MACHINE does not respond, so assume it is fully loaded

Returning 1.0 now is akin to returning +∞ before, meaning that the
machine will never be picked up, is that right?

What if one sets overload-threshold = 1.0, the machine would still be
picked up, no?

> +            (if (and node
> +                     (or (not threshold) (< load threshold))

I think we can assume that THRESHOLD is always a number, including
possible +inf.0.

Thanks,
Ludo.

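The two questions raised in this review (what a clamped value of 1.0 means, and how it interacts with the threshold test) can be tried out with a minimal sketch in Guile Scheme. This is not the code from guix/scripts/offload.scm: the names normalize-load and under-threshold? are hypothetical, and the /proc/loadavg line and the core count are passed in as plain arguments instead of being fetched through an inferior. Only the arithmetic is meant to mirror the patch.

```scheme
;; Illustrative sketch only.  NORMALIZE-LOAD and UNDER-THRESHOLD? are
;; hypothetical helpers, not procedures from guix/scripts/offload.scm.
(use-modules (ice-9 match))

(define (normalize-load loadavg-line ncores)
  "Return the 1-minute load average found in LOADAVG-LINE divided by NCORES,
clamped to 1.0.  Return 1.0 (fully loaded) when the line cannot be parsed."
  (match (string-tokenize loadavg-line)
    ((one five fifteen . rest)
     (let ((load (string->number one)))
       (if load
           (min 1.0 (/ load ncores))    ;never report more than 1.0
           1.0)))
    (_ 1.0)))

(define (under-threshold? load threshold)
  "Return #t when THRESHOLD is #f (load check disabled) or when LOAD is
strictly below THRESHOLD, as in the comparison used by the patch."
  (or (not threshold) (< load threshold)))

;; A load average of 3.20 on 8 cores normalizes to 0.4, below the default 0.6.
(under-threshold? (normalize-load "3.20 2.10 1.50 2/345 6789" 8) 0.6)

;; A saturated or unresponsive node yields 1.0, which a strict '<' rejects
;; even with a threshold of 1.0; a threshold of #f accepts it.
(under-threshold? 1.0 1.0)
(under-threshold? 1.0 #f)
```

Under these assumptions the first and third calls return #t and the second returns #f, so with a strict comparison a threshold of 1.0 does not let a fully loaded or unresponsive machine through, whereas #f skips the load check altogether.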
From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 05 13:07:29 2020
From: Maxim Cournoyer
To: Ludovic Courtès
Subject: Re: bug#43773: [PATCH] offload: Improve load normalization and configurability.
References: <875z7sm2kv.fsf@gmail.com> <20201004032112.5916-1-maxim.cournoyer@gmail.com> <87tuv8wywe.fsf@gnu.org>
Date: Mon, 05 Oct 2020 13:07:19 -0400
In-Reply-To: <87tuv8wywe.fsf@gnu.org> ("Ludovic Courtès"'s message of "Mon, 05 Oct 2020 16:06:09 +0200")
Message-ID: <87d01wk3eg.fsf@gmail.com>
Cc: 43773@debbugs.gnu.org

Hello,

Ludovic Courtès writes:

> Hi,
>
> Maxim Cournoyer skribis:
>
>> Fixes .
>>
>> The computed normalized load was previously obtained by dividing the load
>> average as found in /proc/loadavg by the number of parallel builds defined for
>> a build machine.
>>
>> This normalized didn't allow to compare machines with different number of
>                 ^
>
>> cores, as the load average reported by can be as high as the number of cores;
>                                        ^
> Missing words.

Good catch, fixed.

[...]

>>  (define (node-load node)
>> -  "Return the load on NODE. Return +∞ if NODE is misbehaving."
>> +  "Return the load on NODE, a normalized value between 0.0 and 1.0. The value
>> +is derived from /proc/loadavg and normalized according to the number of
>> +logical cores available, to give a rough estimation of CPU usage. Return
>> +1.0 (fully loaded) if NODE is misbehaving."
>>    (let ((line (inferior-eval '(begin
>>                                  (use-modules (ice-9 rdelim))
>>                                  (call-with-input-file "/proc/loadavg"
>>                                    read-string))
>> -                             node)))
>> -    (if (eof-object? line)
>> -        +inf.0 ;MACHINE does not respond, so assume it is infinitely loaded
>> +                             node))
>> +        (ncores (inferior-eval '(begin
>> +                                  (use-modules (ice-9 threads))
>> +                                  (current-processor-count))
>> +                               node)))
>> +    (if (or (eof-object? line) (eof-object? ncores))
>> +        1.0 ;MACHINE does not respond, so assume it is fully loaded
>
> Returning 1.0 now is akin to returning +∞ before, meaning that the
> machine will never be picked up, is that right?

Yes, 1.0 has the same meaning as the +inf.0 value previously used
(i.e., the machine is fully loaded).

> What if one sets overload-threshold = 1.0, the machine would still be
> picked up, no?

Currently no, the machine would never be picked up, as the maximum value
returned by node-load is now 1.0, and the comparison is using the
strictly inferior to operator (<). Perhaps this should be made a <=
operator?

>> +            (if (and node
>> +                     (or (not threshold) (< load threshold))
>
> I think we can assume that THRESHOLD is always a number, including
> possible +inf.0.

It's no longer possible for node-load to return +inf.0; it's strictly
bound between 0.0 and 1.0. The check for #f is done because it is
desirable (for semantic clarity) to allow the user to disable
overload-threshold altogether by setting it to #f. This is documented.

Thanks!

Maxim

From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 05 17:00:54 2020
References: <875z7sm2kv.fsf@gmail.com>
In-Reply-To: <875z7sm2kv.fsf@gmail.com>
From: zimoun
Date: Mon, 5 Oct 2020 23:00:35 +0200
Subject: Re: bug#43773: guix offload scheduler/load balancer throttles itself
To: Maxim Cournoyer
Cc: 43773@debbugs.gnu.org

Hi Maxim,

On Sat, 3 Oct 2020 at 05:04, Maxim Cournoyer wrote:

> Idea of an improvement: it should choose the offload machine with the
> less load (already the case, I believe), and not block waiting for the
> load to go down before starting a build.

I have never looked at this, i.e., at how offloading is scheduled. For
example, I do not even know what the current strategy is.

However, is it not reinventing the wheel? I mean, there are "well-known"
job schedulers dealing with various constraints that we could
"reimplement" instead of trying "ours".

Well, my remark is fully naive, I do not know.
:-)

All the best,
simon

From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 05 23:44:34 2020
From: Maxim Cournoyer
To: zimoun
Cc: 43773@debbugs.gnu.org
Subject: Re: bug#43773: guix offload scheduler/load balancer throttles itself
Date: Mon, 05 Oct 2020 23:44:24 -0400
References: <875z7sm2kv.fsf@gmail.com>
In-Reply-To: (zimoun's message of "Mon, 5 Oct 2020 23:00:35 +0200")
Message-ID: <87imbohvc7.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
Content-Type: text/plain

Hi,

zimoun writes:

> Hi Maxim,
>
> On Sat, 3 Oct 2020 at 05:04, Maxim Cournoyer wrote:
>
>> Idea of an improvement: it should choose the offload machine with the
>> least load (already the case, I believe), and not block waiting for the
>> load to go down before starting a build.
>
> I have never looked at this: scheduling an offloading strategy.  For
> example, I do not even know what the current one is.  However, is it
> not reinventing the wheel?  I mean, there are "well-known" job
> schedulers dealing with various constraints that we could
> "reimplement" instead of trying "ours".  Well, my remark is fully
> naive, I do not know. :-)

I tried to get inspiration from Jenkins's sources, but I failed to
locate the relevant code.  The patch posted here ended up fixing the
load normalization and making the overload threshold configurable.
It reuses the existing (very simple) scheduling scheme.  I've briefly
documented it in the patch if you are curious.

Maxim

From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 08 11:05:06 2020
From: Maxim Cournoyer
To: Ludovic Courtès
Cc: 43773-done@debbugs.gnu.org
Subject: Re: bug#43773: [PATCH] offload: Improve load normalization and configurability.
Date: Thu, 08 Oct 2020 11:04:55 -0400
References: <875z7sm2kv.fsf@gmail.com> <20201004032112.5916-1-maxim.cournoyer@gmail.com> <87tuv8wywe.fsf@gnu.org> <87d01wk3eg.fsf@gmail.com>
In-Reply-To: <87d01wk3eg.fsf@gmail.com> (Maxim Cournoyer's message of "Mon, 05 Oct 2020 13:07:19 -0400")
Message-ID: <871ri8vjvs.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
Content-Type: text/plain

Hello,

I went ahead and pushed this change with commit
efbf5fdd01817ea75de369e3dd2761a85f8f7dd5.

Thank you!

Maxim

From unknown Fri Jun 20 05:30:24 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Fri, 06 Nov 2020 12:24:08 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
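
For readers following the review discussion in this thread, the load
normalization and the threshold check can be sketched roughly as below.
This is a minimal, self-contained illustration and not the code from
the pushed commit: the procedure names `normalized-load' and
`acceptable-load?' are hypothetical, the sketch takes the contents of
/proc/loadavg as a string instead of querying a remote node, and the
choice of the 1-minute average is an assumption made for the example.

```scheme
;; Illustrative sketch only.  `normalized-load' and `acceptable-load?' are
;; hypothetical names, not the procedures touched by the patch.
(use-modules (ice-9 match))

(define (normalized-load line ncores)
  "Return a load between 0.0 and 1.0 computed from LINE, the contents of
/proc/loadavg, and NCORES, the number of logical cores.  Return 1.0 when
LINE cannot be parsed, i.e., treat the machine as fully loaded."
  (match (string-tokenize line)
    ((one-minute _ ...)                 ;assumption: use the 1-minute average
     (let ((load (string->number one-minute)))
       (if (and load (> ncores 0))
           (min 1.0 (/ load ncores))    ;cap at 1.0 (fully loaded)
           1.0)))
    (_ 1.0)))

(define (acceptable-load? load threshold)
  "Return #t if LOAD is strictly below THRESHOLD.  A THRESHOLD of #f
disables the check, so any load is accepted."
  (or (not threshold) (< load threshold)))
```

With the strict comparison, (acceptable-load? 1.0 1.0) returns #f, which
is the case raised in the review: a threshold of 1.0 never admits a
machine whose reported load is capped at 1.0, whereas a threshold of #f
admits it.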