2020-04-13T00:37:59 -heroes-bot- PROBLEM: MySQL WSREP recv on galera3.infra.opensuse.org - CRIT wsrep_local_recv_queue_avg = 1.100019 ; See https://monitor.opensuse.org/icinga/cgi-bin/extinfo.cgi?type=2&host=galera3.infra.opensuse.org&service=MySQL%20WSREP%20recv
2020-04-13T02:43:46 *** okurz_ is now known as okurz
2020-04-13T05:57:18 -heroes-bot- PROBLEM: MySQL backup on mybackup.infra.opensuse.org - CRITICAL: No results from backup job ; See https://monitor.opensuse.org/icinga/cgi-bin/extinfo.cgi?type=2&host=mybackup.infra.opensuse.org&service=MySQL%20backup
2020-04-13T10:49:47 cboltz: nice work on the pipeline... I was not able to make it run without disabling everything, so I have closed my MR
2020-04-13T10:50:02 working now on the sudo problem
2020-04-13T10:50:19 :-)
2020-04-13T12:03:37 looks like someone accidentally deleted https://progress.opensuse.org/issues/65549 (about deploy_job / sudo)
2020-04-13T12:04:20 klein: should I re-create it, or can/will you fix the sudo issue without having that ticket? ;-)
2020-04-13T12:34:54 -heroes-bot- PROBLEM: PSQL locks on mirrordb1.infra.opensuse.org - POSTGRES_LOCKS CRITICAL: DB postgres total locks: 55 * total waiting locks: 2 ; See https://monitor.opensuse.org/icinga/cgi-bin/extinfo.cgi?type=2&host=mirrordb1.infra.opensuse.org&service=PSQL%20locks
2020-04-13T12:44:54 -heroes-bot- RECOVERY: PSQL locks on mirrordb1.infra.opensuse.org - POSTGRES_LOCKS OK: DB postgres total=42 ; See https://monitor.opensuse.org/icinga/cgi-bin/extinfo.cgi?type=2&host=mirrordb1.infra.opensuse.org&service=PSQL%20locks
2020-04-13T13:16:25 -heroes-bot- PROBLEM: NRPE on svn.infra.opensuse.org - connect to address 192.168.47.25 port 5666: No route to host ; See https://monitor.opensuse.org/icinga/cgi-bin/extinfo.cgi?type=2&host=svn.infra.opensuse.org&service=NRPE
2020-04-13T13:56:23 klein: ticket re-created as https://progress.opensuse.org/issues/65585
2020-04-13T13:57:07 (without your comment about the revoked MR)
2020-04-13T14:46:27 pjessen: kl_eisbaer: which one of you would be better to bother about the pgsql server?
2020-04-13T15:21:12 cboltz: sorry, I was trying to delete only my comment :-(
2020-04-13T15:28:45 no problem, that's why we have notification/backup mails ;-)
2020-04-13T15:37:53 and, about the sudo problem... I bet the salt-call happens inside the container, so the only thing we need is to install sudo inside the container...
2020-04-13T15:38:22 what have you done on runner1?
2020-04-13T15:39:41 ohh... I see you gave sudo everything to gitlab-runner... I don't like that
2020-04-13T15:42:19 I didn't add the sudo rules - wild guess: maybe Lars did?
2020-04-13T15:42:58 actually I can't even log in to the gitlab-runner* VMs, they probably never got the initial highstate
2020-04-13T15:43:09 (but I can access them via the salt "backdoor" ;-)
2020-04-13T15:44:27 the only thing I tried was (via salt cmd.run) a (IIRC) zypper in -f sudo on gitlab-runner2, but that didn't fix the "sudo: command not found"
2020-04-13T15:45:15 (+ some read-only actions like rpm -q sudo and rpm -V sudo via salt on both gitlab-runners)
2020-04-13T15:45:34 yes... I tried that too, the problem is that the salt-call is executed inside the running container
2020-04-13T15:46:45 are you sure? .gitlab-ci.yml has "tags: shell" for deploy_job - and "tags: docker" for all the tests
2020-04-13T15:48:28 hummm... didn't notice that
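(For context: in .gitlab-ci.yml, a job's "tags" select which registered runner picks it up. A minimal sketch of the routing described above - only the tag values for deploy_job and the tests are taken from the conversation; the job name, image and script lines are illustrative assumptions:)

    # sketch, NOT the actual .gitlab-ci.yml: only the tags ("docker" for
    # tests, "shell" for deploy_job) are confirmed in the log above
    test_states:                  # hypothetical test job name
      tags:
        - docker                  # picked up by a runner registered with the "docker" tag
      image: opensuse/leap        # assumed container image
      script:
        - ./prepare_test_env.sh   # mentioned above; runs for the tests only
        - salt-call --local state.show_highstate   # illustrative test step

    deploy_job:
      tags:
        - shell                   # picked up by a runner registered with the "shell" tag
      script:
        - sudo salt-call ...      # the step failing with "sudo: command not found"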
2020-04-13T15:49:15 but, if you have installed sudo and still get the "command not found", that points me to the container
2020-04-13T15:50:06 https://gitlab.infra.opensuse.org/infra/salt/-/merge_requests/376
2020-04-13T15:50:06 sudo was installed before (with the locale files missing)
2020-04-13T15:50:17 I did the zypper in -f in the hope of "repairing" it, but no change
2020-04-13T15:50:29 let's see how this works
2020-04-13T15:51:23 yes, it's worth a try - even if I wonder why it only fails on gitlab-runner2...
2020-04-13T15:54:04 silly question: is the handling of "tags: shell" / "tags: docker" something that gets configured on gitlab-runner*? If yes, did you compare the config for this on the two servers?
2020-04-13T15:55:04 lol, I was doing that right now
2020-04-13T15:55:34 another sidenote - my understanding of .gitlab-ci.yml is that prepare_test_env.sh does _not_ run for deploy_job
2020-04-13T15:56:00 (which makes it less likely that !376 will help)
2020-04-13T15:56:23 and they are indeed different: gitlab-runner1 has only a "shell" runner, and gitlab-runner2 has only a "docker" runner
2020-04-13T15:57:03 my guess is that, because of that, everything runs inside the container created by the test, and therefore it will reuse the installed sudo
2020-04-13T15:58:06 so - configure both a shell and a docker runner on both servers?
2020-04-13T15:58:56 or always prefer to have only the docker runner, and ask everyone to install deps inside their transient container instead of having a lot of unnecessary things on the runner OS?
2020-04-13T16:00:33 might be an option, but then you'll need a way to connect to salt from inside a container - which sounds equally scary ;-)
2020-04-13T16:02:25 actually, the salt-call is executed using an API call with a password (that is in the CI variables/secrets)
2020-04-13T16:03:10 yes, I know - but to be able to do that, the minion needs to be registered with the saltmaster, and that's difficult for a freshly created container ;-)
2020-04-13T16:03:56 I don't think we need to be a registered minion to be able to run salt-call
2020-04-13T16:04:57 I am wondering how this was running with the containers on CaaSP
2020-04-13T16:06:07 no idea, my _guess_ is that deploy_job was running directly on the server, not in a container
2020-04-13T16:06:47 on the CaaSP machines? I don't believe so... I will try to find a way to check...
2020-04-13T16:10:53 cboltz: let's merge my MR and see it in action?
2020-04-13T16:11:34 I have some doubts that it will work, but it's worth a try ;-) (and in the worst case it's easy to revert)
2020-04-13T16:12:32 as a sidenote - your guess that deploy_job was run with docker on gitlab-runner2 seems to be right: https://gitlab.infra.opensuse.org/infra/salt/-/jobs/8171 says "Using Docker executor"
2020-04-13T16:13:01 while https://gitlab.infra.opensuse.org/infra/salt/-/jobs/8172 on gitlab-runner1 says "Using Shell executor"
2020-04-13T16:21:06 yup
2020-04-13T16:21:32 now I just need to know if the salt-call will go to the correct master
2020-04-13T16:23:22 that's easy to find out - systemctl stop salt-master on minnie for a few minutes, and check if the CI complains ;-)
2020-04-13T16:25:01 for gitlab-runner1 with the shell runner, I'm quite sure it will use minnie
2020-04-13T16:25:37 for gitlab-runner2 and its docker runner, I have no idea
2020-04-13T16:27:20 (I'd _guess_ that it won't be able to reach minnie via the minion running outside of the container)
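(Sidenote for readers: the executor type is a per-runner setting in /etc/gitlab-runner/config.toml on each machine, which matches the "Using Docker executor" / "Using Shell executor" job output quoted above. A sketch of what the two configs apparently looked like - tokens, names and the default image are placeholders; only the executor values are confirmed by the log:)

    # /etc/gitlab-runner/config.toml on gitlab-runner1 (sketch, placeholders)
    [[runners]]
      name = "gitlab-runner1"
      url = "https://gitlab.infra.opensuse.org/"
      token = "REDACTED"
      executor = "shell"    # jobs run directly on the host, so the host's
                            # sudo and salt setup are available

    # /etc/gitlab-runner/config.toml on gitlab-runner2 (sketch, placeholders)
    [[runners]]
      name = "gitlab-runner2"
      url = "https://gitlab.infra.opensuse.org/"
      token = "REDACTED"
      executor = "docker"   # every job runs in a transient container, so
                            # anything like sudo must exist inside the image
      [runners.docker]
        image = "opensuse/leap"   # assumed default image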
2020-04-13T17:36:34 interesting... it failed, even with sudo installed (and configured to allow salt-call for gitlab-runner)...
2020-04-13T17:36:57 but anyway, it fails with the same "command not found" error; I will check that again later, I need to finish other tasks here
2020-04-13T17:44:26 I'd guess the best solution is to have a shell runner on both servers
2020-04-13T17:46:42 I will try exactly that
2020-04-13T17:47:16 :-)
2020-04-13T20:00:08 -heroes-bot- PROBLEM: HAProxy on provo-proxy1.infra.opensuse.org - HAPROXY CRITICAL - Active service debuginfod is DOWN on debuginfod proxy ! ; See https://monitor.opensuse.org/icinga/cgi-bin/extinfo.cgi?type=2&host=provo-proxy1.infra.opensuse.org&service=HAProxy
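(The fix settled on above, a shell runner on both servers, would boil down to registering an additional shell-executor runner on gitlab-runner2, roughly as below. This is a sketch using the standard gitlab-runner CLI, not the command actually run; the registration token and description are placeholders:)

    # hypothetical: register a second, shell-executor runner on
    # gitlab-runner2, so that jobs tagged "shell" (like deploy_job)
    # run directly on the host instead of inside a container
    gitlab-runner register \
      --non-interactive \
      --url "https://gitlab.infra.opensuse.org/" \
      --registration-token "REDACTED" \
      --executor "shell" \
      --tag-list "shell" \
      --description "gitlab-runner2 shell"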