2022-12-20T00:06:22 hellcp: can you please have a look at https://gitlab.infra.opensuse.org/infra/salt/-/merge_requests/600 (including the CI results)?
2022-12-20T01:43:18 I'm seeing mirrorcache-us-db.infra.opensuse.org bleeding out in Icinga, and a user reporting mirrorcache-us issues.
2022-12-20T01:43:57 It's unpingable!
2022-12-20T01:46:19 Funny thing is that mirrorcache-br is unpingable too.
2022-12-20T01:53:05 A forum user reports issues as well, but with download.o.o
2022-12-20T01:54:07 I don't see mirrorcache-us-db.i.o.o bleeding out anymore.
2022-12-20T01:54:54 Luciano[m], if I browse to the web URL it's all funky
2022-12-20T01:55:16 Well, download.o.o can redirect to mirrorcache-us.
2022-12-20T01:56:13 malcolmlewis: I lied, it's still bleeding...
2022-12-20T02:01:04 Hmm, I see login3, status2, nala2 are also down. Judging by their IPs, they all seem to be in the US too.
2022-12-20T02:09:51 Confirmed, they're all in Provo.
2022-12-20T02:34:05 Retrieving repository 'UpdateSLE' metadata ......................................................................[error] Repository 'UpdateSLE' is invalid. [UpdateSLE|http://download.opensuse.org/update/leap/15.4/sle/] Valid metadata not found at specified URL History: - Timeout exceeded when accessing 'http://download.opensuse.org/update/leap/15.4/sle/repodata/bf64bea6a75659926ef4e6228796191747ab2a1f36cd67de5b4eacb37fcdbc01-deltainfo.xml.gz'.
2022-12-20T02:34:05 - Can't provide ./repodata/bf64bea6a75659926ef4e6228796191747ab2a1f36cd67de5b4eacb37fcdbc01-deltainfo.xml.gz ~
2022-12-20T02:42:29 Yeah, there seems to be some sort of outage in Provo, affecting users in the US and surrounding regions.
2022-12-20T04:02:02 Forums have gone foobar as well.... a bust place...
2022-12-20T04:02:08 Forums have gone foobar as well.... a busy place...
2022-12-20T04:02:44 Luciano[m], ^^
2022-12-20T04:06:08 (╯°□°)╯︵ ┻━┻ What!
2022-12-20T04:13:06 I see a "APACHE_STATUS UNKNOWN - 500 Can't connect to forum.i.o.o:xy (Connection refused)", since Dec 12. Not that new, I use the forums yesterday. 🤔
2022-12-20T04:13:27 s/use/used
2022-12-20T04:13:28 using mirror.mia11.us.leaseweb.net for SLE instead of d.o.o. works
2022-12-20T04:15:51 Yes, you can use mirrors directly.
2022-12-20T04:16:38 browser plays dumb when I try to go to the forums
2022-12-20T04:18:06 * Luciano[m] nods!
2022-12-20T04:27:17 something funky with progress tickets as well...
2022-12-20T04:30:28 I'm not experiencing it.
2022-12-20T04:32:19 I get emails, but don't see them appear in progress like they used to
2022-12-20T04:35:02 I deleted quite some spam today/yesterday. Could that be it?
2022-12-20T04:40:19 Luciano[m], nope, just got two, both about mirrors, not to be seen, plus some account ones that should be there
2022-12-20T04:43:55 Hmm, I can see they are there.
2022-12-20T05:21:55 yeah, in Provo mirrorcache-us cannot ping mirrorcache-us-db: no route to host. Hope somebody will help with it
2022-12-20T05:28:35 anikitin: Not only mirrorcache-us-db. login3, status2, nala2 all went down as well. Somebody should contact the SUSE Office in Provo to know what's going on, I suppose.
2022-12-20T07:45:03 Good morning
2022-12-20T07:45:23 one KVM host is down, I am starting VMs elsewhere
2022-12-20T07:45:41 Provo people will be asleep at this time of day.
2022-12-20T07:46:35 I started all except for mirrorcache-us-db - I cannot find this one in RackTables and none of the xml VM names match. any ideas?
2022-12-20T07:49:03 hm, it was called something like galera-cluster-3 before, ip is 192.168.67.23 , what other info can help?
2022-12-20T07:49:31 ah, galera I can find :)
2022-12-20T07:50:00 lars did rename it for me, maybe he forgot some step
2022-12-20T07:50:02 started as well now
2022-12-20T07:50:44 well as provo-galera3 I can find it in RT as well .. but I will add a note that the hostname is mirrorcache-us-db for future reference
2022-12-20T07:51:00 cool all looks good for now, thx!
2022-12-20T07:51:33 sure and thanks for reporting
2022-12-20T07:54:33 my monitoring says mirrorcache-us.o.o was down from 00:38 to 07:51
2022-12-20T07:59:07 that's long :-(
2022-12-20T08:54:50 forum attempts result in blank browser windows
2022-12-20T09:02:12 more like forums return 502
2022-12-20T09:03:25 Dec 20 09:00:56 discourse01.infra.opensuse.org bundler.ruby2.7-2.3.24[10454]: [10454] ! Unable to load application: Errno::ENOSPC: No space left on device - copy_file_range
2022-12-20T09:03:55 /dev/vda4 8.9G 8.6G 0 100% /
2022-12-20T09:05:03 a-865k: poo#122197
2022-12-20T09:06:59 the server ran out of space
2022-12-20T09:07:03 yeah
2022-12-20T09:09:13 I see it stores uploads and backups in /srv/www/vhosts/discourse/public. it's not the only thing consuming space but just figuring maybe that should be on a different partition
2022-12-20T09:09:55 it is
2022-12-20T09:10:03 look at df
2022-12-20T09:10:05 ah sorry I checked wrong
2022-12-20T10:27:09 Hey, if anyone knows, where can I change an account email address on idp-portal.suse.com ?
2022-12-20T10:27:48 krop: on https://idp-portal.suse.com/univention/self-service/#page=setcontactinformation - and then maybe need to trigger verification as well
2022-12-20T10:31:14 ah... people who created this never heard of UX :) Thanks !
2022-12-20T13:15:39 *** teepee_ is now known as teepee
2022-12-20T14:01:14 hi anikitin, it seems some users can't reach mirrorcache-us.o.o still, can you check if maybe some machine is still down? https://susepaste.org/22910215
2022-12-20T14:04:57 acidsys, it again cannot reach mirrorcache-us-db
2022-12-20T14:05:12 Destination Host Unreachable
2022-12-20T14:07:06 i cannot ssh there as well
2022-12-20T14:07:14 thanks .. seems bryce1 is now down as well
2022-12-20T14:07:34 so I will start the VMs on bryce3, the last remaining .. US guys should wake up soon to check
2022-12-20T14:08:04 thx, I wonder how easy it is to identify the cause? Like overload or something unstable?
2022-12-20T14:08:17 snow....
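The `df` line pasted at 09:03:55 shows the root filesystem of discourse01 at 100%. As a minimal sketch of how such a line could be turned into an Icinga-style result, assuming a plain `df` column layout: the function names and the 90% default threshold are hypothetical and not taken from the openSUSE monitoring setup; only the exit code 2 for CRITICAL follows the usual Nagios/Icinga plugin convention.

```bash
# Hypothetical helper: print the Use% column (without the % sign) from one
# line of df output: filesystem size used avail use% mountpoint.
usage_percent_from_df_line() {
    printf '%s\n' "$1" | awk '{ sub(/%$/, "", $5); print $5 }'
}

# Hypothetical check: prints a status line and returns 0 (OK) or
# 2 (CRITICAL) depending on the threshold (assumed default: 90).
check_disk_line() {
    local line="$1" threshold="${2:-90}" pct
    pct=$(usage_percent_from_df_line "$line")
    if [ "$pct" -ge "$threshold" ]; then
        echo "CRITICAL - ${pct}% used"
        return 2
    fi
    echo "OK - ${pct}% used"
    return 0
}
```

It could be called as `check_disk_line "$(df -P / | tail -n 1)"`; feeding it the line from the log reports CRITICAL.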
2022-12-20T14:08:28 thanks acidsys, hope it's not the earthquake in N. California
2022-12-20T14:08:35 no idea, bmwiedemann[m] tried to reach the console but it seems down as well, so we have to wait for onsite people to check
2022-12-20T14:08:44 nope that was on the coast
2022-12-20T14:09:12 acidsys, they're on mountain time, another hour maybe
2022-12-20T14:10:24 ok .. mirrorcache-us seems to be working again for now .. let's see if bryce3 will go down as well :b
2022-12-20T14:10:35 yes, mirrorcache-us is working again, thx
2022-12-20T14:10:37 morte_ you can try your zypper foo again
2022-12-20T14:11:18 acidsys, working here fine now
2022-12-20T14:11:47 back up for me as well. thanks folks
2022-12-20T14:12:36 thanks for confirming
2022-12-20T14:14:24 acidsys, and more tickets arrived...
2022-12-20T14:16:14 will check soon
2022-12-20T14:17:51 acidsys: working fine, thanks for the help
2022-12-20T14:33:18 I revived the console, but it won't connect to IPMI, so no luck there. well it's almost morning there anyways
2022-12-20T14:36:24 zypper refused to pull openSUSE-releas* a little bit ago. I had to fetch them from mirror.mia11.us.leaseweb.net to finish dup.
2022-12-20T18:47:41 I wonder who's taking care of Cachet - the status.o.o system - now. I'd like, very much, to have an account there to update status when needed. Witnessed several occasions where a simple red flag in the site would calm the nerves of many people, rather than have problems and seeing all lights green, in the site.
2022-12-20T18:48:05 s/have/having
2022-12-20T18:57:52 Luciano[m]: I can make you an account
2022-12-20T18:58:13 but I am not really the one "taking care" of it :D
2022-12-20T18:58:31 cboltz created my account originally
2022-12-20T18:59:01 Well, I appreciate it acidsys nonetheless. Works for me :^P
2022-12-20T18:59:50 My email: luc14n0 opensuse org
2022-12-20T19:03:05 maybe I was overenthusiastic there - my "Teams" page does not have an "Add" button .. maybe I confused the view with my own Cachet
2022-12-20T19:03:14 sorry; so christian I'm sure will do it ;)
2022-12-20T19:05:33 acidsys: Were you able to reach the Cachet login page while Provo was down?
2022-12-20T19:10:58 No sweat, thanks anyway.
2022-12-20T19:11:43 mdogg: I did not check, but there is a status2 VM in Provo which was down as well,
2022-12-20T19:12:19 unrelated, colleague in Provo brought all hypervisors back online
2022-12-20T19:13:10 I was able to, even though I do not have any credentials there, since status1 is in Germany.
2022-12-20T19:14:11 I could have done it, should have pinged me >:D
2022-12-20T19:16:23 * acidsys adds hooking cachet into icinga to unwritten backlog
2022-12-20T19:17:34 create a ticket
2022-12-20T19:18:18 right I should do that
2022-12-20T19:20:07 Jacob Michalskie: Who's the one who needs a ticket, me or acidsys ? Just to clarify.
2022-12-20T19:20:39 the one who thinks there's an unwritten backlog >:D
2022-12-20T19:20:46 there's a very long written backlog instead
2022-12-20T19:23:17 Alright, un-threaded discussions and the confusion it creates...
2022-12-20T19:23:28 s/it/they
2022-12-20T19:31:52 Luciano[m]: invite for status.o.o sent
2022-12-20T19:33:01 (AFAIK there aren't separate permission levels, therefore I'm surprised that acidsys didn't see the invite or add button)
2022-12-20T19:34:18 do we have access to the vm behind it? it would be nice to update it at some point
2022-12-20T19:34:42 I have, and can add ssh keys
2022-12-20T19:35:15 (status.o.o is not connected to salt, which means logging in as root only)
2022-12-20T19:35:25 * Luciano[m] goes take a look around for the invitation!
2022-12-20T19:35:53 Oh, we need a bit of salt there.
2022-12-20T19:36:36 the idea was not to break it if we completely break everything in salt, but I tend to think that salting it isn't too risky
2022-12-20T19:37:46 how does status2 play into this mix?
2022-12-20T19:38:13 Hmm, what's the backup plan if salt goes foobar BTW?
2022-12-20T19:38:18 it's another manually set up VM (in Provo), and there's a cronjob that syncs the database from status1 to status2 once per day
2022-12-20T19:38:35 aha interesting
2022-12-20T19:38:56 Luciano[m]: the saltmaster is completely salted (at least AFAIK), so setting up a new one is easy
2022-12-20T19:40:06 Very nice!
2022-12-20T19:40:39 (that still leaves the annoyance of re-connecting all the minions because the master key isn't backed up in salt, but - there's always something... ;-)
2022-12-20T19:41:31 the whole salt setup is meant to help if everything goes wrong >:D
2022-12-20T19:42:04 Schrödinger's salt setup: you find out if all your salt code works only if everything goes wrong.
2022-12-20T19:43:41 * Luciano[m] laughs!
2022-12-20T19:47:16 it would be nice to go around infra to do testhighstate and sync up the state from machines
2022-12-20T19:48:37 yeah, go through all of them once and fix the broken ones
2022-12-20T19:48:46 afterwards we can do scheduled state applies
2022-12-20T19:50:40 we should also add some automated updates, would be nice to integrate os-update into some shared bit in salt repo
2022-12-20T19:51:46 yep .. it should just report somewhere (icinga or wherever is convenient) if any fail their automatic highstate or os update
2022-12-20T19:54:37 I should create a ticket for migrating the rest of the services to something that isn't login proxies
2022-12-20T19:55:09 it shouldn't be hard to get a list from the dns
2022-12-20T19:56:42 on another note, could we make dale.i.o.o an alias for chip.i.o.o? I have to think twice every time I need to log in there which one of the chipmunks it was
2022-12-20T19:58:31 I'm guessing we don't have a secret Nextcloud instance, it would be nice to have a Kanban as well.
2022-12-20T19:58:50 there's kanban in redmine
2022-12-20T19:58:54 there's kanban in pagure
2022-12-20T19:59:47 Pagure's existence confuses me.
2022-12-20T20:00:20 it was meant as a replacement for redmine and gitlab
2022-12-20T20:00:36 gitea was deployed like 2 years after pagure was deployed
2022-12-20T20:00:48 so you know
2022-12-20T20:01:17 * Luciano[m] nods!
2022-12-20T20:01:38 we would have happily accommodated the needs there, as usual though there is hardly any comms between suse and heroes when it comes to infra under opensuse.org domains
2022-12-20T20:01:43 * Luciano[m] found it, it's the Agile plugin!
2022-12-20T20:02:43 Yeah, maintaining all of those "replicas" are tough.
2022-12-20T20:04:09 s/are/is
2022-12-20T20:06:17 migrating away from gitlab would be great considering it's broken all the time
2022-12-20T20:08:06 funnily I heard claims that the SUSE gitlab (which runs with the same packages) is _not_ broken
2022-12-20T20:08:44 (╯°□°)╯︵ ┻━┻ What!
2022-12-20T20:10:51 well same packages but not the same version
2022-12-20T20:11:06 and gitlab-ce version updates often come with surprises
2022-12-20T20:11:47 I use that gitlab all the time and it does sometimes have major issues too
2022-12-20T20:11:55 ci just did not work whatsoever for two weeks
2022-12-20T20:12:00 gitlab-ce-15.6.0+git0.7f1a7c62d-lp154.351.2.x86_64 ("fine") vs gitlab-ce-15.6.2+git0.2d7e47019-355.6.x86_64 (broken)
2022-12-20T20:12:11 ci is a different issue because of workers
2022-12-20T20:12:19 well, sure
2022-12-20T20:12:37 still, it's frustrating
2022-12-20T20:12:45 yeah
2022-12-20T20:14:24 this did not work out, huh? https://languages.opensuse.org/Main_Page
2022-12-20T20:14:32 considering last update of the main page was in 2012?
2022-12-20T20:15:01 the non-English wikis are horribly out of sync
2022-12-20T20:15:20 the German one, which is the only one I can understand, is a time machine to SLE 11
2022-12-20T20:15:21 indeed
2022-12-20T20:18:55 hellcp: I noticed you updated MR 577 today with more changes than I'd sum up under "Use pgbouncer again"
2022-12-20T20:19:27 since the original MR is quite old, and IIRC there were some issues left with pgbouncer - do you want it merged, or are there reasons to wait?
2022-12-20T20:19:33 that's true, it's probably closer to "sync up with the matrix machine"
2022-12-20T20:21:35 I just updated the title ;-)
2022-12-20T20:21:50 acidsys: SLE 11, when the Earth was thought to be a box...
2022-12-20T20:22:14 *** teepee_ is now known as teepee
2022-12-20T20:23:30 hellcp: so - to merge or not to merge? (I didn't notice obvious errors, but I'm not really familiar with matrix)
2022-12-20T20:24:05 it's fine to merge imo
2022-12-20T20:25:16 Out of curiosity, OBS groups - including adding people to them - are all done by OBS admins?
2022-12-20T20:25:53 ok, merged
2022-12-20T20:26:58 right, because it needs permissions to create these OBS groups (not sure if adding someone to the group can be done by a group member)
2022-12-20T20:28:11 the latter works (groups have sort of "group maintainers" which can manage other users in the group)
2022-12-20T20:28:25 but the initial maintainer needs to be assigned by an OBS admin when creating the group
2022-12-20T20:28:55 Ah, that makes sense from what I see in OBS.
2022-12-20T20:31:53 https://progress.opensuse.org/issues/122254
2022-12-20T20:32:42 I do have capabilities to create and add people to groups actually, I just don't know if somebody won't be offended I stepped in their shit
2022-12-20T20:33:35 I wouldn't expect people to be offended if you do their work ;-)
2022-12-20T20:34:09 you know, I don't know what policies there are behind group creation
2022-12-20T20:35:03 you may have questions about the ticket I created, like for example why is software-o-o, something that doesn't have any login capabilities, behind a login proxy
2022-12-20T20:35:16 and the answer to that is: who knows tbh
2022-12-20T20:36:23 keep in mind some of those are behind obs-login proxy, which is even more of a mystery
2022-12-20T20:36:51 it's not like any of us mere mortals have access to login proxies anyway
2022-12-20T20:36:53 Time knows it, but it won't tell us.
2022-12-20T20:37:31 I do have to set up a tsp-test instance of tsp for testing migrating to openidc
2022-12-20T20:37:46 I should probably request secrets for that
2022-12-20T20:40:31 And talking about openID, Jenkins openID plugin is broken - and left for adoption -, gonna try some workarounds, though.
2022-12-20T21:00:58 Now that the tests went green (thanks to fixes in various nginx-related pillars), does someone want to review https://gitlab.infra.opensuse.org/infra/salt/-/merge_requests/600 ?
2022-12-20T21:10:58 cboltz: Is there any reason to use Bash in the script? I find string manipulations to be faster.
2022-12-20T21:11:10 s/to use/to not use
2022-12-20T21:12:25 I mean, for such cases (the script uses a Bash shebang).
2022-12-20T21:13:37 no strong reason, I just went for the easy way, and I'd guess that you can't measure a real difference in this case (the slow part is creating and checking the nginx config)
2022-12-20T21:16:13 Oh, OK. I was expecting to have a lot of files in that directory.
2022-12-20T21:18:11 I don't have exact numbers because in some pillar files there's a for loop, but I'd guess maybe 40 files in total for all roles
2022-12-20T21:24:51 Pff! That's a small number indeed. Well, let me test that shell construct here.
2022-12-20T21:28:32 if I click any MRs older than 600 I get a 500. I suppose that's a new feature as well?
2022-12-20T21:29:33 indeed, known issue
2022-12-20T21:43:31 "And talking about openID..." <- Why not use oidc then?
2022-12-20T21:45:16 Hmm, I'm gonna take a look at that later.
2022-12-20T22:08:30 https://gitlab.infra.opensuse.org/infra/salt/-/merge_requests/602 another mr with paste changes
2022-12-20T22:12:55 I guess we should give profile.pagure.redis a less service-specific name ;-) but that's something for a later MR
2022-12-20T22:15:25 indeed
2022-12-20T22:15:26 merged
2022-12-20T22:16:09 we are using redis on other vms too, but I don't think all of them have been done with salt
2022-12-20T22:16:24 discourse comes to mind
2022-12-20T22:16:28 * Luciano[m] found some oddity with !600 and is testing it.
2022-12-20T22:40:01 cboltz: Confirmed, I'm getting "rolestatus=1" even though I have .conf files in my testing dir. That `continue` seems to be messing up with the logic. If I were to make a change to do what you envisioned, I'd use a little bit different logic there.
2022-12-20T22:43:11 I'd use a Bash array to store the config file names and instead of using "nginx -tq || rolestatus=1", a 'if $(nginx -tq) && test -n "$nginx_config_files"; then', so the test fails quicker if there's no files.
2022-12-20T22:44:05 s/files/.*conf files
2022-12-20T22:44:53 Argh, that's right, "no files"!
2022-12-20T22:45:57 hmm, that's interesting - I tested with both "wrong" filenames (before today's merges) and "right" filenames (current production), and rolestatus (and also the final result) was always correct
2022-12-20T22:46:47 Yeah, that's interesting. Let me double-double check.
2022-12-20T22:47:16 "fail quicker" sounds good, but please make sure that it still runs all tests (well, it won't "see" the non-*.conf files) - "wasting" 5 seconds to get all error messages at once is worth the time
2022-12-20T22:47:41 also note that we have roles that include nginx, but don't create a file in vhosts.d
2022-12-20T22:48:22 (probably a sign that these roles need more salting ;-)
2022-12-20T22:49:25 Umhm, noted!
2022-12-20T23:10:16 acidsys: what is the client_id of the oidc setup for paste?
2022-12-20T23:11:21 oh, it's paste-test.opensuse.org
2022-12-20T23:11:31 it seems that the callback url is wrong
2022-12-20T23:16:35 oh well, in any case, it works
2022-12-20T23:25:51 (except login)
2022-12-20T23:37:44 Having troubles with login too? 😝
2022-12-20T23:43:58 well, I don't have acidsys around to tell me what the settings are
2022-12-20T23:44:13 and the file with info only had the secret so I had to guess
2022-12-20T23:45:42 Right, let me play with Jenkins now and see whether oidc plugin is in a better shape.
2022-12-20T23:47:49 BTW cboltz, I retested again and it succeeded. It was some dangling files that I don't know how they got created in my test environment. So the issue was between the screen and the chair :^P
2022-12-20T23:50:31 I could improve a thing or two in that script, though. Let me add it to my To Do.
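The two wishes from the !600 discussion (notice roles without any *.conf files, but still run every test so all error messages show up at once) can be combined. The following is only a sketch and not the script from the merge request: the function names, the `role:directory` argument format and the `NGINX_TEST_CMD` variable are hypothetical, and it assumes that the role's nginx configuration has already been generated before the check runs. `nginx -t` tests the configuration and `-q` suppresses non-error output.

```bash
# Assumption: command that validates the currently generated nginx config.
# Overridable so the logic can be exercised without nginx installed.
NGINX_TEST_CMD=${NGINX_TEST_CMD:-"nginx -tq"}

# Count the *.conf files in a directory (0 if there are none).
count_conf_files() {
    local dir="$1" count=0 f
    for f in "$dir"/*.conf; do
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Test one role. A role without *.conf files is only reported, not failed,
# because some roles include nginx without creating a vhosts.d file.
check_role() {
    local role="$1" dir="$2" n
    n=$(count_conf_files "$dir")
    if [ "$n" -eq 0 ]; then
        echo "$role: note: no *.conf files in $dir"
    fi
    if $NGINX_TEST_CMD; then
        echo "$role: ok"
        return 0
    fi
    echo "$role: nginx config test failed"
    return 1
}

# Run all roles ("role:directory" arguments) and remember failures
# instead of stopping at the first one.
check_all_roles() {
    local failed=0 pair
    for pair in "$@"; do
        check_role "${pair%%:*}" "${pair#*:}" || failed=1
    done
    return "$failed"
}
```

Keeping a single `failed` flag and returning it at the end is what preserves the "get all error messages at once" behaviour; a fail-fast variant would return from inside the loop instead.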