2019-10-30T00:52:47 *** boombatower has quit IRC (Quit: Konversation terminated!)
2019-10-30T03:34:06 *** srinidhi has joined #opensuse-admin
2019-10-30T03:46:02 *** okurz has quit IRC (Ping timeout: 240 seconds)
2019-10-30T03:46:11 *** okurz has joined #opensuse-admin
2019-10-30T04:18:05 *** srinidhi has quit IRC (Ping timeout: 250 seconds)
2019-10-30T04:32:22 *** srinidhi has joined #opensuse-admin
2019-10-30T06:00:21 *** srinidhi has quit IRC (Ping timeout: 250 seconds)
2019-10-30T06:11:08 *** srinidhi has joined #opensuse-admin
2019-10-30T06:57:22 *** jadamek has joined #opensuse-admin
2019-10-30T07:18:40 *** moozaad has joined #opensuse-admin
2019-10-30T07:20:03 thanks guys for the investigation
2019-10-30T07:31:03 *** guillaume_g has joined #opensuse-admin
2019-10-30T07:51:14 *** marxin has quit IRC (Quit: Leaving)
2019-10-30T07:52:13 *** marxin has joined #opensuse-admin
2019-10-30T09:44:54 *** klein has quit IRC (Quit: leaving)
2019-10-30T09:45:10 *** klein has joined #opensuse-admin
2019-10-30T09:46:35 *** klein has quit IRC (Client Quit)
2019-10-30T09:47:08 *** klein has joined #opensuse-admin
2019-10-30T09:48:05 *** ldevulder_ has joined #opensuse-admin
2019-10-30T09:51:23 *** ldevulder has quit IRC (Ping timeout: 246 seconds)
2019-10-30T09:58:12 *** ldevulder_ is now known as ldevulder
2019-10-30T10:05:34 ok, redmine is connecting to HAProxy, and HAProxy connects to the MySQL galera cluster
2019-10-30T10:06:47 HAProxy is listening on 3307 (mysql with ssl) but its backend is the galera cluster on 3306, so, if we have an SSL problem, it is happening on HAProxy
2019-10-30T10:07:10 I will keep digging
2019-10-30T10:31:59 ok, there is indeed a problem with the mysql SSL connection. I don't know yet how to fix it, but as a workaround I have set the connection from redmine to the database to be "plain text" (without ssl)
2019-10-30T10:32:07 and progress.o.o is back online
2019-10-30T10:33:00 okurz: can you test it and maybe give me some feedback?
I can see that the page opens, but I have never used redmine in my life
2019-10-30T10:34:27 klein: I could read tickets and update one so that looks good. Thanks so far
2019-10-30T10:34:54 kudos to klein, he investigated (and fixed) this all by himself :-)
2019-10-30T10:35:21 however we need to find a permanent solution, as the current one is only a work-around for the time being
2019-10-30T10:41:04 no, cboltz helped me a lot, he "drew the network" for me, then I just followed the track :-)
2019-10-30T10:41:22 I will try to find out if the issue is with the client certificate or if it is on the galera cluster
2019-10-30T10:41:36 learning how to do a mysql ssl connection right now :-)
2019-10-30T10:42:11 *** Eighth_Doctor has quit IRC (Remote host closed the connection)
2019-10-30T10:42:18 *** okurz[m] has quit IRC (Read error: Connection reset by peer)
2019-10-30T10:42:28 *** henne has quit IRC (Read error: Connection reset by peer)
2019-10-30T10:43:07 kbabioch: so is the problem linked to progress.o.o OS being too old?
2019-10-30T10:47:35 no idea yet. we didn't touch those systems and are surprised that it ever worked the way it is configured
2019-10-30T10:47:39 but needs more investigation
2019-10-30T10:47:42 no time at the moment
2019-10-30T11:08:08 if I test the certificate against the CA file in redmine, it says that the cert is OK
2019-10-30T11:08:32 but, maybe when HAProxy does that, it is NOK
2019-10-30T11:09:16 *** henne has joined #opensuse-admin
2019-10-30T11:10:51 nope, downloaded freeipa-ca.crt from /etc/haproxy and tested the cert file downloaded from redmine, and it also says OK:
2019-10-30T11:10:57 openssl verify -CAfile freeipa-ca.crt redmine.infra.opensuse.org.crt
2019-10-30T11:10:57 redmine.infra.opensuse.org.crt: OK
2019-10-30T11:11:26 need to check if there is any CA on the galera cluster...
2019-10-30T11:16:28 but, my user doesn't exist in galera{1,2,3}.o.o :-(
2019-10-30T11:16:46 okurz: do you have access to those machines?
2019-10-30T11:23:55 *** srinidhi has quit IRC (Ping timeout: 268 seconds)
2019-10-30T11:36:14 *** srinidhi has joined #opensuse-admin
2019-10-30T11:51:03 *** srinidhi has quit IRC (Ping timeout: 264 seconds)
2019-10-30T12:04:26 *** srinidhi has joined #opensuse-admin
2019-10-30T12:14:55 *** Eighth_Doctor has joined #opensuse-admin
2019-10-30T12:15:02 also we don't have those machines in salt and/or in the opensuse team pass ... one more of those "interesting" things about our infrastructure :-/
2019-10-30T13:12:57 *** srinidhi has quit IRC (Read error: Connection reset by peer)
2019-10-30T13:28:30 now we need to find someone that has access to those machines
2019-10-30T13:30:28 *** srinidhi has joined #opensuse-admin
2019-10-30T14:12:22 *** srinidhi has quit IRC (Disconnected by services)
2019-10-30T14:32:22 I checked, I would not know how to login to galera1.i.o.o
2019-10-30T14:45:41 *** boombatower has joined #opensuse-admin
2019-10-30T14:55:43 *** srinidhi has joined #opensuse-admin
2019-10-30T14:57:08 *** srinidhi has quit IRC (Read error: Connection reset by peer)
2019-10-30T14:57:18 *** srinidhi has joined #opensuse-admin
2019-10-30T14:59:24 *** srinidhi has quit IRC (Remote host closed the connection)
2019-10-30T17:06:54 *** cboltz has joined #opensuse-admin
2019-10-30T17:08:09 *** guillaume_g has quit IRC (Quit: Konversation terminated!)
2019-10-30T17:10:06 i've made my way into those machines and re-set the root pw (see opensuse password store) ... also added rklein's key. christian should also have access ...
2019-10-30T17:10:27 also fixed an issue with non-existing swap file ... but those machines are not in a good shape ... not in salt, no updates in a long time, etc.
2019-10-30T17:11:48 also mysql on galera1.infra.opensuse.org is not starting up ... so we have a two node setup basically
2019-10-30T17:48:54 kbabioch: I guess with "those machines" you mean galera[1-3].infra.o.o?
2019-10-30T17:49:21 for the not starting mysql on galera1 -
2019-10-30T17:50:35 IIRC I've watched thomic (via tmux) fixing it once (and probably have a log of it somewhere[tm])
2019-10-30T17:50:59 *** jadamek has quit IRC (Quit: Leaving)
2019-10-30T17:51:42 but that doesn't make me an expert for this setup
2019-10-30T17:51:48 darix knows it much better
2019-10-30T17:52:53 on a more general note - we've had quite some fun, therefore I wonder if we should pick something that doesn't easily break
2019-10-30T17:53:08 (a single mysql node would have a better uptime than this cluster ;-)
2019-10-30T17:58:39 err - actually that tmux session was with tampakrap
2019-10-30T18:39:08 well, we have a similar setup internally and know 1-2 things about galera / mysql ... but still this setup is complex and needs someone to watch after it
2019-10-30T18:39:23 but on the bright side: we have monitoring for it, which even complains about it :-)
2019-10-30T18:39:31 but now we need time to look into it ;-)
2019-10-30T19:37:44 yes, sounds like a good idea
2019-10-30T19:43:11 *** moozaad has quit IRC (Quit: Konversation terminated!)
2019-10-30T19:46:17 I agree that a single mariadb node with proper backups would be better than a cluster like this. Maybe a master>slave setup
2019-10-30T19:46:47 I need to finish some internal SUSE work, but will check that SSL connectivity as soon as I finish the SUSE work
2019-10-30T19:55:04 I've just read the IRC log - disabling SSL is indeed a workaround, but at least progress.o.o works again :-)
2019-10-30T20:16:34 we need more complex systems!!!111elf ... let's put the galera cluster into containers and kubernetes ... and then let's add some load balancing with haproxy, etc. pp.
:-)
2019-10-30T20:17:59 overengineering FTW haha
2019-10-30T20:20:15 that seems to be a pattern in our infrastructure(s)
2019-10-30T20:34:19 you forgot geo-redundancy, so make sure that you also have some instances running on the other side of the world ;-)
2019-10-30T20:36:31 yeah, the K8S machines should be distributed hahaha
2019-10-30T20:40:17 btw: found the issue with ssl in the galera cluster
2019-10-30T20:40:24 the server certificates are expired
2019-10-30T20:40:26 Not After : Oct 29 14:54:54 2019 GMT
2019-10-30T20:41:20 on the redmine side, the certificate is still valid (a few days left)
2019-10-30T20:41:31 so I guess you are talking about the certificate on the galera side?
2019-10-30T20:41:34 yeah, but the client will refuse to connect when the server cert is expired
2019-10-30T20:41:46 the client cert will expire tomorrow
2019-10-30T20:41:50 so all of them have to be replaced
2019-10-30T20:42:02 this will make the "disable ssl" work-around unnecessary
2019-10-30T20:42:13 indeed :-)
2019-10-30T20:44:17 *** dddh has quit IRC (Ping timeout: 240 seconds)
2019-10-30T20:45:44 *** dddh has joined #opensuse-admin
2019-10-30T20:46:44 nice
2019-10-30T20:47:12 can we work on that tomorrow kbabioch ? I would like to learn how to renew those certs on freeipa
2019-10-30T20:51:02 yup, can also do this tomorrow
2019-10-30T21:25:57 okay, so galera1.infra.opensuse.org is in the cluster again ... basically dropped the mysql data and re-synced from the current master
2019-10-30T21:26:10 monitoring is also much more happy now ;-)
2019-10-30T21:26:47 sounds like you used the sledge hammer - but as long as it works... ;-)
2019-10-30T21:27:52 yeah, well ... don't want to process corrupted mysql data ;-)
2019-10-30T21:28:02 especially when we have two working copies around
2019-10-30T21:28:26 I'd guess it was "only" outdated, but I get your point
2019-10-30T21:28:42 no, it wasn't outdated ...
mysql itself wouldn't start up
2019-10-30T21:28:46 something was corrupt there
2019-10-30T21:29:15 sounds interesting[tm]...
2019-10-30T21:30:04 i have a backup of the data, if you want to work on it ;-)
2019-10-30T21:30:23 thanks, but no thanks ;-)
2019-10-30T23:01:17 *** dddh has quit IRC (Remote host closed the connection)
2019-10-30T23:12:44 *** dddh has joined #opensuse-admin
2019-10-30T23:25:24 *** cboltz has quit IRC ()
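
Editor's note: the root cause found at 20:40 was an expired server certificate ("Not After : Oct 29 14:54:54 2019 GMT"). The following is only a minimal sketch of how such a date could be checked in a script; it is not what was run in the channel. The helper name `seconds_until_expiry` is hypothetical, and the sketch assumes that the log timestamps are UTC, which the log itself does not state.

```python
import ssl
from datetime import datetime, timezone


def seconds_until_expiry(not_after: str, now: datetime) -> float:
    """Return the seconds left before `not_after`; negative means expired.

    `not_after` uses the format openssl prints for certificate dates,
    e.g. "Oct 29 14:54:54 2019 GMT". `now` must be timezone-aware.
    """
    # Standard-library helper that parses the certificate date format
    # into seconds since the epoch (UTC).
    expires = ssl.cert_time_to_seconds(not_after)
    return expires - now.timestamp()


# Date quoted in the log for the galera server certificate.
server_not_after = "Oct 29 14:54:54 2019 GMT"
# Time the finding was posted, assuming the log timestamps are UTC.
when_found = datetime(2019, 10, 30, 20, 40, 26, tzinfo=timezone.utc)

remaining = seconds_until_expiry(server_not_after, when_found)
print("expired" if remaining < 0 else f"valid for {remaining / 86400:.1f} more days")
# prints "expired"
```

A check like this, run regularly against every certificate in the chain (server and client), would flag the problem before the expiry date instead of after the outage.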