2025-12-07T02:23:52 *** teepee_ is now known as teepee
2025-12-07T14:06:17 *** teepee_ is now known as teepee
2025-12-07T17:24:58 I added the apachectl patch for Leap 16 back to our Salt package
2025-12-07T17:25:55 thanks!
2025-12-07T19:19:15 CI still fails, but now apparently while loading functions from os_pillar
2025-12-07T19:27:08 so we are making progress...
2025-12-07T19:27:20 ModuleNotFoundError: No module named 'pkg_resources'
2025-12-07T19:28:00 that one's a DEBUG message, it should be ignorable (we do not use the `pip` module which needs this)
2025-12-07T19:28:55 I'm a bit surprised that something like this is "only" DEBUG, but - confirmed
2025-12-07T19:29:15 it's an optional feature
2025-12-07T19:39:04 there's another DEBUG right above the actual ERROR:
2025-12-07T19:39:11 2025-12-07 17:28:10,839 [salt.utils.lazy :300 ][DEBUG ][1523] Could not LazyLoad os_pillar.get_host_ip6: 'os_pillar.get_host_ip6' is not available.
2025-12-07T19:40:22 near the end of the log, there is
2025-12-07T19:40:25 2025-12-07 17:28:11,910 [salt.utils.event :300 ][DEBUG ][2030] Sending event: tag = salt/run/20251207172811394340/ret; data = {'fun': 'runner.saltutil.sync_modules', 'jid': '20251207172811394340', 'user': 'salt', 'fun_args': [], '_stamp': '2025-12-07T17:28:11.910248', 'return': ['modules.compose', 'modules.os_network', 'modules.os_pillar', 'modules.podmanmod', 'modules.suse_sysconfig', 'modules.susejunos', 'modules.user_service', 'modules.
2025-12-07T19:40:27 vmhelper'], 'success': True}
2025-12-07T19:40:34 and a few lines later
2025-12-07T19:40:36 2025-12-07 17:28:11,918 [salt.runner :300 ][DEBUG ][2030] Runner return: ['modules.compose', 'modules.os_network', 'modules.os_pillar', 'modules.podmanmod', 'modules.suse_sysconfig', 'modules.susejunos', 'modules.user_service', 'modules.vmhelper']
2025-12-07T19:40:55 so - could it be that salt/_modules/os_pillar.py gets loaded too late?
2025-12-07T20:02:39 possible, but not sure, will try with refresh=false, though it was not needed in the past
2025-12-07T20:05:04 did not help
2025-12-07T20:07:32 the lines I pasted seem to be from sync_modules - can we run that earlier (and maybe add a sleep 5 afterwards for testing)?
2025-12-07T20:07:47 when is earlier ?
2025-12-07T20:08:54 right now it happens a second after the error, so two or three seconds earlier might help
2025-12-07T20:09:49 the script is not time aware, it goes over each action in series
2025-12-07T20:10:46 wild guess: maybe sync_modules takes longer now (and/or runs in background) - "sleep 5" might be worth a try, even if I don't really like it
2025-12-07T20:11:26 afaik the sync_* functions should be synchronous, only the refresh_* ones are async (and hence also have an optional `wait` parameter)
2025-12-07T20:11:42 but sure
2025-12-07T20:12:19 well, "should" - you know theory and practise ;-)
2025-12-07T20:16:02 no luck. attempts are in !2674
2025-12-07T20:17:18 might need to play with it locally and go one by one
2025-12-07T20:21:01 s/wait/sleep/ please
2025-12-07T20:21:21 (according to the manpage, wait expects a pid as parameter, not a number of seconds)
2025-12-07T20:22:18 woopsie
2025-12-07T20:32:21 will be back later, feel free to push to my branch if you want to play more ^^
2025-12-07T20:35:05 if I only knew what to change...
2025-12-07T20:35:46 the log still looks similar, sync_modules (at least the one mentioning os_pillar) is still done after the error
2025-12-07T20:36:05 one thing I wonder - shortly before that sync_modules, there is
2025-12-07T20:36:07 2025-12-07 20:24:36,846 [salt.utils.extmods:300 ][INFO ][2029] Copying '/var/cache/salt/master/files/base/_modules/os_pillar.py' to '/var/cache/salt/master/extmods/modules/os_pillar.py'
2025-12-07T20:36:24 is this also something that should happen earlier?
2025-12-07T21:43:57 it should put a copy in /var/cache/salt/master/files/production/_modules and in /var/cache/salt/master/extmods/modules, the salt.utils.extmod is probably only about the latter, there might not be log output for files/
2025-12-07T21:44:42 it needs to be installed before the pillar is rendered for the first time, and it appears something triggers the pillar to be rendered before sync_modules
2025-12-07T21:48:32 however relevant for this is the runner (master) sync not the minion sync, so I just added the sleep there as well now for good measure
2025-12-07T22:04:52 would some { date ; echo "=== now running $command ===" ; } >> /var/log/salt/master.log at various places help to find out when rendering the pillar gets triggered?
2025-12-07T22:17:49 I'll try ;-)
2025-12-07T22:28:13 the error is at 22:33:25,603
2025-12-07T22:28:26 after the 2025-12-07 22:23:25,775 log line, the first debugging gets logged:
2025-12-07T22:28:31 ==== Sun Dec 7 22:23:25 UTC 2025 before runner_run saltutil.sync_modules ====
2025-12-07T22:29:03 so the pillar rendering most likely happens shortly above the first log_cmd call
2025-12-07T22:35:55 (but since the times are so close, in theory things could get mixed up because two different processes (salt and bash) write to the salt master log - if one of them has a delayed write...)
2025-12-07T22:54:53 more debugging added, and a sleep after systemctl start salt-minion
2025-12-07T22:55:24 it looks like the error happens while the minion gets started
2025-12-07T22:56:55 and it errors again during the until timeout ... salt-call loop
2025-12-07T22:57:00 interesting, we could try to move the runner sync to before starting the minion
2025-12-07T22:57:39 I'll mv $0 /dev/bed and hand over to you ;-)
2025-12-07T22:57:55 good luck and good night
2025-12-07T22:57:59 ok, letting you return ENOSLEEP
2025-12-07T22:58:00 good night
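
Editor's sketch of the log-marker debugging discussed above. The log only shows the one-liner `{ date ; echo ... ; } >> /var/log/salt/master.log` and one resulting marker line; the actual `log_cmd` in the CI script is not shown, so the function bodies below are assumptions, and `run_logged` and the `LOG` variable are hypothetical names introduced for illustration.

```shell
#!/bin/bash
# Hypothetical reconstruction; the real log_cmd in the CI script may differ.
# LOG defaults to the Salt master log named in the chat and can be overridden.
LOG="${LOG:-/var/log/salt/master.log}"

# Append one timestamped marker line, in the shape quoted in the chat:
# ==== <date> before runner_run saltutil.sync_modules ====
log_cmd() {
    printf '==== %s %s ====\n' "$(date -u)" "$*" >> "$LOG"
}

# Hypothetical wrapper: mark the log before and after a command and
# pass the command's exit status through unchanged.
run_logged() {
    log_cmd "before $*"
    "$@"
    local rc=$?
    log_cmd "after $* (exit $rc)"
    return "$rc"
}
```

It could be used as `run_logged salt-run saltutil.sync_modules`. As noted in the chat, Salt and the shell both write to the same file, so markers that fall within the same second as a Salt log line only give an approximate ordering.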