Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for building on OpenBSD #22

Open
takusuman opened this issue Mar 27, 2023 · 31 comments
Open

Fixes for building on OpenBSD #22

takusuman opened this issue Mar 27, 2023 · 31 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@takusuman
Copy link
Member

takusuman commented Mar 27, 2023

As Molly stated in #21:

Note that other changes are required outside of the scope of this pull request to ensure the entire project builds under OpenBSD.

New issues were discovered after fixing this one; T\this includes libcrypt not being found on OpenBSD for the su program, whodo not compiling, and pgrep and ps not compiling due to KVM function errors (kernel virtual memory? looks to be something to do with libkvm).

This issue will work as a list of what needs to be made for running Heirloom NG on OpenBSD.

@mamccollum If I were you, I would create a branch called "openbsd-compat" and make changes on it, it will be easier to apply the changes later in a "big patch".

@mamccollum
Copy link
Contributor

mamccollum commented Mar 28, 2023

I would create a branch called "openbsd-compat" and make changes on it, it will be easier to apply the changes later in a "big patch".

Good idea, I'll start working on this. Instead of doing small PRs, I'll work on a large one for this issue encompassing all the current build issues.

@takusuman
Copy link
Member Author

I would create a branch called "openbsd-compat" and make changes on it, it will be easier to apply the changes later in a "big patch".

Good idea, I'll start working on this. Instead of doing small PRs, I'll work on a large one for this issue encompassing all the current build issues.

Good luck.
If you need any help, we will be here.

@mamccollum
Copy link
Contributor

One of the issues at compile time is pgrep.c, but I'm honestly baffled at what exactly is going on there (and the only comment shows that whoever worked on this last also was confused, as on line 902 it's just a single "?").

egcc -O -fomit-frame-pointer  -D_GNU_SOURCE -I/usr/local/include  -I../libuxre -DUXRE  -I../libcommon -c pgrep.c
pgrep.c: In function 'oneproc':
pgrep.c:897:22: error: 'struct kinfo_proc' has no member named 'kp_proc'
  897 |         p->p_pid = kp->kp_proc.p_pid;
      |                      ^~
pgrep.c:898:31: error: 'struct kinfo_proc' has no member named 'kp_proc'
  898 |         strncpy(p->p_fname, kp->kp_proc.p_comm, sizeof p->p_fname);
      |                               ^~
pgrep.c:900:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  900 |         p->p_ppid = kp->kp_eproc.e_ppid;
      |                       ^~
pgrep.c:901:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  901 |         p->p_pgid = kp->kp_eproc.e_pgid;
      |                       ^~
pgrep.c:902:22: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  902 |         p->p_sid = kp->kp_eproc.e_tpgid;        /* ? */
      |                      ^~
pgrep.c:903:25: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  903 |         p->p_ttydev = kp->kp_eproc.e_tdev;
      |                         ^~
pgrep.c:904:22: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  904 |         p->p_uid = kp->kp_eproc.e_pcred.p_ruid;
      |                      ^~
pgrep.c:905:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  905 |         p->p_euid = kp->kp_eproc.e_ucred.cr_uid;
      |                       ^~
pgrep.c:906:22: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  906 |         p->p_gid = kp->kp_eproc.e_pcred.p_rgid;
      |                      ^~
pgrep.c:907:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  907 |         p->p_egid = kp->kp_eproc.e_ucred.cr_gid;
      |                       ^~
pgrep.c:908:24: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  908 |         p->p_start = kp->kp_eproc.e_pstats.p_start.tv_sec;
      |                        ^~
pgrep.c:909:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  909 |         p->p_size = kp->kp_eproc.e_vm.vm_tsize +
      |                       ^~
pgrep.c:910:19: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  910 |                 kp->kp_eproc.e_vm.vm_dsize +
      |                   ^~
pgrep.c:911:19: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  911 |                 kp->kp_eproc.e_vm.vm_ssize;
      |                   ^~
pgrep.c: In function 'collectprocs':
pgrep.c:943:49: warning: passing argument 4 of 'kvm_getprocs' makes integer from pointer without a cast [-Wint-conversion]
  943 |         kp = kvm_getprocs(kt, KERN_PROC_ALL, 0, &cnt);
      |                                                 ^~~~
      |                                                 |
      |                                                 int *
In file included from pgrep.c:57:
/usr/include/kvm.h:67:43: note: expected 'size_t' {aka 'long unsigned int'} but argument is of type 'int *'
   67 |           kvm_getprocs(kvm_t *, int, int, size_t, int *);
      |                                           ^~~~~~
pgrep.c:943:14: error: too few arguments to function 'kvm_getprocs'
  943 |         kp = kvm_getprocs(kt, KERN_PROC_ALL, 0, &cnt);
      |              ^~~~~~~~~~~~
In file included from pgrep.c:57:
/usr/include/kvm.h:67:11: note: declared here
   67 |           kvm_getprocs(kvm_t *, int, int, size_t, int *);
      |           ^~~~~~~~~~~~
gmake: *** [Makefile:326: pgrep.o] Error 1

If any of you can give me a bit of a rundown as to what's going on here, it'd be appreciated.
This is on OpenBSD 7.2 w/ gcc version 11.2.0 (GCC).

@takusuman
Copy link
Member Author

takusuman commented Mar 28, 2023

If any of you can give me a bit of a rundown as to what's going on here, it'd be appreciated. This is on OpenBSD 7.2 w/ gcc version 11.2.0 (GCC).

Sorry for the delay.

One of the issues at compile time is pgrep.c, but I'm honestly baffled at what exactly is going on there (and the only comment shows that whoever worked on this last also was confused, as on line 902 it's just a single "?").

"""UNIX hacking""" as always. It's more like alchemy than "science". 😆

PS: Incredible. https://github.com/Projeto-Pindorama/heirloom-ng/blob/master/pgrep/pgrep.c#L902

egcc -O -fomit-frame-pointer  -D_GNU_SOURCE -I/usr/local/include  -I../libuxre -DUXRE  -I../libcommon -c pgrep.c
pgrep.c: In function 'oneproc':
pgrep.c:897:22: error: 'struct kinfo_proc' has no member named 'kp_proc'
  897 |         p->p_pid = kp->kp_proc.p_pid;
      |                      ^~
pgrep.c:898:31: error: 'struct kinfo_proc' has no member named 'kp_proc'
  898 |         strncpy(p->p_fname, kp->kp_proc.p_comm, sizeof p->p_fname);
      |                               ^~
pgrep.c:900:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  900 |         p->p_ppid = kp->kp_eproc.e_ppid;
      |                       ^~
pgrep.c:901:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  901 |         p->p_pgid = kp->kp_eproc.e_pgid;
      |                       ^~
pgrep.c:902:22: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  902 |         p->p_sid = kp->kp_eproc.e_tpgid;        /* ? */
      |                      ^~
pgrep.c:903:25: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  903 |         p->p_ttydev = kp->kp_eproc.e_tdev;
      |                         ^~
pgrep.c:904:22: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  904 |         p->p_uid = kp->kp_eproc.e_pcred.p_ruid;
      |                      ^~
pgrep.c:905:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  905 |         p->p_euid = kp->kp_eproc.e_ucred.cr_uid;
      |                       ^~
pgrep.c:906:22: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  906 |         p->p_gid = kp->kp_eproc.e_pcred.p_rgid;
      |                      ^~
pgrep.c:907:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  907 |         p->p_egid = kp->kp_eproc.e_ucred.cr_gid;
      |                       ^~
pgrep.c:908:24: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  908 |         p->p_start = kp->kp_eproc.e_pstats.p_start.tv_sec;
      |                        ^~
pgrep.c:909:23: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  909 |         p->p_size = kp->kp_eproc.e_vm.vm_tsize +
      |                       ^~
pgrep.c:910:19: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  910 |                 kp->kp_eproc.e_vm.vm_dsize +
      |                   ^~
pgrep.c:911:19: error: 'struct kinfo_proc' has no member named 'kp_eproc'
  911 |                 kp->kp_eproc.e_vm.vm_ssize;
      |                   ^~
pgrep.c: In function 'collectprocs':
pgrep.c:943:49: warning: passing argument 4 of 'kvm_getprocs' makes integer from pointer without a cast [-Wint-conversion]
  943 |         kp = kvm_getprocs(kt, KERN_PROC_ALL, 0, &cnt);
      |                                                 ^~~~
      |                                                 |
      |                                                 int *
In file included from pgrep.c:57:
/usr/include/kvm.h:67:43: note: expected 'size_t' {aka 'long unsigned int'} but argument is of type 'int *'
   67 |           kvm_getprocs(kvm_t *, int, int, size_t, int *);
      |                                           ^~~~~~
pgrep.c:943:14: error: too few arguments to function 'kvm_getprocs'
  943 |         kp = kvm_getprocs(kt, KERN_PROC_ALL, 0, &cnt);
      |              ^~~~~~~~~~~~
In file included from pgrep.c:57:
/usr/include/kvm.h:67:11: note: declared here
   67 |           kvm_getprocs(kvm_t *, int, int, size_t, int *);
      |           ^~~~~~~~~~~~
gmake: *** [Makefile:326: pgrep.o] Error 1

At a first glance, this seems like an "API breakage" to me.
Maybe the Kernel Virtual Memory library that Heirloom used to use back in the 2000's isn't the same anymore in modern BSD (you can see that from the fact that the structs haven't some fields that it used to have before, such as kp_eproc on the struct kinfo_proc), or maybe I'm just talking plain bullsh?t without knowing in fact what is going under the table.

I think that, on more recent versions (that I don't know exactly) of OpenBSD, one will need to disable -lkvm at build/mk.config or another boilerplate will be needed.

@mamccollum
Copy link
Contributor

Heyo, I've been working on the OpenBSD stuff recently. I got pgrep and everything listed before ps to work under my fork. I had to include -lkvm. I also tested the same code on Ubuntu to ensure the changes didn't break existing systems, & it appears it did not.
pgrep was able to list multiple PIDs from a query and pkill was able to kill said PIDs on OpenBSD. I'm currently working on ps but kinda stumped on how to fix the stuff under getproc() in ps.c. Advice would be appreciated.

@mamccollum
Copy link
Contributor

PS. The KVM stuff did change so I fixed that in pgrep but not ps yet.

@takusuman
Copy link
Member Author

Heyo, I've been working on the OpenBSD stuff recently. I got pgrep and everything listed before ps to work under my fork. I had to include -lkvm. I also tested the same code on Ubuntu to ensure the changes didn't break existing systems, & it appears it did not. pgrep was able to list multiple PIDs from a query and pkill was able to kill said PIDs on OpenBSD. I'm currently working on ps but kinda stumped on how to fix the stuff under getproc() in ps.c. Advice would be appreciated.

Well, that's good. About including -lkvm, you can just use $(LKVM) at pgrep's Makefile.mk, as it's already there. It's enabled via build/mk.config.

About ps, what's the problem? If you could resend it, it may have it own issue here. This would help to know what exactly to fix.

@mamccollum
Copy link
Contributor

Well, that's good. About including -lkvm, you can just use $(LKVM) at pgrep's Makefile.mk, as it's already there. It's enabled via build/mk.config.

Yes, I added -lkvm through uncommenting it in mk.config.

I'll post the build logs for ps once I remote into the server I'm using for OpenBSD later.

@mamccollum
Copy link
Contributor

Alright, I decided to remote in while I had the chance; this is the excerpt of the build logs.

gmake[1]: Entering directory '/home/mamccollum/heirloom-ng/ps'
cc -O -fomit-frame-pointer  -D_GNU_SOURCE -I/usr/local/include   -I../libcommon -DUSE_PS_CACHE -DDEFAULT='"/etc/default/ps"' -c ps.c
ps.c:2598:11: warning: 5 enumeration values not handled in switch: 'CR_INVALID_EFF_UID', 'CR_INVALID_REAL_UID', 'CR_INVALID_REAL_GID'... [-Wswitch]
                switch (ct->c_typ) {
                        ^
ps.c:3129:17: error: no member named 'kp_proc' in 'struct kinfo_proc'
        p->p_pid = kp->kp_proc.p_pid;
                   ~~  ^
ps.c:3130:26: error: no member named 'kp_proc' in 'struct kinfo_proc'
        strncpy(p->p_fname, kp->kp_proc.p_comm, sizeof p->p_fname);
                            ~~  ^
ps.c:3132:23: error: no member named 'kp_proc' in 'struct kinfo_proc'
        p->p_lstate[0] = kp->kp_proc.p_stat;
                         ~~  ^
ps.c:3133:18: error: no member named 'kp_eproc' in 'struct kinfo_proc'
        p->p_ppid = kp->kp_eproc.e_ppid;
                    ~~  ^
ps.c:3134:18: error: no member named 'kp_eproc' in 'struct kinfo_proc'
        p->p_pgid = kp->kp_eproc.e_pgid;
                    ~~  ^
ps.c:3135:17: error: no member named 'kp_eproc' in 'struct kinfo_proc'
        p->p_sid = kp->kp_eproc.e_tpgid;        /* ? */
                   ~~  ^
ps.c:3136:20: error: no member named 'kp_eproc' in 'struct kinfo_proc'
        p->p_ttydev = kp->kp_eproc.e_tdev;
                      ~~  ^
ps.c:3137:19: error: no member named 'kp_proc' in 'struct kinfo_proc'
        p->p_lflag = kp->kp_proc.p_flag;
                     ~~  ^
ps.c:3138:25: error: no member named 'kp_eproc' in 'struct kinfo_proc'
        p->p_time = tv2sec(kp->kp_eproc.e_pstats.p_ru.ru_utime.tv_sec +
                           ~~  ^
ps.c:3139:9: error: no member named 'kp_eproc' in 'struct kinfo_proc'
                                kp->kp_eproc.e_pstats.p_ru.ru_stime.tv_sec,
                                ~~  ^
ps.c:3140:8: error: no member named 'kp_eproc' in 'struct kinfo_proc'
                        kp->kp_eproc.e_pstats.p_ru.ru_utime.tv_usec +
                        ~~  ^
ps.c:3141:9: error: no member named 'kp_eproc' in 'struct kinfo_proc'
                                kp->kp_eproc.e_pstats.p_ru.ru_stime.tv_usec, 1);
                                ~~  ^
ps.c:3143:14: error: no member named 'kp_eproc' in 'struct kinfo_proc'
                tv2sec(kp->kp_eproc.e_pstats.p_cru.ru_utime.tv_sec +
                       ~~  ^
ps.c:3144:9: error: no member named 'kp_eproc' in 'struct kinfo_proc'
                                kp->kp_eproc.e_pstats.p_cru.ru_stime.tv_sec,
                                ~~  ^
ps.c:3145:8: error: no member named 'kp_eproc' in 'struct kinfo_proc'
                        kp->kp_eproc.e_pstats.p_cru.ru_utime.tv_usec +
                        ~~  ^
ps.c:3146:9: error: no member named 'kp_eproc' in 'struct kinfo_proc'
                                kp->kp_eproc.e_pstats.p_cru.ru_stime.tv_usec,
                                ~~  ^
ps.c:3148:26: error: no member named 'kp_eproc' in 'struct kinfo_proc'
        p->p_utime = tv2sec(kp->kp_eproc.e_pstats.p_ru.ru_utime.tv_sec,
                            ~~  ^
ps.c:3149:8: error: no member named 'kp_eproc' in 'struct kinfo_proc'
                        kp->kp_eproc.e_pstats.p_ru.ru_utime.tv_usec, 10);
                        ~~  ^
ps.c:3150:26: error: no member named 'kp_eproc' in 'struct kinfo_proc'
        p->p_ktime = tv2sec(kp->kp_eproc.e_pstats.p_ru.ru_stime.tv_sec,
                            ~~  ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
1 warning and 20 errors generated.
gmake[1]: *** [Makefile:326: ps.o] Error 1
gmake[1]: Leaving directory '/home/mamccollum/heirloom-ng/ps'
gmake: *** [makefile:23: all] Error 2

@takusuman
Copy link
Member Author

cc -O -fomit-frame-pointer -D_GNU_SOURCE -I/usr/local/include -I../libcommon -DUSE_PS_CACHE -DDEFAULT='"/etc/default/ps"' -c ps.c

I saw that and it caught my attention: isn't the Makefile "forgetting" to include /usr/include instead of /usr/local/include?

@takusuman
Copy link
Member Author

ps.c:3129:17: error: no member named 'kp_proc' in 'struct kinfo_proc'

O.k., I think I may have found a solution for this.
This error isn't uncommon at all, it appears everywhere when folks are trying to compile/port things for OpenBSD --- which is our case right now ---, so I've found this comment on StackOverflow about compiling net-snmp at OpenBSD 5.1: https://stackoverflow.com/a/12272730
The answer is badly written, I know, but something useful can be taken from it.
In any case, we already know that some change in how the kernel API is used plus an #if defined(__OpenBSD__) with an #if OpenBSD >= YYYYMM can solve this.
I'm being very simplistic here, but that's all I think we could do.

@mamccollum
Copy link
Contributor

cc -O -fomit-frame-pointer -D_GNU_SOURCE -I/usr/local/include -I../libcommon -DUSE_PS_CACHE -DDEFAULT='"/etc/default/ps"' -c ps.c

I saw that and it caught my attention: isn't the Makefile "forgetting" to include /usr/include instead of /usr/local/include?

Yeah, I noticed that and I just modified the mk.config to include /usr/local/include in the CPPFLAGS.

@mamccollum
Copy link
Contributor

ps.c:3129:17: error: no member named 'kp_proc' in 'struct kinfo_proc'

O.k., I think I may have found a solution for this. This error isn't uncommon at all, it appears everywhere when folks are trying to compile/port things for OpenBSD --- which is our case right now ---, so I've found this comment on StackOverflow about compiling net-snmp at OpenBSD 5.1: https://stackoverflow.com/a/12272730 The answer is badly written, I know, but something useful can be taken from it. In any case, we already know that some change in how the kernel API is used plus an #if defined(__OpenBSD__) with an #if OpenBSD >= YYYYMM can solve this. I'm being very simplistic here, but that's all I think we could do.

And on this -- I fixed pgrep using a similar solution, but I believe I forgot to put in the datestamp. I will update that in a bit. I'm not sure how to fix it for ps, but it might be that I'm not looking at it the right way.

@takusuman
Copy link
Member Author

takusuman commented Jul 3, 2023

And on this -- I fixed pgrep using a similar solution, but I believe I forgot to put in the datestamp. I will update that in a bit. I'm not sure how to fix it for ps, but it might be that I'm not looking at it the right way.

Since it's the same problem on both pgrep and ps, I don't doubt it's the same way of fixing it. Take a rest and look again, maybe you're just tired and can't concentrate well.
By the way, apparently, kp_eproc was removed back in OpenBSD 5.1. It's hard to find exactly when it was changed in git's history --- and I don't have a lot of time to search along it right now --- but, in 5.1's change log, it's clearly said that the OpenBSD team changed the FILL_KPROC() macro.
OpenBSD 5.1 was released on May 2012, so it'd be #if OpenBSD >= 201205.
It's also not that clear where the kinfo_proc (which is "aliased" to kp) is exactly defined. sysctl.h? uvm_extern.h? It doesn't actually give us a clue.

I really hope this helps a little.

@takusuman
Copy link
Member Author

takusuman commented Jul 3, 2023

Watch out for whodo too, it has kp_eproc references present on lines 1146 to 1149, 1154 and 1394 but, after fixing pgrep and ps, this may be easy. Also doesn't forget to comment, this may help other folks trying to port ancient C code to OpenBSD later. 😄

@takusuman takusuman added enhancement New feature or request help wanted Extra attention is needed labels Jul 29, 2023
@mamccollum
Copy link
Contributor

mamccollum commented Sep 3, 2023

@takusuman Finally got back into working on this under my fork, but under ps.c I got it to compile, however it segfaults and states that opening /dev/mem is not permitted (this is an intentional security feature in OpenBSD). I have not pushed any commit in regards to ps.c to the repository yet (as the current working state is... well, not working).

foo# ./ps
kvm_open: /dev/mem: Operation not permitted
Segmentation fault (core dumped) 

Is it possible I may work on another feature in the meanwhile as it appears that POSIX standardization and Linux support are the priority? If so, anything I should look into?

@takusuman
Copy link
Member Author

@takusuman Finally got back into working on this under my fork

Well, that's great. I'm a little bit busy lately, but I've been using and testing Heirloom NG directly from the master branch.

...but under ps.c I got it to compile, however it segfaults and states that opening /dev/mem is not permitted (this is an intentional security feature in OpenBSD)...

foo# ./ps
kvm_open: /dev/mem: Operation not permitted
Segmentation fault (core dumped) 

Well, nice. How do we got /dev/mem information on OpenBSD then? Do they have some sort of kernel API for that which returns a /dev/mem-"compatible" struct? If so, I think it wouldn't be so difficult to workaround that. Just thinking loud here.

Is it possible I may work on another feature in the meanwhile as it appears that POSIX standardization and Linux support are the priority? If so, anything I should look into?

Yes, sure! No problem.
Well, the only problems that I see for now is readlink (both the UCB and DEF variants) not readlink'ng/realpath'ng some files in specific, at least on Linux. watch is working just fine, seq would need to stop printing the separator on the last element --- but I don't want to remove the ternary operators nor made the code too much complex, that's why I haven't messed around with it yet. About Linux compatibility, even through this project aims to not be so Linux-centric, I haven't seen any trouble since it uses just the POSIX interfaces. You're free to test it extensively, though.

@mamccollum
Copy link
Contributor

Well, nice. How do we got /dev/mem information on OpenBSD then? Do they have some sort of kernel API for that which returns a /dev/mem-"compatible" struct? If so, I think it wouldn't be so difficult to workaround that. Just thinking loud here.

Well, I'm not exactly sure about that (as you can't access /dev/mem at all without making the kernel insecure via a sysctl) but OpenBSD has ps and it works, but I haven't exactly studied it very well yet.

And I'll look into the readlink issue in a few hours -- I'll post an update if I see anything that could be causing the issue.

@takusuman
Copy link
Member Author

takusuman commented Sep 4, 2023

Well, I'm not exactly sure about that (as you can't access /dev/mem at all without making the kernel insecure via a sysctl) but OpenBSD has ps and it works, but I haven't exactly studied it very well yet.

I think I may have to dig on this along with you. That will be a lot of fun, for sure.

And I'll look into the readlink issue in a few hours -- I'll post an update if I see anything that could be causing the issue.

Great! Merci.
There's also seq, it need to be more compliant with the standard (at least with the Plan 9 original implementation) since, for now, it only implements the separator. It would be interesting to make it use floating points too. My only apprehension is with it losing performance in comparison with the current implementation.

@takusuman
Copy link
Member Author

takusuman commented Sep 4, 2023

@mamccollum I think I may have found something promising on the /dev/kmem issue.
collectd had this issue before when OpenBSD began switching from /dev/kmem structs to a direct kernel/C libary/I-don't-know-right API.
In fact, here's the issue on GitHub (collectd/collectd#2061), the discussion may help us a lot, along with the fixes done cited at the end of the issue.
The fixes: collectd/collectd@f2530ff
collectd/collectd@b088d09

@mamccollum
Copy link
Contributor

mamccollum commented Sep 5, 2023

Well, I got it to... sorta work? Thanks for the input on the /dev/kmem stuff, it really helped.

All versions of ps compile, but with... many bugs. I'll show screenshots:

Normal ps shows this upon requesting it without options, though it successfully shows usage info:
image

ps -e shows this:
image

But when I pipe ps -e into less, the PIDs change from 65536 to 16384.

ps_s42 core dumps (if given -e, otherwise it complains about the terminal) (I'll have to look into that):
image

ps_sus does the same behavior as ps (with a different header, "CMD" instead of "COMD")

And, ps_ucb actually prints process info (if given -a, albeit there are random control characters in one of the columns):
image

Advice would be appreciated. (I must sleep now, it's almost 4am.)

@mamccollum
Copy link
Contributor

Also, I got to compile stty in the meantime by looking into an issue that a haskell library faced. Apparently, a lot of terminal options in SysV-based stty isn't supported by default in the BSDs, but it is a part of the XSI stuff.
See my additions to stty.c here.

su (without PAM, since OpenBSD doesn't use PAM) compiles, but core dumps unless you're root (but works perfectly fine as root, though that kinda defeats the purpose). I had to replace -lcrypt with -lssl as per the instructions in mk.config for NetBSD.
OpenBSD uses "BSD Auth" apparently, which is likely the reason su doesn't work. see: su.c

Also I got watch to compile as well, by adding an #if defined(__OpenBSD__) for OpenBSD that changes wait.h to sys/wait.h (only on OpenBSD, though I chose if instead of ifdef in case more BSDs need this in the future).

Please note with watch that in FreeBSD (but not NetBSD or OpenBSD) it is a completely different command, used for nosy admins to watch other terminals.

who works fine, as it just checks TTY status && their owners, not processes.

I am on to whodo/w, which will likely have the same issue as ps as it doesn't just do who but also ps info. I will attempt to fix the others prior to whodo to speed up time.

Thanks for your patience with my absence for a few months. I really needed the break. :)

@mamccollum
Copy link
Contributor

Ok, everything else compiles just fine. The last issues that have to be solved is ps, whodo, su and potentially pgrep if something is broken behind an option I didn't check.

@takusuman
Copy link
Member Author

Well, I got it to... sorta work? Thanks for the input on the /dev/kmem stuff, it really helped.

You're welcome anytime.

All versions of ps compile, but with... many bugs. I'll show screenshots:

I was expecting that, let's go...

Normal ps shows this upon requesting it without options, though it successfully shows usage info: image

I've made some search in the code, and discovered that, in the non-UCB variant (#ifndef UCB), it checks if p_ttydev is equal to PRNODEV --- which I think it means that the current terminal is not a device file? Or something like that, I don't know --- and, if it's the case, it prints this error.
The code for reference:
https://github.com/Projeto-Pindorama/heirloom-ng/blob/master/ps/ps.c#L4991-L4996

ps -e shows this: image

I know it will sound obvious, but that's really strange, never seen nothing like it.
It looks like when you miss the type on Google Go's fmt.Printf() and it prints garbage.
Maybe some memory management issue? Strange as hell.

By the way, if you find a way to fix it: don't forget to use both "#if defined(__OpenBSD__)/#if defined(__NetBSD__) and "#if OpenBSD >= YYMM (or the equivalent to NetBSD or FreeBSD that I don't know). If don't, this may break the compatibility it has with older systems --- not like anyone cares to this sort of compatibility, but who knows if one's trying to upgrade an old Heirloom installation with this fork?
I'm saying this because I saw this commit on your fork on pgrep(1) and realized that you're only checking for OpenBSD versions grander than 201205, not considering to only #if defined()' on your changes.
Anyway, if you don't have time to fix this, don't worry: I can fix later when pushing in the main branch.
Let's continue.

But when I pipe ps -e into less, the PIDs change from 65536 to 16384.

This really sounds like a memory issue or something like that.
I do not have an idea of how to fix this on C, though. You may wish to speak to other folks on the OpenBSD community for help on finding this out.

ps_s42 core dumps (if given -e, otherwise it complains about the terminal) (I'll have to look into that): image

I know the possibility is small, but maybe the kernel is sending a signal to kill the program? If I'm not completely mistaken, it's an OpenBSD payload to kill programs when it tries to access memory areas that aren't allowed per default.
I'd take a little look at that, then trying to debug it with gdb to see what's causing the core dump.

ps_sus does the same behavior as ps (with a different header, "CMD" instead of "COMD")

So... it has the same glitches as the normal ps? If so (and the changes are only an #if defined() that changes "COMD" to "CMD"), move on to fixing ps and hoping that it will also fix the sus variant.

And, ps_ucb actually prints process info (if given -a, albeit there are random control characters in one of the columns): image

The PID still glitched just as the default ps, tried to pipe it through less to see if the PID changes from 65536 to 16384? The control characters may be the simplest part to fix.

Advice would be appreciated. (I must sleep now, it's almost 4am.)

It was late to me too at that day. 😄

@takusuman
Copy link
Member Author

takusuman commented Sep 6, 2023

Also, I got to compile stty in the meantime by looking into an issue that a haskell library faced. Apparently, a lot of terminal options in SysV-based stty isn't supported by default in the BSDs, but it is a part of the XSI stuff. See my additions to stty.c here.

Well, nice to see it working. 😄
Merci for more information too! I hope this helps other folks later when researching and/or hacking on Heirloom NG's code.

su (without PAM, since OpenBSD doesn't use PAM) compiles, but core dumps unless you're root (but works perfectly fine as root, though that kinda defeats the purpose). I had to replace -lcrypt with -lssl as per the instructions in mk.config for NetBSD. OpenBSD uses "BSD Auth" apparently, which is likely the reason su doesn't work. see: su.c

Well, is there a way to fix it without having to rewrite everything? 😆
Haven't looked at OpenBSD's su yet, so I don't know how BSD Auth's interface works in comparison to whatever Heirloom's su's using.

Also I got watch to compile as well, by adding an #if defined(__OpenBSD__) for OpenBSD that changes wait.h to sys/wait.h (only on OpenBSD, though I chose if instead of ifdef in case more BSDs need this in the future).

Merci, I saw that! I'm happy to see this implementation working on other systems.

Please note with watch that in FreeBSD (but not NetBSD or OpenBSD) it is a completely different command, used for nosy admins to watch other terminals.

Yeah, I saw that when implementing. I even though of changing the name from watch to cmdwatch and try to implement FreeBSD's watch, but, in the end, it would be a lot of work.

who works fine, as it just checks TTY status && their owners, not processes.

I am on to whodo/w, which will likely have the same issue as ps as it doesn't just do who but also ps info. I will attempt to fix the others prior to whodo to speed up time.

Thanks for your patience with my absence for a few months. I really needed the break. :)

No problem, happy to see you're back here.

@mamccollum
Copy link
Contributor

By the way, if you find a way to fix it: don't forget to use both "#if defined(__OpenBSD__)/#if defined(__NetBSD__) and "#if OpenBSD >= YYMM (or the equivalent to NetBSD or FreeBSD that I don't know). If don't, this may break the compatibility it has with older systems --- not like anyone cares to this sort of compatibility, but who knows if one's trying to upgrade an old Heirloom installation with this fork? I'm saying this because I saw this commit on your fork on pgrep(1) and realized that you're only checking for OpenBSD versions grander than 201205, not considering to only #if defined()' on your changes. Anyway, if you don't have time to fix this, don't worry: I can fix later when pushing in the main branch. Let's continue.

Thanks, will fix that later prior to any PRs. I'll check what was there prior to the commit and make sure that the old code is still retained for old systems.

@mamccollum
Copy link
Contributor

@takusuman I am (potentially) back again from a long hiatus of lots of things going on in my personal life. I came up with an idea, which would require a lot of effort but may be worth it. With the issues I experienced and everything I have committed so far to my fork for OpenBSD support, I believe potentially a better idea than changing all these different files and adding on would be to futureproof this by replacing a lot of these ifdefs with common functions in libcommon, to provide a universal interface for these OS-specific functions.

Again, it would require very much effort but I believe it may be worth it to prevent this kind of issue from happening again across several programs. What are your thoughts? I will likely create a local backup of my existing branch, and start fresh from the current main and use the local one as a reference for what to do. Let me know what you think.

@mamccollum
Copy link
Contributor

And -- for moving some stuff to libcommon, if that becomes the choice for this I will first move some of Linux's stuff over to it as to test if it is feasible, since Linux is our first-class citizen here. If that goes well, I will start moving other systems' capabilities to the library as well.

@takusuman
Copy link
Member Author

takusuman commented Mar 12, 2024

I am (potentially) back again from a long hiatus of lots of things going on in my personal life.

First of all, welcome back! We missed you, hope you're O.k. now.

I came up with an idea, which would require a lot of effort but may be worth it. With the issues I experienced and everything I have committed so far to my fork for OpenBSD support, I believe potentially a better idea than changing all these different files and adding on would be to futureproof this by replacing a lot of these ifdefs with common functions in libcommon, to provide a universal interface for these OS-specific functions.

Sounds great, Gunnar Ritter was portabilising Heirloom this way before stopping with the project.

Again, it would require very much effort but I believe it may be worth it to prevent this kind of issue from happening again across several programs. What are your thoughts? I will likely create a local backup of my existing branch, and start fresh from the current main and use the local one as a reference for what to do. Let me know what you think.

Well, sounds awesome. If you're going to push to the latest branch, you may wish to go to 20240220-fix, which is meant to be merged into master... sometime soon.

And -- for moving some stuff to libcommon, if that becomes the choice for this I will first move some of Linux's stuff over to it as to test if it is feasible, since Linux is our first-class citizen here. If that goes well, I will start moving other systems' capabilities to the library as well.

That made me wonder if timeout(1), which is somewhat derived from OpenBSD's implementation, will work on other systems besides modern BSD and Linux, since I've seen programs on Heirloom using sigset() instead of the sighandler_t struct. Maybe we could portabilise the timeout more per using sigset() and then aliasing it on libcommon with some #ifdefs?

@mamccollum
Copy link
Contributor

mamccollum commented Mar 12, 2024

Alright, I'll get started on that as soon as I can (sometime this week). I will try to update you on progress here throughout this process as to give myself some place to have a second opinion and input.
I came up with a nice plan for this, by the way. I am going to make a private fork of the repository to test out all changes I push to said repository via GitHub Actions. Once I find a known good change that works successfully on all the target systems, I can push it to my public copy.
GitHub actions has a nice little feature where you can install VirtualBox on the macOS builder and run any other OS inside of it. VERY useful for the BSDs and also Solaris-based systems. See here for the templates I use: https://github.com/vmactions

EDIT: Forgot to mention, I will test the code (for now, I likely will add more systems later as time goes on) on my own Debian machine, the OpenBSD action, the Ubuntu action, and the Alpine Linux action.

@takusuman
Copy link
Member Author

GitHub actions has a nice little feature where you can install VirtualBox on the macOS builder and run any other OS inside of it. VERY useful for the BSDs and also Solaris-based systems.

That's great. If you're going to use it, you can pull-request it into this main repository, these changes are very welcome.
If VirtualBox is flexible enough, even older UNIX-compatible systems and Linux distributions can be used to test if Heirloom NG can still maintain the compatibility that it predecessor had.

Forgot to mention, I will test the code (for now, I likely will add more systems later as time goes on) on my own Debian machine, the OpenBSD action, the Ubuntu action, and the Alpine Linux action.

If you're going to test it on a musl-based Linux distribution, you can try with Copacabana Linux, there's a Docker image available.
It already uses Heirloom per default, but I haven't tested newer versions yet.

By the way, if you want to, you can update the compatibility table at README.md too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants